While trying to find out whether AWK or MySQL is more efficient at processing log files and returning statistics, I noticed the following behavior, which does not make sense to me.
To test this, I used a file with 4 columns and approximately 9 million records. I used the same server, which is a VPS with SSD and 1 GB of RAM.
column1 has about 10 unique values, and the number of unique combinations of all four columns is approximately 4k.
In MySQL, I use a table defined as (column1, column2, column3, column4), with no indexes.
Data format:
column1, column2, column3, column4
column1, column2, column3, column4
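If anyone wants to reproduce the setup, a synthetic file in the same 4-column format can be generated like this (the file name, row count, and value ranges are invented for illustration, not the original data set):

```shell
# Generate a sample CSV with 4 comma-separated columns.
# Column 1 draws from ~10 distinct values, mimicking the description above.
awk 'BEGIN {
    srand(42);
    for (i = 0; i < 100000; i++)
        printf "val%d,%d,%d,%d\n",
               int(rand() * 10), int(rand() * 5),
               int(rand() * 10), int(rand() * 8);
}' > sample_log.csv
wc -l sample_log.csv
```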
AWK Script:
BEGIN {
    FS = ",";
    time = systime();
}
{
    array[$1]++;                         # first test
    # array[$1 "," $2 "," $3 "," $4]++; # second test
}
END {
    for (value in array) {
        print "array[" value "]=" array[value];
    }
    print "elapsed seconds: " systime() - time;  # time was set in BEGIN but never reported
}
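For reference, here is a minimal, self-contained run of the same counting technique on a tiny inline data set (the file name and values are made up for illustration; `sort` is only added to make the output order deterministic, since awk's `for (v in array)` traversal order is unspecified):

```shell
# Build a tiny 4-column CSV and count occurrences of column1,
# exactly as the script above does for its first test.
printf 'a,1,2,3\na,1,2,3\nb,9,9,9\n' > tiny.csv
awk 'BEGIN { FS = "," }
     { array[$1]++ }
     END { for (v in array) print "array[" v "]=" array[v] }' tiny.csv | sort
# → array[a]=2
#   array[b]=1
```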
MySQL queries:
Query 1: SELECT column1, count(*) FROM log_test GROUP BY column1;
Query 2: SELECT column1, column2, column3, column4, count(*)
FROM log_test GROUP BY column1, column2, column3, column4;
I expected AWK to be faster than MySQL, but it is not: in the first test, with about 10 unique values, MySQL takes 7 seconds while AWK takes 22 seconds.
In the second test, where the key is the combination of all four columns (about 4k unique values), AWK takes 90 seconds at about 0.1% MEM, while MySQL takes 45 seconds at about 3% MEM.
- Why does AWK take so much longer for test 2 than for test 1?
- Is my AWK script written correctly, or is there a faster way to do this in awk?
- Why is MySQL faster here, even without any indexes?
- Is there a better alternative for this kind of processing?
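On the last question, one alternative worth timing is to let coreutils do the grouping. This is only a sketch under the assumptions above (the file name is hypothetical); note that forcing `LC_ALL=C` disables locale-aware string handling, which often speeds up both awk and sort considerably on large ASCII logs:

```shell
# Group-by-count on column1 using cut | sort | uniq -c.
# LC_ALL=C makes comparisons byte-wise, which is usually faster.
printf 'a,1,2,3\na,1,2,3\nb,9,9,9\n' > sample.csv
LC_ALL=C cut -d, -f1 sample.csv | LC_ALL=C sort | uniq -c
```

Each output line is a count followed by the key (e.g. `2 a`), equivalent to the awk array dump apart from formatting.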