Word Count Using AWK

I have a file as shown below:

this is an example file this file will be used for testing

this is a sample file this file will be used for testing 

I want to count words using AWK.

Expected Result

 this 2
 is 1
 a 1
 sample 1
 file 2
 will 1
 be 1
 used 1
 for 1

I wrote the AWK command below, but it gives me errors:

 cat anyfile.txt|awk -F" "'{for(i=1;i<=NF;i++) a[$i]++} END {for(k in a) print k,a[k]}' 
3 answers

This works fine for me:

 awk '{for(i=1;i<=NF;i++) a[$i]++} END {for(k in a) print k,a[k]}' testfile
 used 1
 this 2
 be 1
 a 1
 for 1
 testing 1
 file 2
 will 1
 sample 1
 is 1

PS: you do not need to set -F" " , because the default field separator is already whitespace.
PS2: do not use cat with programs that can read files themselves, such as awk.
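The error in the question most likely comes from the missing space between -F" " and the program text: the shell joins the two adjacent quoted strings into one argument, so awk receives the whole program as its field separator and has no program left to run. A minimal sketch of the corrected call follows; the temporary file and its contents are assumptions made only for illustration (one line taken from the question).

```shell
# Hypothetical sample file, created only for this illustration.
sample=$(mktemp)
printf 'this is a sample file this file will be used for testing\n' > "$sample"

# Note the space between -F" " and the program text.
# -F" " is kept only to mirror the question; a single space is already the default FS.
counts=$(awk -F" " '{for(i=1;i<=NF;i++) a[$i]++} END {for(k in a) print k,a[k]}' "$sample")

# The order of "for (k in a)" is unspecified, so sort before displaying.
printf '%s\n' "$counts" | LC_ALL=C sort
rm -f "$sample"
```

Dropping -F" " altogether gives the same counts, which is the form used in the answer above.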

You can pipe the output through sort to order it by count:

 awk '{for(i=1;i<=NF;i++) a[$i]++} END {for(k in a) print k,a[k]}' testfile | sort -k 2 -n
 a 1
 be 1
 for 1
 is 1
 sample 1
 testing 1
 used 1
 will 1
 file 2
 this 2
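If the most frequent words should come first, the sort keys can be adjusted. The sketch below is one possible variant, not part of the original answer: it sorts numerically in descending order on the count column and breaks ties alphabetically by word. The temporary sample file is again an assumption made for illustration.

```shell
# Hypothetical sample file, created only for this illustration.
sample=$(mktemp)
printf 'this is a sample file this file will be used for testing\n' > "$sample"

# -k2,2nr : sort on the count column only, numerically, highest first.
# -k1,1   : break ties alphabetically by word.
# LC_ALL=C keeps the ordering independent of the current locale.
sorted=$(awk '{for(i=1;i<=NF;i++) a[$i]++} END {for(k in a) print k,a[k]}' "$sample" |
  LC_ALL=C sort -k2,2nr -k1,1)

printf '%s\n' "$sorted"
rm -f "$sample"
```

Restricting each key to a single field (2,2 and 1,1) keeps the numeric key from running on to the end of the line.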

Instead of looping over the fields of each line and storing every word in an array ( {for(i=1;i<=NF;i++) a[$i]++} ), you can use gawk, which supports a multi-character RS (Record Separator), so that each word becomes its own record and is counted directly (this is a bit faster):

 gawk '{a[$0]++} END{for (k in a) print k,a[k]}' RS='[[:space:]]+' file 

Output:

 used 1
 this 2
 be 1
 a 1
 for 1
 testing 1
 file 2
 will 1
 sample 1
 is 1

In the gawk command above, the record separator is the regular expression [[:space:]]+ , which matches one or more whitespace characters, including spaces and the newline character \n .
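One detail worth checking with this approach is input that starts with whitespace, which can produce an empty first record. The sketch below adds a guard for that case; the guard, the sample file, and the gawk availability check are additions for illustration and are not part of the original answer.

```shell
# Hypothetical sample file with leading blanks and an empty line.
sample=$(mktemp)
printf '  this is a sample file\n\nthis file will be used for testing\n' > "$sample"

if command -v gawk >/dev/null 2>&1; then
  # Each run of whitespace ends a record, so $0 holds exactly one word.
  # $0 != "" skips an empty record caused by whitespace at the start of the input.
  rs_counts=$(gawk '$0 != "" {a[$0]++} END {for (k in a) print k, a[k]}' RS='[[:space:]]+' "$sample")
  printf '%s\n' "$rs_counts" | LC_ALL=C sort
else
  echo "gawk not found; skipping the example" >&2
fi
rm -f "$sample"
```

A regular expression as RS is a gawk feature that other awk implementations may not support, which is why the sketch checks for gawk first.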


Here is Perl code that produces sorted output similar to Jotne's awk solution:

perl -ne 'for (split /\s+/, $_){ $w{$_}++ }; END{ for $key (sort keys %w) { print "$key $w{$key}\n"}}' testfile

$_ holds the current line, which is split on whitespace by /\s+/
Inside the for loop, $_ is set to each word in turn
The %w hash stores the number of occurrences of each word
After the whole file has been processed, the END{} block runs and the keys of the %w hash are sorted alphabetically
Each word $key is printed together with its number of occurrences $w{$key}


Source: https://habr.com/ru/post/1213806/

