Using awk to count the number of occurrences of a word in a column

Question

03/03/2014 12:31:21 BLOCK 10.1.34.1 11:22:33:44:55:66
03/03/2014 12:31:22 ALLOW 10.1.34.2 AA:BB:CC:DD:EE:FF
03/03/2014 12:31:25 BLOCK 10.1.34.1 55:66:77:88:99:AA

I am trying to use awk to count the number of occurrences of the words "BLOCK" and "ALLOW" in the third column above, in a single command.

I started with just the word "BLOCK", but my counter does not work. Can anyone see what is wrong with my code?

 awk ' BEGIN {count=0;} { if ($3 == "BLOCK") count+=1} end {print $count}' firewall.log

+15

linux bash awk

user3578872 Jan 16 '15 at 2:43

6 answers

glenn jackman · Answer 1 · 2015-01-16T14:47:23+0000

Use an array:

 awk '{count[$3]++} END {for (word in count) print word, count[word]}' file

If you want the count for "BLOCK" specifically, use this END block instead: END {print count["BLOCK"]}
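
As a quick check, the array approach can be run against the three log lines from the question. This is only a sketch: the variable names `sample` and `counts` are made up for the example, and the output is piped through sort because awk does not guarantee any particular order for `for (word in count)`.

```shell
# The three log lines from the question.
sample='03/03/2014 12:31:21 BLOCK 10.1.34.1 11:22:33:44:55:66
03/03/2014 12:31:22 ALLOW 10.1.34.2 AA:BB:CC:DD:EE:FF
03/03/2014 12:31:25 BLOCK 10.1.34.1 55:66:77:88:99:AA'

# count[$3]++ creates one array entry per distinct word in column 3.
counts=$(printf '%s\n' "$sample" |
  awk '{count[$3]++} END {for (word in count) print word, count[word]}' |
  sort)

echo "$counts"
# ALLOW 1
# BLOCK 2
```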

David thornton · Answer 2 · 2016-12-06T20:09:07+0000

Here is a no-code solution. You can chain the steps together with pipes ("|").

 awk '{print $3}' file | sort | uniq -c

awk '{print $3}'
prints the 3rd column; the default field separator in awk is whitespace.
sort
sorts the results, so that identical words end up on adjacent lines.
uniq -c
counts the number of repetitions of each adjacent line.
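
The steps above can be tried on the question's sample data as follows. The final awk stage is an addition for this sketch only: it swaps the two columns and drops the padding that uniq -c puts in front of each count, which differs between implementations. The names `sample` and `counts` are illustrative.

```shell
# The three log lines from the question.
sample='03/03/2014 12:31:21 BLOCK 10.1.34.1 11:22:33:44:55:66
03/03/2014 12:31:22 ALLOW 10.1.34.2 AA:BB:CC:DD:EE:FF
03/03/2014 12:31:25 BLOCK 10.1.34.1 55:66:77:88:99:AA'

# Extract column 3, sort it (uniq only merges adjacent duplicates),
# count each run, then print "word count" without uniq's padding.
counts=$(printf '%s\n' "$sample" |
  awk '{print $3}' | sort | uniq -c |
  awk '{print $2, $1}')

echo "$counts"
# ALLOW 1
# BLOCK 2
```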

user4453924 · Answer 3 · 2015-01-16T14:59:08+0000

The reason your code may not work is that END is case sensitive. Written in lower case, end is treated as an ordinary variable, so your script checks whether the variable end has a true value (it does not, because it was never set), and therefore the last block is never executed. If you change this, then it should work.

Also, you do not need the BEGIN block, since awk variables start out as 0 (or the empty string) when first used.
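
Both points can be seen in a small experiment. This is a sketch with made-up input (two lines, "x" and "y") and made-up variable names, not the asker's log file.

```shell
# "end" is an unset variable, so as a pattern it is false on every line
# and its action never runs. END is the special end-of-input pattern.
out=$(printf 'x\ny\n' | awk '
  end {print "lowercase end ran"}
  END {print "END ran after", NR, "lines"}')
echo "$out"
# END ran after 2 lines

# No BEGIN block needed: n is treated as 0 the first time it is incremented.
n=$(printf 'x\ny\n' | awk '{n += 1} END {print n}')
echo "$n"
# 2
```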

Below I have added an alternative way to do this, which you can use instead.

This is similar to glenn's answer, but it only captures the words you want, so it should use a little less memory.

Using gawk (for the third argument to match):

 awk 'match($3,/BLOCK|ALLOW/,b){a[b[0]]++}END{for(i in a)print i ,a[i]}' file

This block is executed only if BLOCK or ALLOW is contained in the third field.
The match stores what was matched in array b.
Then the entry of array a for the matched text is incremented.

In the END block, each captured word is printed together with its count.

Output

 ALLOW 1
 BLOCK 2
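
For awk implementations without the three-argument match, roughly the same filtering can be done with a plain regex test. This is a sketch, not the answer's own method, and it differs in one way: the anchors ^ and $ require the whole third field to be BLOCK or ALLOW, whereas match also accepts the words as substrings. The fourth sample line with DENY is hypothetical and is there only to show that other words are skipped.

```shell
# Question's log lines plus one hypothetical DENY line.
sample='03/03/2014 12:31:21 BLOCK 10.1.34.1 11:22:33:44:55:66
03/03/2014 12:31:22 ALLOW 10.1.34.2 AA:BB:CC:DD:EE:FF
03/03/2014 12:31:25 BLOCK 10.1.34.1 55:66:77:88:99:AA
03/03/2014 12:31:26 DENY 10.1.34.3 00:11:22:33:44:55'

# Only lines whose third field is exactly BLOCK or ALLOW are counted.
counts=$(printf '%s\n' "$sample" |
  awk '$3 ~ /^(BLOCK|ALLOW)$/ {a[$3]++} END {for (i in a) print i, a[i]}' |
  sort)

echo "$counts"
# ALLOW 1
# BLOCK 2
```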

psoo · Answer 4 · 2017-08-23T16:36:13+0000

I checked your expression

 awk ' BEGIN {count=0;} { if ($3 == "BLOCK") count+=1} end {print $count}' firewall.log

and was able to count BLOCK successfully by making two changes:

end should be in upper case (END)
remove the $ from print $count

So this should be:

 awk ' BEGIN {count=0;} { if ($3 == "BLOCK") count+=1} END {print count}' firewall.log

A simpler statement that also works:

 awk '($3 == "BLOCK") {count++ } END { print count }' firewall.log
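
One detail worth knowing about the simpler form: if no line matches, count is never set and print count emits an empty line. Printing count + 0 forces a number. The sketch below uses the question's log lines; the word DENY is hypothetical and simply stands for a word that never occurs.

```shell
# The three log lines from the question.
sample='03/03/2014 12:31:21 BLOCK 10.1.34.1 11:22:33:44:55:66
03/03/2014 12:31:22 ALLOW 10.1.34.2 AA:BB:CC:DD:EE:FF
03/03/2014 12:31:25 BLOCK 10.1.34.1 55:66:77:88:99:AA'

# count + 0 turns an unset count into 0 instead of an empty string.
blocks=$(printf '%s\n' "$sample" | awk '$3 == "BLOCK" {count++} END {print count + 0}')
denies=$(printf '%s\n' "$sample" | awk '$3 == "DENY" {count++} END {print count + 0}')

echo "BLOCK: $blocks, DENY: $denies"
# BLOCK: 2, DENY: 0
```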

twalberg · Answer 5 · 2015-01-16T16:35:40+0000

The error in your awk call is that you have print $count in your "END" block. This takes the value of the count variable, treats it as an integer, and prints the field with that number from the last line of input. What you really need is just print count, which prints the value of the count variable itself. The syntax for referring to variables differs between bash, awk, python, etc., so this is an easy mistake to make.
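
The difference between count and $count is easy to see on a single line of input. This is a minimal sketch with made-up input; it prints both forms from the main rule rather than from END, so that it does not depend on how a given awk handles fields after the input has ended.

```shell
# With count = 2, "count" is the number 2 and "$count" is field number 2.
out=$(echo 'alpha beta gamma' | awk '{count = 2; print count; print $count}')

echo "$out"
# 2
# beta
```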

bormarek · Answer 6 · 2019-10-15T09:52:21+0000

I have something similar -

I ask GitLab for a list of merge requests:

 curl -Ss -k --header "PRIVATE-TOKEN: $at" "https://gitlab/api/v4/projects/111/merge_requests?state=$1&created_after=$date&target_branch=$branch&per_page=100&page=1" | jq -r '.[] | "\(.iid)\t\(.author.username)"'

and I get a list like this as output:

 11039 user7
 11038 user6
 11037 user5
 11036 user4
 11035 user1
 11034 user3
 11033 user2
 11032 user1

How can I calculate how many merge requests each user has raised, that is, how many for user1, how many for user2, and so on?

When I store the output of this curl in a variable:

 request=$(curl -Ss -k --header "PRIVATE-TOKEN: $at" "https://gitlab/api/v4/projects/111/merge_requests?state=$1&created_after=$date&target_branch=$branch&per_page=100&page=1" | jq -r '.[] | "\(.iid)\t\(.author.username)"')

and print it as:

 echo "list of $1 requests rise today"
 echo "$request"
 echo
 echo "--------stats--------------"
 echo "\n$request" | awk '/^[0-9]/{a[$2]++}END{for (i in a) print i, a[i]}'
 echo "---------------------------"
 echo

This awk command does not show the correct counts in some cases. Is there an easier way?

Thanks for the help.
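
One possible cause, offered as a guess because the post does not show the wrong numbers: bash's builtin echo does not expand \n by default, so echo "\n$request" would put a literal backslash and "n" in front of the first line, which then fails the /^[0-9]/ test and goes uncounted. The sketch below feeds the list through printf instead. It assumes the fields are tab-separated (as the jq filter suggests) and uses the eight lines from the post in place of the real curl output.

```shell
# Stand-in for the curl | jq output: "<iid><TAB><username>" per line.
request=$(printf '11039\tuser7\n11038\tuser6\n11037\tuser5\n11036\tuser4\n11035\tuser1\n11034\tuser3\n11033\tuser2\n11032\tuser1')

# printf '%s\n' passes the text through unchanged; count column 2 per user.
stats=$(printf '%s\n' "$request" |
  awk -F'\t' '/^[0-9]/ {a[$2]++} END {for (u in a) print u, a[u]}' |
  sort)

echo "$stats"
# user1 2
# user2 1
# user3 1
# user4 1
# user5 1
# user6 1
# user7 1
```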
