Use grep to count the number of repetitions of a word in a file.

The problem is this:

For example, I have a file "a.xml" that contains just one line:

<queue><item><cause><item>

I want to find how many times <item> occurs, which in this case is 2.

However, if I run:

grep -c "<item>" a.xml 

This gives me only 1, because grep -c counts matching lines rather than individual matches, and the file has only one line.

So my question is: how can I use a simple shell / bash command that returns the number of times <item> occurs?

It looks simple, but I just can't find a good way. Any ideas?

+4
3 answers

You can try something like:

grep -o "<item>" a.xml | wc -l
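To see why this works, here is a minimal sketch comparing -c with -o on a throwaway file. The temporary file and the variable names are only for illustration, and -o is assumed to be available (it is in GNU and BSD grep, but it is not part of POSIX).

```shell
# Build a small sample file (hypothetical data): 2 lines, 3 occurrences.
tmp=$(mktemp)
printf '%s\n' '<queue><item><cause><item>' '<item>' > "$tmp"

# -c counts the lines that contain at least one match.
lines=$(grep -c '<item>' "$tmp")

# -o prints each match on its own line, so wc -l counts occurrences.
# tr strips the padding some wc implementations add.
matches=$(grep -o '<item>' "$tmp" | wc -l | tr -d ' ')

echo "matching lines: $lines"
echo "total matches:  $matches"

rm -f "$tmp"
```

The two numbers differ as soon as any line holds more than one match, which is exactly the situation in the question.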
+8

With awk, using the pattern as the field separator:

awk -F '<item>' '{print NF-1}' a.xml

Online demo: http://ideone.com/vheDgq

If the file has more than one line, sum the counts across all lines:

awk -F '<item>' '{s+=NF-1}END{print s}' a.xml
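One thing to watch with the field-separator trick: an empty line has NF equal to 0, so NF-1 contributes -1 to the sum. A hedged variant that skips empty records might look like this. The function name count_item is made up for the example.

```shell
# count_item is a hypothetical helper; it reads the named files, or stdin if none.
count_item() {
  # "NF" as a pattern skips records with no fields (empty lines),
  # so they cannot subtract from the total.
  # "s + 0" forces a numeric 0 when nothing matched at all.
  awk -F '<item>' 'NF { s += NF - 1 } END { print s + 0 }' "$@"
}

# Sample input with an empty line in the middle.
printf '%s\n' '<queue><item><cause><item>' '' '<item>' | count_item
```

On a real file this would be called as count_item a.xml.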
+3

If you only want to count '<item>', then I like MillaresRoo's grep -o solution. If you want a count for every element, then consider:

$ sed 's/></>\n</g' a.xml | sort | uniq -c
      1 <cause>
      2 <item>
      1 <queue>

Or, by explicitly specifying the input on the command line:

$ echo '<queue><item><cause><item>' | sed 's/></>\n</g' | sort | uniq -c
      1 <cause>
      2 <item>
      1 <queue>
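Note that \n in the replacement part of s/// is understood by GNU sed but not by every sed. If that is a concern, a sketch along the same lines can let grep -o split out the tags instead. It assumes a grep that supports -o, and count_tags is just an illustrative name.

```shell
# count_tags is a hypothetical helper; it reads one file, or stdin if none is given.
count_tags() {
  # Print every <...> tag on its own line, then tally identical lines.
  grep -o '<[^<>]*>' "$@" | sort | uniq -c
}

echo '<queue><item><cause><item>' | count_tags
```

With more than one file argument grep prefixes each match with the file name, so this is meant for a single file or for standard input.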
+3