What is the best way to count find results?

Question


My current solution is find <expr> -exec printf '.' \; | wc -c, but it takes far too long when there are more than 10000 results. Is there a faster or better way to do this?

+83

bash find

MechMK1 Mar 27 '13 at 16:07


5 answers

Why not

 find <expr> | wc -l

as a simple, portable solution? Your original solution spawns a new printf process for every file found, which is very expensive (as you have just discovered).
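If you want to keep the -exec printf idea from the question, one possible variation is to batch the paths with -exec ... {} + so that printf runs once per batch instead of once per file. This is only a minimal sketch: the function name count_batched is hypothetical, and it assumes a printf that reuses its format string for extra arguments, as POSIX describes.

```shell
#!/usr/bin/env bash
# count_batched is a hypothetical helper name, not part of find.
# "{} +" hands many paths to a single printf call instead of one call per path.
# The format '%.0s.' consumes one path, prints zero characters of it, then one dot.
count_batched() {
    find "$@" -exec printf '%.0s.' {} + | wc -c
}

# Small demo on a throwaway directory.
demo_dir=$(mktemp -d)
touch "$demo_dir/a" "$demo_dir/b" "$demo_dir/c d"
count_batched "$demo_dir" -type f   # reports 3
rm -rf "$demo_dir"
```

Because each path contributes exactly one dot, the count does not depend on what characters the file names contain.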

Note that this will overcount if you have file names with embedded newlines, but if you have those, I suspect your problems run a little deeper :-)
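To see the overcount concretely, here is a small sketch that compares line counting with NUL counting on a throwaway directory. The helper names count_by_lines and count_by_nul are hypothetical, and the sketch assumes a find that supports -print0 and a tr that understands '\0'.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for the comparison; neither is a standard command.
count_by_lines() { find "$1" -type f | wc -l; }
count_by_nul()   { find "$1" -type f -print0 | tr -dc '\0' | wc -c; }

# Throwaway directory with two files, one of which has a newline in its name.
demo_dir=$(mktemp -d)
touch "$demo_dir/plain.txt" "$demo_dir/with
newline.txt"

echo "by lines: $(count_by_lines "$demo_dir")"   # reports 3: the odd name spans two lines
echo "by NULs:  $(count_by_nul "$demo_dir")"     # reports 2: one NUL per file
rm -rf "$demo_dir"
```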

+117

Brian Agnew Mar 27 '13 at 16:10


This solution is certainly slower than some of the other find -> wc solutions here, but if you want to do something else with the file names besides counting them, you can read from the find output in a loop.

 n=0
 while read -r -d ''; do
     ((n++)) # count
     # maybe perform another act on file
 done < <(find <expr> -print0)
 echo $n

This is just a modification of a solution found in the BashGuide that properly handles files with nonstandard names: find's -print0 makes the output delimiter a NUL byte, and read -d '' reads using the NUL byte as the record delimiter.
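If the names are needed afterwards rather than inside the loop, a related option is to load them into an array and take its length. This is a sketch under assumptions: it needs bash 4.4 or newer for mapfile -d, and the function name collect_matches and the array name matches are made up for illustration.

```shell
#!/usr/bin/env bash
# Requires bash 4.4 or newer for "mapfile -d".
# collect_matches is a hypothetical helper; it fills the global array "matches".
collect_matches() {
    matches=()
    # -d '' splits on NUL bytes, -t drops the delimiter from each element.
    mapfile -d '' -t matches < <(find "$@" -print0)
}

collect_matches . -type f
echo "${#matches[@]} files"
```

The array holds every path in memory, so for very large result sets the streaming loop may be the better fit.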

+4

John B Jan 11 '14 at 10:51


This is the countfiles function from my ~/.bashrc. It is reasonably fast, should work with both the Linux and FreeBSD find, and is not fooled by file paths containing newlines, because the final wc just counts NUL bytes:

 countfiles () {
     command find "${1:-.}" -type f -name "${2:-*}" -print0 | command tr -dc '\0' | command wc -c
     return 0
 }

 countfiles
 countfiles ~ '*.txt'

+2

carlo Mar 27 '13 at 17:04


I love it when I come across a speed contest. There is nothing wrong with using wc, but as long as we are comparing, here is (I think) the most portable and fastest solution:

 $ time (i=0; for d in /dev/sd*[az]; do ((i++)); done; echo $i)
 25

 real 0m0.001s
 user 0m0.000s
 sys 0m0.000s

Compared to using find / wc:

 $ time find /dev/sd*[az] | wc -l
 25

 real 0m0.006s
 user 0m0.000s
 sys 0m0.004s

 $ time find /dev/sd*[az] -printf . | wc -c
 25

 real 0m0.005s
 user 0m0.000s
 sys 0m0.000s

Note that if you need to count hidden files as well, you will need two patterns in your for loop: for devfile in /dev/.* /dev/*; do ... And it is still faster.
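A glob-based count has two edge cases worth knowing: a pattern that matches nothing is normally left as a literal word (so the loop runs once), and in many bash versions .* also matches . and .. themselves. One way to sidestep both, sketched here with a hypothetical function name count_entries, is to use bash's nullglob and dotglob options. Like the loop above, this counts only the entries directly inside one directory and does not recurse the way find does.

```shell
#!/usr/bin/env bash
# count_entries is a hypothetical helper name.
# The body runs in a subshell "( ... )" so the shopt changes do not leak out.
count_entries() (
    # nullglob: an unmatched pattern expands to nothing.
    # dotglob: "*" also matches dotfiles, but never "." or "..".
    shopt -s nullglob dotglob
    entries=("${1:-.}"/*)
    echo "${#entries[@]}"
)

count_entries /dev
```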

Happy hacking!

0

user.friendly May 31 '16 at 15:36


Gilles Quenot · Accepted Answer · 2013-03-27 16:14

Try this instead (requires a find that supports -printf):

 find <expr> -type f -printf '.' | wc -c

It will be more reliable and faster than counting lines.

Note that I use find's own -printf, not an external command.
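Since -printf is not available in every find, a wrapper can probe for it and fall back to counting NUL bytes otherwise. The following is a sketch, not a tested portability layer: the function name count_found is hypothetical, and the fallback assumes that -print0 and tr '\0' are supported.

```shell
#!/usr/bin/env bash
# count_found is a hypothetical helper name.
count_found() {
    # Probe: does this find accept -printf? -prune keeps the probe from descending.
    if find . -prune -printf '' >/dev/null 2>&1; then
        # One dot per match, written by find itself.
        find "$@" -printf '.' | wc -c
    else
        # Fallback: emit NUL-terminated names and count the NUL bytes.
        find "$@" -print0 | tr -dc '\0' | wc -c
    fi
}

count_found . -type f
```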

Let's benchmark a bit:

 $ ls -1
 a
 e
 l
 ll.sh
 r
 t
 y
 z

My snippet:

 $ time find -type f -printf '.' | wc -c
 8

 real 0m0.004s
 user 0m0.000s
 sys 0m0.007s

With full lines:

 $ time find -type f | wc -l
 8

 real 0m0.006s
 user 0m0.003s
 sys 0m0.000s

So my solution is faster =) (the important part is the real line)
