How to apply a shell command to many files in nested (and poorly escaped) subdirectories?

Question

I am trying to do something like the following:

 for file in `find . *.foo`
 do
   somecommand $file
 done

But the command does not work, because the value of $file is really weird. Since my directory tree has crappy file names (including spaces), I need to escape the output of the find command. But none of the obvious escapes seem to work: -ls gives me space-delimited fragments of the file names, and -fprint does no better.

I also tried: for file in "`find . *.foo -ls`"; do echo $file; done - but that gives all of the responses from find in one long line.

Any hints? I am happy with any workaround, but frustrated that I cannot figure this out.

Thanks Alex

(Hi Matt!)

bash shell find for-loop escaping

Alex Apr 15 '09 at 21:32

6 answers

lhunath · Answer 1 · 2009-04-16 06:17

You have plenty of answers that explain well how to do this, but for the sake of completeness I will repeat and add to them:

xargs is only ever useful for interactive use (when you know that all your file names are plain, with no spaces or quotes) or when used with the -0 option. Otherwise, it will break everything.
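To make the point about -0 concrete, here is a minimal sketch. The file name a b.foo is a made-up example, fed to xargs directly with printf rather than by find, so that the result does not depend on any real directory.

```shell
# Hypothetical input: a single file name that contains a space.
# Default xargs splits on whitespace, so the name arrives as two arguments.
plain=$(printf 'a b.foo\n' | xargs -n 1 printf '<%s>\n')

# With -0, only the NUL byte separates items, so the name stays intact.
nul=$(printf 'a b.foo\0' | xargs -0 -n 1 printf '<%s>\n')

printf 'without -0:\n%s\nwith -0:\n%s\n' "$plain" "$nul"
```

Without -0 the single name is handed over as two separate arguments; with -0 it is passed as one.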

find is a very useful tool, but using it to pipe file names into xargs (even with -0) is rather convoluted, since find can do it all by itself with either -exec command {} \; or -exec command {} +, depending on what you want:

 find /path -name 'pattern' -exec somecommand {} \;
 find /path -name 'pattern' -exec somecommand {} +

The former runs somecommand once, with a single argument, for each file found recursively in /path that matches pattern.

The latter runs somecommand with as many arguments as fit on the command line at once, for the files found recursively in /path that match pattern.

Which one to use depends on somecommand. If it can take multiple file name arguments (like rm, grep, etc.), then the latter option is faster (since you run somecommand far less often). If somecommand accepts only one argument, you need the former. So take a look at the man page of somecommand.
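The difference between the two forms can be observed by counting how often the command is started. This is only a sketch: the scratch directory and the three file names are invented for the demonstration, and sh -c 'echo "$#"' stands in for somecommand.

```shell
# Scratch tree with awkward names (all paths here are made up for the demo).
tmp=$(mktemp -d)
mkdir -p "$tmp/sub dir"
touch "$tmp/a b.foo" "$tmp/sub dir/c d.foo" "$tmp/sub dir/e.foo"

# {} \;  -> one process per file; each sh prints its argument count on one line.
per_file_runs=$(find "$tmp" -name '*.foo' -exec sh -c 'echo "$#"' sh {} \; | wc -l | tr -d ' ')

# {} +   -> files are batched; a single sh receives all three names.
batched_runs=$(find "$tmp" -name '*.foo' -exec sh -c 'echo "$#"' sh {} + | wc -l | tr -d ' ')

echo "one-per-file runs: $per_file_runs, batched runs: $batched_runs"
rm -rf "$tmp"
```

With only three short names, the batched form needs a single run; with very many files, find would start as many runs as are needed to stay within the command-line length limit.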

More on find: http://mywiki.wooledge.org/UsingFind

In bash, for is a statement that iterates over arguments. If you do something like this:

 for foo in "$bar"

you are giving for a single argument to iterate over (note the quotes!). If you do something like this:

 for foo in $bar

you are asking bash to take the contents of bar and tear it apart wherever there are spaces, tabs, or newlines (technically, whatever characters are in IFS) and use the resulting pieces as arguments to for. Those pieces are NOT file names. Assuming that tearing a long string of file names apart at every whitespace character yields a list of file names is simply wrong, as you have just noticed.
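The two for forms above can be compared by counting iterations. A minimal sketch, assuming the default IFS and a made-up value for bar that holds two file names, one of them containing a space:

```shell
# Hypothetical contents: two file names, the first containing a space.
bar=$(printf 'my file.foo\nother.foo')

unquoted=0
for foo in $bar; do          # split on IFS: "my", "file.foo", "other.foo"
    unquoted=$((unquoted + 1))
done

quoted=0
for foo in "$bar"; do        # one argument: the whole two-line string
    quoted=$((quoted + 1))
done

echo "unquoted: $unquoted iterations; quoted: $quoted iteration"
```

Neither count equals the number of files (two), which is why for is a poor fit for consuming the output of find.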

Answer: Don't use for; it is clearly the wrong tool for this. The find commands above assume that somecommand is an executable in PATH. If it is a bash statement, you will need this construct instead (it iterates over the output of find, as you tried, but safely):

 while read -r -d ''; do
     somebashstatement "$REPLY"
 done < <(find /path -name 'pattern' -print0)

This uses a while-read loop, which reads pieces of the string that find outputs until it reaches a NUL byte (which is what -print0 uses to separate file names). Since NUL bytes cannot be part of file names (unlike spaces, tabs, and newlines), this is a safe operation.
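A small runnable sketch of the while-read loop follows. It requires bash (for read -d '' and process substitution); the scratch directory and file names are invented, and it reads into a named variable with IFS= instead of relying on REPLY, which is a variation on the form shown above rather than the same code.

```shell
#!/usr/bin/env bash
# Scratch directory with awkward names (made up for the demo),
# including one that contains a newline.
tmp=$(mktemp -d)
touch "$tmp/a b.foo" "$tmp/c"$'\n'"d.foo" "$tmp/plain.foo"

count=0
while IFS= read -r -d '' file; do
    count=$((count + 1))      # runs in the current shell, so it persists
    printf 'found: %q\n' "${file##*/}"
done < <(find "$tmp" -name '*.foo' -print0)

echo "files seen: $count"
rm -rf "$tmp"
```

Because the loop is fed through process substitution rather than a pipe, it runs in the current shell and count keeps its value after the loop ends.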

If you don't need somebashstatement to be part of your script (for example, it does not change the script's environment by keeping a counter or setting a variable), then you can still use find -exec to run your bash statement:

 find /path -name 'pattern' -exec bash -c 'somebashstatement "$1"' -- {} \;
 find /path -name 'pattern' -exec bash -c 'for file; do somebashstatement "$file"; done' -- {} +

Here, -exec runs bash with -c followed by three or more arguments:

1. The bash statement to execute.
2. A -- . bash will put this in $0; you can put whatever you like here, really.
3. Your file name or file names (depending on whether you used {} \; or {} +). The file name(s) end up in $1 (and $2, $3, ... if there is more than one, of course).

The bash statement in the first find runs somebashstatement with the file name as an argument.

The bash statement in the second find command runs a for(!) loop that iterates over each positional parameter (that is what the reduced for syntax, for foo; do, does) and runs somebashstatement with the file name as its argument. The difference from the very first find statement I showed with -exec {} + is that we start only one bash process for a large number of file names, but still run somebashstatement once for each of those file names.
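How bash -c assigns its trailing arguments can be checked without find at all. In this sketch the words label, first, and second and the two file names are placeholders chosen for illustration.

```shell
#!/usr/bin/env bash
# The first argument after the -c string becomes $0, the rest become $1, $2, ...
zero=$(bash -c 'printf "%s" "$0"' label first second)
one=$(bash -c 'printf "%s" "$1"' label first second)

# "for file; do" is shorthand for: for file in "$@"; do
looped=$(bash -c 'for file; do printf "<%s>" "$file"; done' label 'a b.foo' 'c d.foo')

echo "\$0=$zero  \$1=$one  loop saw: $looped"
```

The placeholder in the $0 slot is never seen by the loop, which is why a throwaway value such as -- works there.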

All of this is also well explained on the UsingFind page above.

Varkhan · Answer 2 · 2009-04-15 21:34

Instead of relying on the shell to do the job, rely on find to do it:

 find . -name "*.foo" -exec somecommand "{}" \;

Then the file name will be properly escaped and will never be interpreted by the shell.

Tanktalus · Answer 3 · 2009-04-15 21:35

 find . -name '*.foo' -print0 | xargs -0 -n 1 somecommand

This becomes messy if you need to run several shell commands for each element.

Alister Bulman · Answer 4 · 2009-04-15 21:35

xargs is your friend. You will also want to investigate its -0 (zero) option; find (with -print0) will help produce the list. The Wikipedia page has some good examples.

Another good reason to use xargs is that if you have many files (dozens or more), xargs will split them into separate calls to whatever command it is asked to run (rm, in the first Wikipedia example).

andrewdotn · Answer 5 · 2009-04-16 01:49

 find . -name '*.foo' -print0 | xargs -0 sh -c 'for F in "${@}"; do ...; done' "${0}"
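For illustration, here is the same pattern with the elided loop body filled in by a hypothetical printf that prints each base name. The scratch directory and file names are made up, and a literal sh is used as the $0 placeholder.

```shell
tmp=$(mktemp -d)
touch "$tmp/a b.foo" "$tmp/it's.foo"

# The trailing "sh" fills $0 of the inner shell; the file names follow as "$@".
listed=$(find "$tmp" -name '*.foo' -print0 |
    xargs -0 sh -c 'for F in "$@"; do printf "%s\n" "${F##*/}"; done' sh |
    sort)

printf '%s\n' "$listed"
rm -rf "$tmp"
```

Both the space and the single quote survive, because the names only ever travel as NUL-separated items and then as quoted positional parameters.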

dreynold · Answer 6 · 2009-04-16 21:24

I had to do something similar a while ago, renaming files so that they could live in Win32 environments:

 #!/bin/bash
 # Split the ls output below on newlines only, so names with spaces survive.
 IFS=$'\n'

 function RecurseDirs
 {
     for f in "$@"
     do
         # Replace characters that are awkward on Win32 with underscores.
         newf=$(echo "${f}" | sed -e 's/[\\/:\*\?#"\|<>]/_/g')
         if [ "${newf}" != "${f}" ]; then
             echo "${f}" "${newf}"
             mv "${f}" "${newf}"
             f="${newf}"
         fi
         if [[ -d "${f}" ]]; then
             cd "${f}"
             RecurseDirs $(ls -1 ".")
         fi
     done
     cd ..
 }

 RecurseDirs .

This is probably a bit simplistic, it doesn't avoid name collisions, and I'm sure it could be done better, but it removes the need to run basename on the find results (in my case) before performing my replacements.

Might I ask what you are doing with the found files?
