Preserving leading whitespace when reading and writing a file line by line in bash

I am trying to iterate over a directory of text files and combine them into one document. This works fine, but the text files contain code snippets, and all of my formatting gets collapsed to the left: all leading whitespace on each line is stripped.

    #!/bin/sh
    OUTPUT="../best_practices.textile"
    FILES="../best-practices/*.textile"
    for f in "$FILES"
    do
        echo "Processing $f file..."
        echo "">$OUTPUT
        cat $f | while read line; do
            echo "$line">>$OUTPUT
        done
        echo >>$OUTPUT
        echo >>$OUTPUT
    done

I am admittedly a bash noob, but after searching high and low I could not find a proper solution. Bash seems to hate leading whitespace in general.

+6
bash parsing text-files cat
5 answers

Instead of:

    cat $f | while read line; do
        echo "$line">>$OUTPUT
    done

Do this:

    cat "$f" >>"$OUTPUT"

(If there is a reason why you need to do something line by line, it would be nice to include this in the question.)

+3

As others have pointed out, using cat or awk instead of a read-echo loop is a much better way to do this: it avoids the whitespace-trimming problem (and a few others you haven't run into yet), it runs faster, and, at least with cat, it is simply cleaner code. Nevertheless, I'd like to take a stab at making the read-echo loop work properly.

First, the whitespace-trimming problem: the read command automatically trims leading and trailing whitespace; this can be fixed by changing its definition of whitespace, by setting the IFS variable to empty. In addition, read assumes that a backslash at the end of a line means the next line is a continuation and should be spliced together with it; to fix this, use its -r (raw) flag. The third problem is that many implementations of echo interpret escape sequences in the string (for example, they may turn \n into an actual newline); to fix this, use printf instead. Finally, as a general rule of scripting hygiene, you shouldn't use cat when you don't actually need it; use input redirection instead. With these changes, the inner loop looks like this:

    while IFS='' read -r line; do
        printf "%s\n" "$line">>$OUTPUT
    done <$f
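To see the effect of IFS and -r in isolation, here is a minimal sketch that feeds the same line to a plain read and to IFS= read -r. The sample text and the variable names (sample, plain, raw) are made up for illustration and are not part of the original script.

```shell
# Hypothetical sample line: four leading spaces and a backslash sequence.
sample='    indented \n code'

# Plain read: leading whitespace is trimmed and the backslash is consumed
# as an escape character.
plain=$(printf '%s\n' "$sample" | { read line; printf '%s' "$line"; })

# IFS= read -r: no whitespace trimming, backslashes are kept literally.
raw=$(printf '%s\n' "$sample" | { IFS= read -r line; printf '%s' "$line"; })

# Brackets make the leading spaces visible.
printf 'plain: [%s]\n' "$plain"
printf 'raw:   [%s]\n' "$raw"
```

Note that printf '%s' is used for the output as well, so the backslash in the raw line is not reinterpreted on the way out.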

... there are also a number of other problems with the surrounding script: the line that tries to define FILES as the list of available .textile files has quotes around it, which means it never gets expanded into an actual list of files. The best way to do this is to use an array:

    FILES=(../best-practices/*.textile)
    ...
    for f in "${FILES[@]}"

(and all occurrences of $f should be in double quotes in case any of the file names contain spaces or other funny characters. You should really do the same with $OUTPUT, although since it is defined in the script it is actually safe to leave it unquoted.)
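The difference between a quoted pattern and an expanded glob can be checked with a small experiment. This is only a sketch: it uses a temporary directory with two made-up file names, and it puts the glob directly in the for list (a portable stand-in for the bash array above) so that it also runs under plain sh.

```shell
# Temporary directory with two hypothetical files, one with a space in its name.
dir=$(mktemp -d)
: > "$dir/a.textile"
: > "$dir/b c.textile"

# Quoted variable: the pattern stays one literal word, so the loop runs once.
FILES="$dir/*.textile"
quoted_count=0
for f in "$FILES"; do
    quoted_count=$((quoted_count + 1))
done

# Glob in the for list: one word per matching file, spaces preserved.
glob_count=0
for f in "$dir"/*.textile; do
    glob_count=$((glob_count + 1))
done

echo "quoted: $quoted_count, glob: $glob_count"
rm -rf "$dir"
```

The file with a space in its name still counts as a single item, which is the same guarantee "${FILES[@]}" gives with an array.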

Finally, there is an echo "">$OUTPUT near the top of the loop over the files that will erase the output file every time through (that is, at the end it contains only the last .textile file); it needs to be moved to before the loop. I'm not sure whether the intent was to put a single blank line at the beginning of the file, or three blank lines between files (and one at the beginning and two at the end), so I'm not sure exactly what the appropriate replacement is. Anyway, here is what I came up with after fixing all of these problems:

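    #!/bin/bash
    # Arrays are a bash feature, so the shebang must be bash rather than sh.
    OUTPUT="../best_practices.textile"
    FILES=(../best-practices/*.textile)

    # Truncate the output file once, before the loop.
    : >"$OUTPUT"
    for f in "${FILES[@]}"
    do
        echo "Processing $f file..."
        echo >>"$OUTPUT"
        # IFS='' keeps leading/trailing whitespace; -r keeps backslashes.
        while IFS='' read -r line; do
            printf "%s\n" "$line" >>"$OUTPUT"
        done <"$f"
        echo >>"$OUTPUT"
        echo >>"$OUTPUT"
    done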
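The truncation problem described above can be reproduced on its own. The following is a rough sketch with made-up item names and a temporary file; it compares truncating inside the loop with truncating once before it.

```shell
out=$(mktemp)

# Truncating inside the loop: each pass wipes what the previous pass wrote.
for item in first second third; do
    echo "" > "$out"
    echo "$item" >> "$out"
done
inside=$(cat "$out")

# Truncating once before the loop: every pass appends, nothing is lost.
: > "$out"
for item in first second third; do
    echo "$item" >> "$out"
done
before=$(cat "$out")

printf 'inside loop:\n%s\n' "$inside"
printf 'before loop:\n%s\n' "$before"
rm -f "$out"
```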
+40

That is an overly expensive way to merge files. Simply use:

 cat ../best-practices/*.textile > ../best_practices.textile 

If you want to add a blank line before each file as you merge them, use awk:

 awk 'FNR==1{print "">"out.txt"}{print > "out.txt" }' *.textile 

OR

 awk 'FNR==1{print ""}{print}' file* > out.txt 
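As a quick check of the FNR==1 idiom, here is a small sketch that runs the second awk command on two throwaway files. The file names and contents are invented for the demonstration.

```shell
# Two hypothetical input files; the second has an indented line.
dir=$(mktemp -d)
printf 'one\n' > "$dir/a.txt"
printf '  two\n' > "$dir/b.txt"

# FNR restarts at 1 for each input file, so the first rule prints an
# empty line before every file; the second rule prints each line as is.
merged=$(awk 'FNR==1{print ""}{print}' "$dir/a.txt" "$dir/b.txt")

printf '%s\n' "$merged"
rm -rf "$dir"
```

awk prints each record unchanged, so the indentation in the second file survives the merge.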
+3

This lets you insert newlines between the input files, as the original script did:

 for f in $FILES; do echo -ne '\n\n' | cat "$f" -; done > $OUTPUT 

Note that $FILES must be left unquoted for this to work (otherwise the extra newlines appear only once, at the end of all the output), but $f must be quoted to protect spaces in file names, if there are any.

+1

The correct answer, imo, is this, reproduced below:

    while IFS= read line; do
        check=${line:0:1}
    done < file.txt

Note that it also handles situations where the input is piped from another command, and not just read from an actual file.
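For instance, the same loop can sit at the end of a pipeline. This is only a sketch: produce is a hypothetical stand-in for whatever command generates the lines, and -r is added here so that backslashes are kept as well.

```shell
# Hypothetical producer standing in for any command that writes lines.
produce() {
    printf '  first\n\tsecond\n'
}

# The loop reads from the pipe; leading spaces and tabs are preserved.
collected=$(produce | while IFS= read -r line; do
    printf '[%s]\n' "$line"
done)

printf '%s\n' "$collected"
```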

Please note that you can also simplify redirection as shown below.

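    #!/bin/bash
    OUTPUT="../best_practices.textile"
    FILES="../best-practices/*.textile"

    # Empty the output once; appending inside the loop keeps every file.
    : > "$OUTPUT"
    # $FILES is left unquoted so the glob expands to the matching files.
    for f in $FILES
    do
        echo "Processing $f file..."
        {
            echo
            while IFS= read -r line; do
                echo "$line"
            done < "$f"
            echo
            echo
        } >> "$OUTPUT"
    done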
0
