How to determine the last line in awk before END

I am trying to add a last line to the file that I am creating. How can I detect the last line of a file in awk before END? I need this because my variables do not work in the END block, so I am trying to avoid END.

    awk '{ do some things..; add a new last line into file; }'

I want to do this before END. I don't want this:

    awk 'END { print "something new" >> "newfile.txt" }'

Thank you very much in advance.

+6
6 answers

One option is to use the getline function to process the file. It returns 1 on success, 0 at end of file, and -1 on error.

    awk '
        FNR == 1 {
            ## Process first line.
            print FNR ": " $0;
            while ( getline == 1 ) {
                ## Process from second to last line.
                print FNR ": " $0;
            }
            ## Here all lines have been processed.
            print "After last line";
        }
    ' infile

Assuming infile with this data:

    one
    two
    three
    four
    five

The output will be:

    1: one
    2: two
    3: three
    4: four
    5: five
    After last line
+10
    $ cat file
    1
    2
    3
    4
    5

Reading the same file twice (recommended)

    $ awk 'FNR==NR{last++;next}{print $0, ((last==FNR)?"I am Last":"")}' file file
    1
    2
    3
    4
    5 I am Last

Using getline

    $ awk 'BEGIN{while((getline t < ARGV[1]) > 0)last++;close(ARGV[1])}{print $0, ((last==FNR)?"I am Last":"")}' file
    1
    2
    3
    4
    5 I am Last
+5

You can get the number of lines in the file by running "wc -l < file" | getline filesize in the BEGIN block, then use NR == filesize in the body of the script to detect the last line.
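
A minimal sketch of this idea, assuming the input is a regular file (not a pipe) that ends with a newline, since wc -l counts newline characters. The file name sample.txt and the variable names cmd, filesize and result are illustrative only, not part of the original answer.

```shell
# Create a small sample file (hypothetical name, for illustration only).
tmpdir=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$tmpdir/sample.txt"

# Count the lines once in BEGIN, then compare NR against that count.
result=$(awk '
    BEGIN {
        # Run wc through the shell and read its single line of output.
        cmd = "wc -l < \"" ARGV[1] "\""
        cmd | getline filesize
        close(cmd)
        filesize += 0    # force a number; some wc versions pad with spaces
    }
    NR == filesize { print $0 " <- last line"; next }
    { print }
' "$tmpdir/sample.txt")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

The file is read twice (once by wc, once by awk), so this only suits input that can be reopened.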

+2

Try the following:

    awk '{
        if (NR > 1) {
            # process str
            print str;
        }
        str = $0;
    }
    END {
        # process whatever needed before printing the last line and then print
        print str;
    }'
+2

You can use ENDFILE; it runs before END:

    $ awk 'END {print "end"} ENDFILE{print "last line"}' /dev/null /dev/null
    last line
    last line
    end

ENDFILE is a GNU awk (gawk) extension, available since version 4.0, I think.
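
For an awk without ENDFILE, a rough portable approximation is to notice the first line of the next file (FNR == 1 while NR > 1) and to handle the final file in END. This is a swapped-in technique, not ENDFILE itself: it does not fire for empty files. The file names a.txt and b.txt and the variable prev are hypothetical.

```shell
# Two small sample files (hypothetical names, for illustration only).
tmpdir=$(mktemp -d)
printf 'a1\na2\n' > "$tmpdir/a.txt"
printf 'b1\n' > "$tmpdir/b.txt"

result=$(awk '
    # A new file has started, so prev holds the last line of the previous file.
    FNR == 1 && NR > 1 { print "last line was: " prev }
    { prev = $0; print }
    # The final file has no successor, so it is finished here.
    END { if (NR > 0) print "last line was: " prev }
' "$tmpdir/a.txt" "$tmpdir/b.txt")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```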

+1

I know an answer has already been accepted, but it is simply wrong.

You want to use awk as a parser, not as a general-purpose programming language.

Awk should be used inside Unix pipelines, not to carry arbitrary logic.

I had the same problem and solved it in awk as follows:

    nlines=$(wc -l < file)

    cat file | awk -v nl="${nlines}" '{if (nl != NR) {print $0, ",", "\\";} else {print;}}' > "${someout}"

There is an important point here: pipes, flushing and RAM.

If you let awk emit its output as it goes, you can pipe it on to the next process.

If you use getline, especially in a loop, you may not see the end.

getline should be used only for a single line and a possible dependency on the next line.
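
As an illustration of that limited use, here is a small sketch that pulls in exactly one following line when the current line ends with a backslash. The sample input and the variable names nextline and result are made up for the example, and it handles only a single continuation per record.

```shell
# Input: "alpha \" continues onto "beta"; "gamma" stands alone.
result=$(printf 'alpha \\\nbeta\ngamma\n' | awk '
    # A trailing backslash means this record continues on the next line.
    /\\$/ {
        sub(/\\$/, "")
        # Read one more line only; the > 0 test guards against EOF and errors.
        if ((getline nextline) > 0)
            $0 = $0 nextline
    }
    { print }
')

printf '%s\n' "$result"
```

Everything else still flows through the normal pattern-action loop, so the output can be piped onward line by line.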

I love awk, but we can't do everything with it!

Edit:

For whoever voted on my answer, I just want to present this script:

    #! /bin/sh
    #
    # Generate random strings
    cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 32 | head -n 100000 > xr100000
    cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 32 | head -n 1000000 > xr1000000
    cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 32 | head -n 5000000 > xr5000000
    #
    # To save you time in case
    #cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 32 | head -n 10000000 > xr10000000
    #
    # Generate awk files
    cat <<"EOF" > awkGetline.sh
    #! /bin/sh
    #
    awk '
      FNR == 1 {
          ## Process first line.
          print FNR ": " $0;
          while ( getline == 1 ) {
              ## Process from second to last line.
              print FNR ": " $0;
          }
      }
    ' xr
    #
    EOF
    #
    chmod +x awkGetline.sh
    #
    cat <<"EOF" > awkPlain.sh
    #! /bin/sh
    #
    awk '
      {print FNR ": " $0;}
    ' xr
    #
    EOF
    #
    # xr100000
    #
    chmod +x awkPlain.sh
    #
    # Execute awkGetline.sh 10 times on xr100000
    rm -f xt
    cp xr100000 xr
    for runInstance in 1 2 3 4 5 6 7 8 9 10; do /usr/bin/time -p -a -o xt ./awkGetline.sh > x.1.out; done;
    #
    cat xt | grep real | awk 'BEGIN {sum=0.0} {sum=sum+$2; print $2, sum/10;} END {print "SUM Getln", sum;}' | grep SUM
    #
    #
    # Execute awkPlain.sh 10 times on xr100000
    rm -f xt
    cp xr100000 xr
    for runInstance in 1 2 3 4 5 6 7 8 9 10; do /usr/bin/time -p -a -o xt ./awkPlain.sh > x.1.out; done;
    #
    cat xt | grep real | awk 'BEGIN {sum=0.0} {sum=sum+$2; print $2, sum/10;} END {print "SUM Plain", sum;}' | grep SUM
    #
    #
    # xr1000000
    #
    chmod +x awkPlain.sh
    #
    # Execute awkGetline.sh 10 times on xr1000000
    rm -f xt
    cp xr1000000 xr
    for runInstance in 1 2 3 4 5 6 7 8 9 10; do /usr/bin/time -p -a -o xt ./awkGetline.sh > x.1.out; done;
    #
    cat xt | grep real | awk 'BEGIN {sum=0.0} {sum=sum+$2; print $2, sum/10;} END {print "SUM Getln", sum;}' | grep SUM
    #
    #
    # Execute awkPlain.sh 10 times on xr1000000
    rm -f xt
    cp xr1000000 xr
    for runInstance in 1 2 3 4 5 6 7 8 9 10; do /usr/bin/time -p -a -o xt ./awkPlain.sh > x.1.out; done;
    #
    cat xt | grep real | awk 'BEGIN {sum=0.0} {sum=sum+$2; print $2, sum/10;} END {print "SUM Plain", sum;}' | grep SUM
    #
    #
    # xr5000000
    #
    chmod +x awkPlain.sh
    #
    # Execute awkGetline.sh 10 times on xr5000000
    rm -f xt
    cp xr5000000 xr
    for runInstance in 1 2 3 4 5 6 7 8 9 10; do /usr/bin/time -p -a -o xt ./awkGetline.sh > x.1.out; done;
    #
    cat xt | grep real | awk 'BEGIN {sum=0.0} {sum=sum+$2; print $2, sum/10;} END {print "SUM Getln", sum;}' | grep SUM
    #
    #
    # Execute awkPlain.sh 10 times on xr5000000
    rm -f xt
    cp xr5000000 xr
    for runInstance in 1 2 3 4 5 6 7 8 9 10; do /usr/bin/time -p -a -o xt ./awkPlain.sh > x.1.out; done;
    #
    cat xt | grep real | awk 'BEGIN {sum=0.0} {sum=sum+$2; print $2, sum/10;} END {print "SUM Plain", sum;}' | grep SUM
    #
    exit; # To save you time in case
    #
    # xr10000000
    #
    chmod +x awkPlain.sh
    #
    # Execute awkGetline.sh 10 times on xr10000000
    rm -f xt
    cp xr10000000 xr
    for runInstance in 1 2 3 4 5 6 7 8 9 10; do /usr/bin/time -p -a -o xt ./awkGetline.sh > x.1.out; done;
    #
    cat xt | grep real | awk 'BEGIN {sum=0.0} {sum=sum+$2; print $2, sum/10;} END {print "SUM Getln", sum;}' | grep SUM
    #
    #
    # Execute awkPlain.sh 10 times on xr10000000
    rm -f xt
    cp xr10000000 xr
    for runInstance in 1 2 3 4 5 6 7 8 9 10; do /usr/bin/time -p -a -o xt ./awkPlain.sh > x.1.out; done;
    #
    cat xt | grep real | awk 'BEGIN {sum=0.0} {sum=sum+$2; print $2, sum/10;} END {print "SUM Plain", sum;}' | grep SUM
    #

And, of course, the first results:

    tmp]$ ./awkRun.sh
    SUM Getln 0.78
    SUM Plain 0.71
    SUM Getln 7.2
    SUM Plain 6.49
    SUM Getln 35.91
    SUM Plain 32.92

So you save about 10% of the time just by not using getline.
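
The figure can be checked from the timings listed in the results. The sketch below computes the time saved by the plain version relative to the getline version for each file size; the numbers are copied from the output shown earlier, and the variable name result is just for the example.

```shell
# Each input line: getline total, plain total (seconds, sum of 10 runs).
result=$(printf '0.78 0.71\n7.2 6.49\n35.91 32.92\n' | awk '
    # $1 = getline version, $2 = plain version
    { printf "%.1f%%\n", ($1 - $2) / $1 * 100 }
')

printf '%s\n' "$result"
```

This prints 9.0%, 9.9% and 8.3% for the three file sizes.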

Put this into more complex logic and you may get an even worse picture. This simple version does not take memory into account, and memory does not seem to play a role here. But it can also matter once you move on to more complex logic ...

Of course, try it on your own machine.

That is why I suggested considering other options in general.

0

Source: https://habr.com/ru/post/923861/