AWK use multiple spaces as a delimiter

Question


I use the command below to merge two files using the first two columns.

awk 'NR==FNR{a[$1,$2]=substr($0,3);next} ($1,$2) in a{print $0, a[$1,$2] > "br0102_3.txt"}' br01.txt br02.txt

Now, by default, awk splits fields on runs of whitespace. But a field in my files may itself contain a single space between two words, for example:

File 1:

 ABCD    TEXT1 TEXT2  123123112312312312312312312312312312
 BCDEFG  TEXT3TEXT4   133123123123123123123123123125423423
 QWERT   TEXT5TEXT6   123123123123125456678786789698758567

File 2:

 ABCD    TEXT1 TEXT2  12312312312312312312312312312
 BCDEFG  TEXT3TEXT4   31242342342342342342342342343
 MNHT    TEXT8 TEXT9  31242342342342342342342342343

I want the result file:

 ABCD    TEXT1 TEXT2  123123112312312312312312312312312312  12312312312312312312312312312
 BCDEFG  TEXT3TEXT4   133123123123123123123123123125423423  31242342342342342342342342343
 QWERT   TEXT5TEXT6   123123123123125456678786789698758567
 MNHT    TEXT8 TEXT9  31242342342342342342342342343

Any clues?

+8

unix awk

Apurv Nov 10 '14 at 11:12


2 answers

Your files use fixed-width fields, so you can use GNU awk's FIELDWIDTHS (or similar) to separate the fields. For example, if the second field is the 15 characters from char 8 to char 22 inclusive in this file:

 $ cat file
 abc    def ghi        klm
 AAAAAAABCDEFGH       IJJJJ
 abc     def ghi       klm
 $ awk -v FIELDWIDTHS="7 15 4" '{print "<" $2 ">"}' file
 <def ghi        >
 <BCDEFGH       I>
 < def ghi       >

Any solution that relies on a certain number of spaces between fields will fail as soon as two adjacent fields are separated by only one space, or by none at all.
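A small sketch of that failure mode, using a hypothetical line (not taken from the question's files) in which a longer first field leaves only a single space before the second field. With a separator of two or more whitespace characters, the first two fields are no longer split apart:

```shell
# Hypothetical input: only ONE space separates "ABCDEFG" from "TEXT1 TEXT2".
line='ABCDEFG TEXT1 TEXT2  123'

# Split on runs of two or more whitespace characters and count the fields.
nf=$(printf '%s\n' "$line" | awk -F '[[:space:]][[:space:]]+' '{print NF}')
echo "$nf"
```

The intended layout has three fields, but awk reports two here, because "ABCDEFG TEXT1 TEXT2" is treated as a single field.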

If you want to strip leading/trailing whitespace from the target field(s):

 $ awk -v FIELDWIDTHS="7 15 4" '{gsub(/^\s+|\s+$/,"",$2); print "<" $2 ">"}' file
 <def ghi>
 <BCDEFGH       I>
 <def ghi>
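FIELDWIDTHS is a GNU awk extension. For other awks, one possible equivalent is to cut the field out with substr(). The sketch below assumes the same layout as the example (field 2 occupies characters 8 to 22); the sample lines and the file names `file` and `trimmed.txt` are made up for illustration.

```shell
# Sample lines laid out so that field 2 occupies characters 8-22 (assumed layout).
printf '%s\n' \
  'abc    def ghi        klm' \
  'AAAAAAABCDEFGH       IJJJJ' \
  'abc     def ghi       klm' > file

awk '{
  f = substr($0, 8, 15)            # the 15 characters starting at character 8
  gsub(/^[ \t]+|[ \t]+$/, "", f)   # strip leading/trailing blanks and tabs
  print "<" f ">"
}' file > trimmed.txt
cat trimmed.txt
```

Only the edges of the field are trimmed, so any spaces inside the field are preserved.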

+4

Ed Morton Nov 10 '14 at 14:49

Etan Reisner · Accepted Answer · 2014-11-10T11:26:59+0000

awk accepts a regex as the FS value, so you can specify a regex that matches two or more whitespace characters. Something like -F '[[:space:]][[:space:]]+'.

 $ awk '{print NF}' File2
 4
 3
 4
 $ awk -F '[[:space:]][[:space:]]+' '{print NF}' File2
 3
 3
 3
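As a rough sketch, the same separator can be dropped into the merge command from the question. The sample data below is shortened and made up, and storing `$3` instead of `substr($0,3)` is an assumption about what the command was meant to keep. Like the original command, this only prints lines whose key appears in both files.

```shell
# Made-up sample data: columns are separated by two or more spaces.
printf '%s\n' \
  'ABCD    TEXT1 TEXT2  111' \
  'BCDEFG  TEXT3TEXT4   222' \
  'QWERT   TEXT5TEXT6   333' > br01.txt
printf '%s\n' \
  'ABCD    TEXT1 TEXT2  aaa' \
  'BCDEFG  TEXT3TEXT4   bbb' \
  'MNHT    TEXT8 TEXT9  ccc' > br02.txt

awk -F '[[:space:]][[:space:]]+' '
  NR == FNR { a[$1, $2] = $3; next }          # first file: remember column 3 per key
  ($1, $2) in a { print $0 "  " a[$1, $2] }   # second file: append the stored value
' br01.txt br02.txt > br0102_3.txt
cat br0102_3.txt
```

With this data the output file contains the ABCD and BCDEFG lines from br02.txt, each followed by the matching value from br01.txt. The unmatched QWERT and MNHT rows are not printed; covering them as in the desired result would need extra logic.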
