Separating delimited input in awk

Question

I have seen many posts asking a similar question, but I cannot make it work.

The input looks like this:

<field one with spaces>|<field two with spaces>

I am trying to parse it with awk.

I tried several FS settings taken from other posts:

    FS = "^[\x00- ]*|[\x00- ]*[|][\x00- ]*|[\x00- ]*$";
    FS = "^[\x00- ]*|[\x00- ]*\|[\x00- ]*|[\x00- ]*$";
    FS = "^[\x00- ]*|[\x00- ]*\\|[\x00- ]*|[\x00- ]*$";

I am still unable to get the pipe delimiter to work.

Using CentOS.

Any help?

scorpdaddy Aug 2 '11 at 19:56

1 answer

shellter · Answer 1 · 2011-08-02T20:00:26+0000

    echo "field one has spaces | field two has spaces" \
    | awk '
      BEGIN { FS = "|" }
      {
        print $2
        print $1
        # or whatever you want
      }'

    # output (the spaces around the pipe stay in the fields)
    #  field two has spaces
    # field one has spaces

You can also shorten this to

    awk -F'|' '{ print $2; print $1 }'

Edit: Also note that not all awk implementations accept a multi-character regular expression as the FS value.
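If a regex FS is not available, one workaround is to split on the single `|` character and trim each field afterwards. The following is only a sketch under assumptions: `trim_fields` is a made-up helper name, and it treats only spaces and tabs as padding, which may not match your real input.

```shell
# Hypothetical helper: split on a literal pipe, then strip leading and
# trailing blanks (space or tab) from every field with gsub.
trim_fields() {
  awk -F'|' '{
    for (i = 1; i <= NF; i++) {
      gsub(/^[ \t]+|[ \t]+$/, "", $i)
    }
    print $2
    print $1
  }'
}

# Sample line with padding on both sides of the delimiter.
out=$(printf '%s\n' '  field one has spaces  |  field two has spaces  ' | trim_fields)
printf '%s\n' "$out"
```

Because FS stays a single character here, this does not depend on regex support in FS; the trimming is done by `gsub`, which is standard awk.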

Edit 2: I missed this at first, but I see that you are including \x00 in the character classes before and after the | character. I assume you mean \x00 to be the null character? I don't think awk can parse a file with embedded null characters. You can preprocess your input, for example

    tr '\000' ' ' < file.txt > spacesForNulls.txt

or delete them altogether with

    tr -d '\000' < file.txt > deletedNulls.txt

and eliminate this part of your regular expression. But, as stated above, some awks do not support a regex for the FS value. Also, I don't use the tr trick very often, so you may find that the null character needs a slightly different notation, depending on your version of tr (the octal form '\000' is the portable one).
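To check the preprocessing idea end to end, here is a small sketch that builds a line with null bytes around the pipe, deletes them with tr, and then splits with a plain single-character FS. The sample text and the variable name `clean` are assumptions for illustration only.

```shell
# printf emits NUL bytes for the octal escape \000 in its format string.
# tr -d removes them before awk ever sees the line.
clean=$(printf 'field one\000|\000field two\n' \
  | tr -d '\000' \
  | awk -F'|' '{ print $2; print $1 }')
printf '%s\n' "$clean"
```

With the nulls gone, the character classes around the pipe in the original FS attempts are no longer needed for that purpose.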

Hope this helps.
