How to split a delimited string into an array in awk?

Question

How do I split a string in awk when it contains pipe characters (|)? I want to split it into an array.

I tried

 echo "12:23:11" | awk '{split($0,a,":"); print a[3] a[2] a[1]}'

which works fine. If my string is like "12|23|11", how do I split it into an array?

+85

string split unix awk

Mohamed Saligh Nov 04 '11 at 13:10


7 answers

To split a string into an array in awk , we use the split() function:

  awk '{split($0, a, ":")}'
  #           ^^  ^  ^^^
  #            |  |   |
  #       string  |   delimiter
  #               |
  #               array to store the pieces

If no delimiter is given, split() uses FS, which defaults to whitespace:

 $ awk '{split($0, a); print a[2]}' <<< "a:b c:d e"
 c:d

We can also give a separator, for example a colon:

 $ awk '{split($0, a, ":"); print a[2]}' <<< "a:bc:de"
 bc

This is equivalent to setting it through FS:

 $ awk -F: '{split($0, a); print a[2]}' <<< "a:bc:de"
 bc

In gawk, you can also provide a delimiter as a regular expression:

 $ awk '{split($0, a, ":*"); print a[2]}' <<< "a:::bc::de"   # note the multiple colons
 bc

And you can even see what the separator was at each step, using the fourth parameter:

 $ awk '{split($0, a, ":*", sep); print a[2]; print sep[1]}' <<< "a:::bc::de"
 bc
 :::

Quoting the man page:

split(string, array [, fieldsep [, seps ] ])
Divide string into pieces separated by fieldsep and store the pieces in array and the separator strings in the seps array. The first piece is stored in array[1], the second piece in array[2], and so forth. The string value of the third argument, fieldsep, is a regexp describing where to split string (much as FS can be a regexp describing where to split input records). If fieldsep is omitted, the value of FS is used. split() returns the number of elements created. seps is a gawk extension, with seps[i] being the separator string between array[i] and array[i+1]. If fieldsep is a single space, then any leading whitespace goes into seps[0] and any trailing whitespace goes into seps[n], where n is the return value of split() (i.e., the number of elements in array).
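To make the return value and the 1-based indexing concrete, here is a minimal sketch that uses only portable awk features (no seps array, no here-string). The variable names out, parts and n are just illustrative, and the input is the string from the question:

```shell
# Sketch: split() returns the number of pieces, and the array starts at index 1.
out=$(printf '%s\n' '12|23|11' | awk '{
    n = split($0, parts, "|")      # n holds the number of pieces
    printf "%d:", n
    for (i = 1; i <= n; i++)       # walk the pieces in their original order
        printf " %s", parts[i]
    print ""
}')
echo "$out"
```

This should print the count followed by the three pieces, i.e. 3: 12 23 11.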

+36

fedorqui Mar 24 '16 at 23:28


Please be more specific! What do you mean by "doesn't work"? Post the exact output (or error message), your OS and your awk version:

 % awk -F\| '{ for (i = 0; ++i <= NF;) print i, $i }' <<<'12|23|11'
 1 12
 2 23
 3 11

Or using split:

 % awk '{ n = split($0, t, "|"); for (i = 0; ++i <= n;) print i, t[i] }' <<<'12|23|11'
 1 12
 2 23
 3 11

Edit: On Solaris, you will need to use the POSIX awk (/usr/xpg4/bin/awk) in order to process 4000 fields correctly.

+11

Dimitre Radoulov Nov 04


 echo "12|23|11" | awk '{split($0,a,"|"); print a[3] a[2] a[1]}'

+3

Schildmeijer Nov 04 '11 at 13:15


 echo "12|23|11" | awk '{split($0,a,"|"); print a[3] a[2] a[1]}'

should work.

+2

codaddict Nov 04


I do not like the echo "..." | awk ... approach, because it makes unnecessary fork and exec system calls.

I prefer Dimitre's solution with a slight twist:

 awk -F\| '{print $3 $2 $1}' <<<'12|23|11'

Or a slightly shorter version:

 awk -F\| '$0=$3 $2 $1' <<<'12|23|11'

Here the assignment rebuilds the record from the concatenated fields. The assigned value is a non-empty string, which counts as a true condition, so the record is printed.
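One edge case of the pattern-only form is worth spelling out: if the concatenation comes out empty, the condition is false and nothing is printed at all. The sketch below illustrates this with a made-up input line (||). The variable names are illustrative, and it pipes from printf only so that it runs in any POSIX shell, not as a recommendation over the here-string:

```shell
# Non-empty result: the assignment is true, so the rebuilt record is printed.
nonempty=$(printf '%s\n' '12|23|11' | awk -F'|' '$0 = $3 $2 $1')

# All three fields empty: the assigned value is "", which is false,
# so awk prints nothing for this line.
empty=$(printf '%s\n' '||' | awk -F'|' '$0 = $3 $2 $1')

# An explicit print action emits the (empty) line regardless; count the lines.
lines=$(printf '%s\n' '||' | awk -F'|' '{ print $3 $2 $1 }' | wc -l | tr -d ' ')

echo "nonempty=[$nonempty] empty=[$empty] lines=$lines"
```

If every input line must produce an output line, the explicit { print ... } form is the safer choice.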

In this particular case, the stdin redirection can be avoided altogether by passing the string in as an awk variable:

 awk -v T='12|23|11' 'BEGIN{split(T,a,"|");print a[3] a[2] a[1]}'

I used ksh for quite some time, but in bash this can be handled with the shell's built-in string manipulation. In the first case, the original string is split on the delimiter. In the second case, it is assumed that the string always contains two-digit pairs separated by a single-character delimiter.

 T='12|23|11';echo -n ${T##*|};T=${T%|*};echo ${T#*|}${T%|*}
 T='12|23|11';echo ${T:6}${T:3:2}${T:0:2}

The result in all cases is 112312.

+2

TrueY Feb 10 '16 at 10:12


Joking? :)

How about echo "12|23|11" | awk '{split($0,a,"|"); print a[3] a[2] a[1]}'

This is my output:

 p2> echo "12|23|11" | awk '{split($0,a,"|"); print a[3] a[2] a[1]}'
 112312

so I guess it works after all.

+1

duedl0r Nov 04


Calin Paul Alexandru · Accepted Answer · 2011-11-04 13:15

Have you tried:

 echo "12|23|11" | awk '{split($0,a,"|"); print a[3],a[2],a[1]}'
