Is it possible to use two different field separators in awk and store values from both in variables?

I suppose my general question is: can I give awk a field separator, save one of the tokens in a variable, then give awk another field separator, save one of the tokens in a second variable, and then print both variables? It seems that the variables store a reference to the $n-th token rather than the value itself.

A concrete example that I had in mind more or less follows this form: {Animal}, {species} class

 Cat, Felis catus MAMMAL
 Dog, Canis lupus familiaris MAMMAL
 Peregrine Falcon, Falco peregrinus AVIAN
 ...

and you want it to output something like:

 Cat MAMMAL
 Dog MAMMAL
 Peregrine Falcon AVIAN
 ...

where the output I want fits the form: {Animal} class

Anything enclosed in {} can contain any number of spaces.

My original idea: I would have something like this:

 cat test.txt | awk '{FS=","}; {animal=$1}; {FS=" "}; {class=$NF}; {print animal, class}' > animals.txt

I expect the variable "animal" to hold whatever is to the left of the comma, and "class" to hold the class of that animal (MAMMAL, etc.). What actually happens is that only the last field separator is applied, so it breaks for names that contain spaces, such as Peregrine Falcon.

So the output looks like:

 Cat, MAMMAL
 Dog, MAMMAL
 Peregrine AVIAN
4 answers

One way, using awk:

 awk -F, '{ n = split($2,array," "); printf "%s, %s\n", $1, array[n] }' file.txt 

Results:

 Cat, MAMMAL
 Dog, MAMMAL
 Peregrine Falcon, AVIAN

You can always call split() inside your awk script. You can also modify the record or its fields, which causes the entire line to be re-split. For example, this produces the results in your question:

 awk '{cl=$NF; split($0,a,", "); printf("%s, %s\n", a[1], cl)}' test.txt 
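The re-splitting mentioned above can also rescue the approach from the question. Assigning to FS only affects records read afterwards, but assigning $0 to itself makes awk split the current record again with the FS in effect at that moment. The following is a minimal sketch, assuming gawk-like behavior and a temporary sample file (the file and the variable names `input` and `result` are illustrative, not from the original post):

```shell
# Sample input, assumed to match the format shown in the question.
input=$(mktemp)
printf '%s\n' \
  'Cat, Felis catus MAMMAL' \
  'Dog, Canis lupus familiaris MAMMAL' \
  'Peregrine Falcon, Falco peregrinus AVIAN' > "$input"

result=$(awk '{
    FS = ","; $0 = $0    # re-split the current record on commas
    animal = $1          # everything to the left of the comma
    FS = " "; $0 = $0    # re-split the same record on whitespace
    class = $NF          # last whitespace-separated token
    print animal, class
}' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

Because the variables are assigned between the two re-splits, each one holds a copy of the field value, not a reference to the field.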

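The field separator for awk can be any regular expression, but in this case it may be easier to use the record separator instead, setting it to [,\n] so that records alternate between the two pieces you want (a regular-expression RS is supported by gawk and mawk, but POSIX does not require it):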

 awk -v RS='[,\n]' 'NR % 2 { printf("%s, ", $0) } NR % 2 == 0 { print $NF }' 

Thus, odd-numbered records (the animal names) are printed in full, and for even-numbered records only the last field (the class) is printed.
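Since the field separator itself can be a regular expression, the species part can also be treated as one large separator. This is a rough sketch, assuming every line contains exactly one comma and has no trailing whitespace; the temporary file and the variable names `input` and `result` are made up for the illustration:

```shell
# Sample input, assumed to match the format shown in the question.
input=$(mktemp)
printf '%s\n' \
  'Cat, Felis catus MAMMAL' \
  'Dog, Canis lupus familiaris MAMMAL' \
  'Peregrine Falcon, Falco peregrinus AVIAN' > "$input"

# FS is the regex ",.* ": a comma, then anything, up to the last space.
# awk regex matching is leftmost-longest, so the separator swallows the
# whole species name, leaving $1 = animal and $2 = class.
result=$(awk -F',.* ' '{ print $1, $2 }' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

This keeps everything in a single pass with no split() call, at the cost of being more fragile if the input format varies.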

 paste -d, <(cut -d, -f1 input.txt) <(awk '{print $NF}' input.txt) 
  • cut first column
  • awk get the last column
  • paste them together

output:

 Cat,MAMMAL
 Dog,MAMMAL
 Peregrine Falcon,AVIAN