Extract multiple captured groups from sed to variables

I have the following text

abc <THIS> abc <THAT> abc <WHAT> abc 

where abc is a placeholder for a well-defined expression. I would like to extract the 3 words in angle brackets and store them in 3 separate variables. Can this be done without matching the pattern 3 separate times? Basically, I would like to capture and somehow “export” several groups.

It is clear that I can extract one of them as follows:

 VARIABLE=`echo $TEXT | sed "s_abc <\(.*\)> abc <.*> abc <.*> abc_\1_g"` 

But is it possible to get all 3 of them without running sed 3 times?

Other (portable) solutions without sed are also welcome.

+8
unix bash shell sed macos
3 answers

If there is some character that you know will never appear in THIS, THAT, or WHAT (a tab, in this example), you can write something like this:

    IFS=$'\t' read -r VAR1 VAR2 VAR3 \
        < <(sed 's/^abc <\(.*\)> abc <\(.*\)> abc <\(.*\)> abc$/\1\t\2\t\3/' \
                <<< "$TEXT"
           )

This tells sed to use that separator in its output, and tells read to use the same separator when splitting its input.
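One thing to watch: `\t` in the replacement part of `s///` is a GNU sed extension, and as far as I know BSD sed (the one shipped with macOS) does not turn it into a tab. Here is a minimal sketch of a variant that should avoid the problem, assuming only a POSIX shell. It splices a real tab character into the sed script and hands the result to read through a here-document. The names `tab`, `script` and `fields` are illustrative, not part of the answer above.

```shell
TEXT='abc <THIS> abc <THAT> abc <WHAT> abc'

# A literal tab character; command substitution only strips trailing newlines.
tab=$(printf '\t')

# Build the sed script in double quotes so ${tab} expands to a real tab.
# \$ keeps the end-of-line anchor from being read as a shell expansion.
script="s/^abc <\(.*\)> abc <\(.*\)> abc <\(.*\)> abc\$/\1${tab}\2${tab}\3/"

# A single sed run yields the three captures separated by tabs.
fields=$(printf '%s\n' "$TEXT" | sed "$script")

# Split on tabs only; the here-document keeps read in the current shell,
# so the variables are still set afterwards.
IFS="$tab" read -r VAR1 VAR2 VAR3 <<EOF
$fields
EOF

printf '%s\n' "$VAR1" "$VAR2" "$VAR3"
```

As with the original, this only works if the captured words never contain a tab.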

+10

This may work for you (GNU sed and bash):

    line='abc <THIS> abc <THAT> abc <WHAT> abc'
    var=($(sed 's/[^<]*<\([^>]*\)>[^<]*/"\1" /g' <<<"$line"))
    echo "first ${var[0]} second ${var[1]} third ${var[2]}"
    # Output:
    # first "THIS" second "THAT" third "WHAT"
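One caveat: the unquoted `$(...)` relies on word splitting, so the array elements keep the literal double quotes, and a bracketed value containing spaces would be split apart. Here is a sketch of a variant that emits one capture per line and reads the lines back. It assumes the captures contain no newlines. The variable names `captures`, `first`, `second` and `third` and the sample value `THIS ONE` are made up for illustration.

```shell
line='abc <THIS ONE> abc <THAT> abc <WHAT> abc'

# Replace each "...<word>..." chunk with the word followed by a newline.
# Backslash-newline in the replacement is the portable way to emit a newline.
captures=$(printf '%s\n' "$line" | sed 's/[^<]*<\([^>]*\)>[^<]*/\1\
/g')

# Read one capture per line; IFS= keeps leading and trailing blanks intact.
{
  IFS= read -r first
  IFS= read -r second
  IFS= read -r third
} <<EOF
$captures
EOF

echo "first $first second $second third $third"
# prints: first THIS ONE second THAT third WHAT
```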
+5

No need to start a separate process; shell parameter expansion alone can do it:

    var='abc <THIS> abc <THAT> abc <WHAT> abc'
    var1=${var#abc <}           # Remove the leading 'abc <'.
    THIS="${var1%%> abc <*}"    # Remove the longest trailing '> abc <*'.
    var2="${var1#*> abc <}"     # Remove the shortest leading '*> abc <'.
    THAT="${var2%%> abc <*}"    # Remove the longest trailing '> abc <*'.
    var3="${var2#*> abc <}"     # Remove the shortest leading '*> abc <'.
    WHAT="${var3%> abc}"        # Remove the trailing '> abc'.
    echo "$THIS"
    echo "$THAT"
    echo "$WHAT"
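If the angle brackets themselves never occur inside the captured words, the same result can probably also be had by letting read split the line on them. This is a different technique from the parameter expansions above, offered as a sketch for a POSIX shell. `skip` is a throwaway variable name chosen here to absorb the abc parts.

```shell
var='abc <THIS> abc <THAT> abc <WHAT> abc'

# Split the line on '<' and '>': the fields alternate between the
# surrounding "abc" text and the bracketed words. The odd-numbered
# fields are dumped into the throwaway variable "skip".
IFS='<>' read -r skip THIS skip THAT skip WHAT skip <<EOF
$var
EOF

echo "$THIS"
echo "$THAT"
echo "$WHAT"
```

Unlike the parameter-expansion version, this one does not check that the text between the brackets is actually abc; it only relies on the position of the brackets.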
+2