How to extract text from a string using sed?

My sample line looks like this:

This is 02G05 a test string 20-Jul-2012 

From this line I want to extract 02G05. I tried the following regex with sed:

 $ echo "This is 02G05 a test string 20-Jul-2012" | sed -n '/\d+G\d+/p' 

But this command prints nothing, so I assume the pattern I passed to sed does not match anything.

What am I doing wrong here, and how do I fix it?

When I use the same string and pattern in Python, I get the expected result:

 >>> re.findall(r'\d+G\d+', st)
 ['02G05']
+55
bash regex sed
Jul 19 '12 at 20:34
5 answers

Your sed may not support the \d pattern. Try [0-9] or [[:digit:]] instead.

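To print only the actual match (not the entire matching line), use a substitution that replaces the whole line with the captured group. The [^0-9] before the group keeps the greedy .* from swallowing the leading digit of the match.

 sed -n 's/.*[^0-9]\([0-9][0-9]*G[0-9][0-9]*\).*/\1/p' 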
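
The role of -n together with the p flag may be easier to see on more than one line of input. The following is a minimal sketch, not part of the original answer: the second input line is made up for illustration, and the [^0-9] in front of the group is an addition in this sketch that stops the greedy .* from consuming the first digit of the match.

```shell
# Two input lines: only the first one contains the pattern.
# -n suppresses sed's default printing; the trailing p prints a line
# only when the substitution succeeded on it.
out=$(printf '%s\n' 'This is 02G05 a test string 20-Jul-2012' 'no match here' |
  sed -n 's/.*[^0-9]\([0-9][0-9]*G[0-9][0-9]*\).*/\1/p')

echo "$out"
```

Without -n and p, the non-matching line would be printed unchanged alongside the extracted text.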
+51
Jul 19 '12 at 20:39

How about using egrep?

 echo "This is 02G05 a test string 20-Jul-2012" | egrep -o '[0-9]+G[0-9]+' 
+57
Jul 19 '12

sed does not recognize \d; use [[:digit:]] instead. You also need to escape the + (as \+) or use the -r switch (-E on OS X).

Note that [0-9] also works for Hindu-Arabic numerals.
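
As a rough sketch of the two options above, the commands below extract the same text once with basic syntax and once with extended syntax enabled by -E. The variable names are only for illustration, the sample line is the one from the question, and the [^[:digit:]] in front of the group is an assumption of this sketch that keeps the greedy .* from eating the leading digit.

```shell
line='This is 02G05 a test string 20-Jul-2012'

# Basic syntax (sed's default): \{1,\} means "one or more".
# \+ also works in GNU sed, but it is an extension there.
bre=$(printf '%s\n' "$line" |
  sed -n 's/.*[^[:digit:]]\([[:digit:]]\{1,\}G[[:digit:]]\{1,\}\).*/\1/p')

# Extended syntax: with -E, + and ( ) are special without backslashes.
ere=$(printf '%s\n' "$line" |
  sed -n -E 's/.*[^[:digit:]]([[:digit:]]+G[[:digit:]]+).*/\1/p')

echo "$bre"
echo "$ere"
```

Both commands should print the same extracted token; the only difference is how much escaping the pattern needs.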

+4
Jul 19 '12 at 20:37

Try this instead:

 echo "This is 02G05 a test string 20-Jul-2012" | sed 's/.* \([0-9]\+G[0-9]\+\) .*/\1/' 

But note: if the pattern matches twice on one line, this prints only the second match.
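
That behaviour can be checked with a small sketch. The input line here is hypothetical, and [0-9][0-9]* is used in place of [0-9]\+ only so the command does not depend on GNU sed; grep -o is shown as one way to list every match instead.

```shell
# Hypothetical line containing two tokens that match the pattern.
line='ids 02G05 and 11G22 end'

# The leading ".* " is greedy, so it runs past the first token
# and the captured group ends up holding the last one.
last=$(printf '%s\n' "$line" |
  sed 's/.* \([0-9][0-9]*G[0-9][0-9]*\) .*/\1/')

# grep -o prints every match, one per line.
all=$(printf '%s\n' "$line" | grep -oE '[0-9]+G[0-9]+')

echo "$last"
echo "$all"
```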

+3
Jul 19 '12 at 20:40

Try using rextract ( https://github.com/kata198/rextract ), which allows you to extract text using a regular expression and reformat it.

Example:

[$] echo "This is a 02G05 test line 20-Jul-2012" | ./rextract '([\d]+G[\d]+)' '${1}'

2G05

0
Sep 13 '16 at 3:03
