REGEX - matches the Nth word of a string containing a specific word

Question

I am trying to write a regex that does the following:

Match the Nth word of a string containing a specific word

For instance:

Input:

 this is the first line - blue
 this is the second line - green
 this is the third line - red

I want to match the 7th word of each line containing the word "second".

Desired output:

 green

Does anyone know how to do this?

I use http://rubular.com/ to test REGEX.

I already tried this regex without success; it matches the next line:

 (.*second.*)(?<data>.*?\s){7}(.*)

--- UPDATED ---

Example 2

Input:

 this is the Foo line - blue
 this is the Bar line - green
 this is the Test line - red

I want to match the 4th word of each line containing the word "red".

Desired output:

 Test

In other words, the word I want to match can appear either before or after the word I use to select the line.

+6

regex

Jorge Jan 31 '14 at 16:36


2 answers

You requested a regex and you got a very good answer.

Sometimes it is better to ask for a solution without specifying the tool.

Here is a one-liner that I think suits your case best:

 awk '/second/ {print $7}' < inputFile.txt

Explanation:

 /second/ - for any line that matches this regex (in this case, the literal 'second')
 print $7 - print the 7th field (by default, fields are separated by whitespace)

I think this is much easier to understand than the regular expression, and it is more flexible for this kind of processing.
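For readers without awk at hand, the same select-then-split idea can be sketched in Python. This is only an approximation under stated assumptions: the function name nth_field_of_matching_lines is made up for illustration, the keyword is tested as a plain substring rather than as a regex, and lines with fewer than n fields are skipped (awk would print an empty line for them).

```python
def nth_field_of_matching_lines(text, keyword, n):
    """Return the n-th (1-based) whitespace-separated field of each line
    that contains `keyword` as a substring.

    Hypothetical helper, roughly mirroring: awk '/keyword/ {print $n}'
    """
    results = []
    for line in text.splitlines():
        if keyword in line:          # plain substring test, not a regex
            fields = line.split()    # split on runs of whitespace, like awk's default
            if len(fields) >= n:     # skip short lines instead of printing an empty field
                results.append(fields[n - 1])
    return results


sample = (
    "this is the first line - blue\n"
    "this is the second line - green\n"
    "this is the third line - red"
)

print(nth_field_of_matching_lines(sample, "second", 7))
```

Because the line is selected first and split afterwards, the position of the keyword relative to the wanted word does not matter, which covers both examples in the question.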

+3

Floris Jan 31 '14 at 17:27


Jerry · Accepted Answer · 2014-01-31T16:42:09+0000

You can use this to match a line containing second and capture its 7th word:

 ^(?=.*\bsecond\b)(?:\S+ ){6}(\S+)

Make sure global and multi-line flags are active.

^ matches the beginning of a line.

(?=.*\bsecond\b) is a positive lookahead that makes sure the word second appears somewhere in this line.

(?:\S+ ){6} matches the first 6 words, each followed by a single space.

(\S+) captures the 7th.


You can apply the same principle to other requirements.

For example, to select lines containing red and get their 4th word:

 ^(?=.*\bred\b)(?:\S+ ){3}(\S+)
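As a rough way to try both patterns outside an online tester, here is a minimal Python sketch. The helper name nth_word_in_matching_lines is hypothetical, and it assumes words are separated by exactly one space, as in the patterns above. re.MULTILINE plays the role of the multi-line flag, and findall stands in for the global flag.

```python
import re


def nth_word_in_matching_lines(text, keyword, n):
    """Return the n-th (1-based) word of every line that contains
    `keyword` as a whole word. Hypothetical helper for illustration.
    """
    pattern = re.compile(
        r"^(?=.*\b" + re.escape(keyword) + r"\b)"  # lookahead: keyword is somewhere on this line
        r"(?:\S+ ){" + str(n - 1) + r"}"           # skip the first n-1 words
        r"(\S+)",                                  # capture the n-th word
        re.MULTILINE,                              # let ^ match at the start of each line
    )
    return pattern.findall(text)


sample_1 = (
    "this is the first line - blue\n"
    "this is the second line - green\n"
    "this is the third line - red"
)
sample_2 = (
    "this is the Foo line - blue\n"
    "this is the Bar line - green\n"
    "this is the Test line - red"
)

print(nth_word_in_matching_lines(sample_1, "second", 7))  # ['green']
print(nth_word_in_matching_lines(sample_2, "red", 4))     # ['Test']
```

Since the lookahead only checks the line without consuming it, the keyword can sit before or after the captured word.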
