Regular expression that matches all words except those [inside brackets]

I am trying to write a regular expression that matches every whole word in a given string but skips words inside brackets. I currently have one regex that matches all the words:

/[a-z0-9]+(-[a-z0-9]+)*/i 

I also have a regex that matches all the words inside the brackets:

 /\[(.*)\]/i 

I basically want to match everything that matches the first regular expression, but nothing that matches the second regular expression.

Example input text: http://gist.github.com/222857 It should match each word separately, except the ones in brackets.

Any help is appreciated. Thanks!

+4
6 answers

Perhaps you could do this in two steps:

  • Remove all text in brackets.
  • Use a regular expression to match the remaining words.

Using one regex to try to do both of these things will be more complicated than it should be.
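The two steps above can be sketched roughly as follows. This is only a minimal illustration: the helper name `words_outside_brackets` and the sample string are made up for the example (the gist input is not reproduced here), and the word pattern is the one from the question with its group made non-capturing so that `scan` returns whole words.

```ruby
# Word pattern from the question; (?:...) keeps String#scan returning
# whole matches instead of the captured group.
WORD = /[a-z0-9]+(?:-[a-z0-9]+)*/i

# Hypothetical helper, not part of any library.
def words_outside_brackets(text)
  # Step 1: remove every [bracketed] span (non-greedy, so each pair of
  # brackets is removed on its own). A space is left behind so that
  # neighbouring words do not get glued together.
  without_brackets = text.gsub(/\[.*?\]/, ' ')
  # Step 2: match the remaining words.
  without_brackets.scan(WORD)
end

sample = "foo [bar baz] well-known [skip-me] qux" # made-up input
p words_outside_brackets(sample) # => ["foo", "well-known", "qux"]
```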

+3

Here is one way to do it:

 your_text.scan(/\[.*?\]|([a-z0-9]+(?:-[a-z0-9]+)*)/i) - [[nil]] 
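To see why subtracting `[[nil]]` helps, here is a small sketch on a made-up string. It assumes a non-greedy bracket pattern, so each bracket pair is consumed separately. When the bracket alternative matches, the capture group is empty and `scan` yields `[nil]` for that match; removing those entries leaves only the words.

```ruby
text = "foo [bar baz] well-known" # made-up input, not the gist
pattern = /\[.*?\]|([a-z0-9]+(?:-[a-z0-9]+)*)/i

# With one capture group, scan returns one single-element array per match.
raw = text.scan(pattern)
p raw # => [["foo"], [nil], ["well-known"]]

# Drop the bracket matches, then unwrap the inner arrays.
words = (raw - [[nil]]).flatten
p words # => ["foo", "well-known"]
```

The `flatten` call is an addition for convenience; without it you get an array of single-element arrays.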
+1

What version of Ruby are you using? If it's 1.9 or later, this should do what you want:

 /(?<![\[a-z0-9-])[a-z0-9]+(-[a-z0-9]+)*(?![\]a-z0-9-])/i 
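A quick way to check what this lookaround pattern does is to run it on a couple of made-up strings. The sketch below uses the same pattern with the inner group made non-capturing (an adjustment so that `scan` returns whole words); the constant name `OUTSIDE` is just for the example. It appears to handle single bracketed words, but a word in the middle of a multi-word bracket touches neither `[` nor `]`, so it still matches.

```ruby
# Same lookbehind/lookahead pattern, inner group made non-capturing.
OUTSIDE = /(?<![\[a-z0-9-])[a-z0-9]+(?:-[a-z0-9]+)*(?![\]a-z0-9-])/i

# One word per bracket: the bracketed word is skipped.
single = "foo [bar] well-known".scan(OUTSIDE)
p single # => ["foo", "well-known"]

# Several words in one bracket: "bar" and "qux" are skipped because they
# touch a bracket, but "baz" is surrounded by spaces and still matches.
multi = "foo [bar baz qux] end".scan(OUTSIDE)
p multi # => ["foo", "baz", "end"]
```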
+1

I don't think I understood the question correctly. Why not just create a new string that does not contain anything matched by the second regular expression:

 string1 = string1.gsub(/\[(.*)\]/, '') 

Off the top of my head, that removes what you do not want matched and leaves the result in string1. I have not tested this yet; I can check it later.

0

I agree with Schnap. Without additional information, deleting what you do not want sounds like the easiest way, but the pattern should be /\[(.*?)\]/. After that you can split on \s.

If you are trying to iterate over every word and want each word as a separate match, you can cheat a little: string.split(/\W+/). You will lose quotes and the like, but you will get every word.
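For comparison, here is a rough sketch of both splitting options on a made-up string, after removing the bracketed text with a non-greedy pattern. Note that splitting on `\W+` also breaks hyphenated words apart, which may or may not be what you want.

```ruby
text = "foo [bar baz] well-known, \"qux\"" # made-up input

# Remove each bracketed span separately (non-greedy).
stripped = text.gsub(/\[(.*?)\]/, '')

# Splitting on whitespace keeps punctuation attached to the words.
by_space = stripped.split(/\s+/)
p by_space # => ["foo", "well-known,", "\"qux\""]

# Splitting on non-word characters drops punctuation and hyphens.
by_nonword = stripped.split(/\W+/)
p by_nonword # => ["foo", "well", "known", "qux"]
```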

0

It works:

 [^\[][a-z0-9]+(-[a-z0-9]+)* 

The idea is that if a word is immediately preceded by an opening bracket, it should not be matched.

By the way, is there a reason why you capture words with dashes in them? If that is not needed, your regular expression can be simplified.

0
