Regex: matching everything before the first underscore, and everything between the first two underscores

I have a string like

test_abc_HelloWorld_there could be more here. 
  • With a regular expression, I want to match the first word, the one before the first underscore. So get "test".

I tried [A-Za-z]{1,}_ , but that didn't work.

  • Then I would like to get "abc", or whatever sits between the first two underscores.

These should be two separate regular expressions, not one combined expression.

Any help is much appreciated!

Example:

For 1), the regular expression should match the word "test".
For 2), the regular expression should match the word "abc".

Any other match, in either case, would be wrong. To illustrate, if I replaced what I matched, I would get something like the following:

For case 1), match "test" and replace it with "Goat":

 'Goat_abc_HelloWorld_there could be more here' 

I don't actually need the replacement; I just want to match the word.

+4
2 answers

In both cases, you can use lookaround assertions.

 ^[^_]+(?=_) 

matches everything up to the first underscore on the line, and

 (?<=_)[^_]+(?=_) 

will match any string between two underscore characters.
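As a quick check, here is a minimal Ruby sketch of both patterns applied to the string from the question. The helper names `first_segment` and `between_underscores` are made up for illustration, and the sketch assumes a regex engine with lookbehind support (Ruby 1.9 and later has it).

```ruby
# Hypothetical helpers, named here only for illustration.

# Everything before the first underscore on the line.
# The lookahead checks for the underscore without consuming it.
def first_segment(str)
  str[/^[^_]+(?=_)/]
end

# Every run of non-underscore characters with an underscore on both sides.
def between_underscores(str)
  str.scan(/(?<=_)[^_]+(?=_)/)
end

s = "test_abc_HelloWorld_there could be more here"

p first_segment(s)               # "test"
p between_underscores(s)         # ["abc", "HelloWorld"]
p between_underscores(s).first   # "abc"

# The attempt from the question consumes the underscore and is not anchored,
# so it matches every word that is followed by an underscore:
p s.scan(/[A-Za-z]{1,}_/)        # ["test_", "abc_", "HelloWorld_"]
```

Note that the second pattern matches every segment enclosed by underscores, so take the first match if only "abc" is wanted.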

+15

Take a step back: I think you might be overengineering the solution here. Ruby has a split method for this, and other languages probably have their own equivalents.

With something like "AAPL_annual_i.xls", you can just split it and take advantage of the fact that your data is already structured.

 string_object = "AAPL_annual_i.xls"
 ary = string_object.split("_")     #=> ["AAPL", "annual", "i.xls"]
 # The last element still holds the name and extension; split it on the dot.
 extension = ary[2].split(".")[1]   #=> "xls"
 filetype = ary[2].split(".")[0]    #=> "i", etc.

D'oh!

But seriously, I have found that relying on the split method is easier not only for me, but also for the colleagues who need to read my code and understand what it does.
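Applied to the string from the question, the same idea might look like the sketch below. The method name `leading_segments` is hypothetical. The limit argument to `split` keeps the rest of the string in one piece, and `File.extname` and `File.basename` are standard Ruby helpers that could replace the manual split on "." in the filename example.

```ruby
# Hypothetical helper: returns the first `count` underscore-separated segments.
def leading_segments(str, count = 2)
  # A limit of count + 1 stops splitting after `count` underscores,
  # so the remainder of the string stays intact in the last element.
  str.split("_", count + 1).first(count)
end

first, second = leading_segments("test_abc_HelloWorld_there could be more here")
p first    # "test"
p second   # "abc"

# For the filename example, the standard library can take the extension apart.
last_part = "AAPL_annual_i.xls".split("_").last   # "i.xls"
p File.extname(last_part)                         # ".xls"
p File.basename(last_part, ".*")                  # "i"
```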

+3
