Extracting tokens from a string with a regular expression in .NET

I am wondering if this is possible with Regex. I want to extract tokens from a line like:

Select a [COLOR] and a [SIZE]. 

Ok, simple enough - I can use (\[[A-Z]+\])

However, I also want to extract the text between the tokens. Basically, I want the resulting groups to be:

 "Select a " "[COLOR]" " and a " "[SIZE]" "." 

What is the best approach for this? If there is a way to do this with Regex, that would be great. Otherwise, I assume I need to extract the tokens, then loop over the MatchCollection manually and cut out the substrings using the index and length of each match. Note that I need to preserve the order of the text segments and tokens. Is there a better algorithm for this kind of string parsing?

+7
2 answers

Use Regex.Split(s, @"(\[[A-Z]+\])"); it should give you exactly the array you are after. When the pattern contains a capturing group, Split includes the captured text as elements of the result array, so the tokens stay in order alongside the text between them.
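To make that concrete, here is a minimal sketch of the Regex.Split approach. The class name TokenSplitter, the helper SplitTokens, and the StartsWith/EndsWith token check are illustrative choices, not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

static class TokenSplitter
{
    // Hypothetical helper: splits on [UPPERCASE] tokens and keeps them,
    // because the pattern wraps the token in a capturing group.
    public static string[] SplitTokens(string s)
    {
        return Regex.Split(s, @"(\[[A-Z]+\])");
    }

    static void Main()
    {
        string[] parts = SplitTokens("Select a [COLOR] and a [SIZE].");

        foreach (string part in parts)
        {
            // Simple assumption for this sketch: a part wrapped in
            // square brackets is a token, anything else is plain text.
            bool isToken = part.StartsWith("[") && part.EndsWith("]");
            string kind = isToken ? "token" : "text";
            Console.WriteLine(kind + ": \"" + part + "\"");
        }
        // Prints:
        // text: "Select a "
        // token: "[COLOR]"
        // text: " and a "
        // token: "[SIZE]"
        // text: "."
    }
}
```

One thing to watch for: if a token sits at the very start or end of the input, or two tokens are adjacent, the result array contains empty strings at those positions, so you may want to filter them out.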

+11

Here is a regex-free method that uses String.Split, but you lose the delimiters (the square brackets).

    string s = "Select a [COLOR] and a [SIZE].";
    string[] sParts = s.Split('[', ']');

    foreach (string sPart in sParts)
    {
        Debug.WriteLine(sPart);
    }

    // Output:
    // Select a 
    // COLOR
    //  and a 
    // SIZE
    // .
0