Using XRegExp.matchRecursive for nested spans

Question

I want a way to get all the content between an opening span tag and its closing tag. The problem is that I may have nested spans, and I want to be sure that my regular expression does not stop at the first closing span tag it sees.

To see my problem, look at this: Regex101: nested span

I want to be sure that I get everything between the opening tag and its matching closing tag, no matter how many </span> tags appear inside.

I found a library by Steven Levithan that could do what I need. The problem is that its example is basic, and I'm not sure whether I can achieve what I want with it.

I am using the XRegExp.matchRecursive method. The example passes a start tag and an end tag as delimiters. My start tag is a bit complicated; it looks like this: <span style=\\?"color:([a-zA-Z\s]*?)\\?"> . The problem is that when I call the method with this delimiter, I get this error: string contains unbalanced delimiters . The string I tested:

 <p style=\"text-align:justify\"> <span style=\"font-size:12pt\"> <span style=\"color:Green\"> <span style=\"font-family:Verdana\">There is some content for a mm advertisment.There is some co</span> <span style=\"font-family:Times New Roman\">ntent for a mm advertisment.</span> </span> </span> </p>

I think my problem is with the regex I use as the start delimiter. As explained in the docs, we need to add an extra level of backslash escaping in the regular expression. I therefore tried this regular expression as the start delimiter: <span style=\\\\?"color:([a-zA-Z\\s]*?)\\\\?"> . It still doesn't work. I don't see how to use this method to get everything between a span that has a color style attribute and its closing tag.

Maybe someone has a solution?

+5

javascript html regex xregexp

Ganbin Jul 07 '15 at 10:27

2 answers

Could you use a parser that is more powerful than regular expressions? Regular expressions are, generally speaking, not well suited to parsing non-regular languages, although many implementations provide extensions beyond "pure" regular expressions in the theoretical sense.

+1

plamut Jul 07 '15 at 10:41

randomsimon · Accepted Answer · 2015-07-07T11:06:46+0000

So, the problem you're hitting is the error "string contains unbalanced delimiters".

This is because your start delimiter matches only one of the span start tags in the test input (the one that specifies the color), but your end delimiter matches all four span end tags.
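
The mismatch can be checked by counting how often each delimiter matches the test input. This is only a quick sketch in plain JavaScript, without XRegExp; the variable names are made up for illustration, and the end delimiter is assumed to be a plain </span>.

```javascript
// Test input from the question. String.raw keeps the backslashes before the quotes.
const input = String.raw`<p style=\"text-align:justify\"> <span style=\"font-size:12pt\"> <span style=\"color:Green\"> <span style=\"font-family:Verdana\">There is some content for a mm advertisment.There is some co</span> <span style=\"font-family:Times New Roman\">ntent for a mm advertisment.</span> </span> </span> </p>`;

// Start delimiter from the question, written as a regex literal.
const startDelimiter = /<span style=\\?"color:([a-zA-Z\s]*?)\\?">/g;
// End delimiter (assumed): any closing span tag.
const endDelimiter = /<\/span>/g;

// match() with the g flag returns all matches, or null if there are none.
const startCount = (input.match(startDelimiter) || []).length;
const endCount = (input.match(endDelimiter) || []).length;

console.log(startCount, endCount); // logs: 1 4
```

One opening delimiter against four closing ones cannot balance, whatever the escaping level of the pattern.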

I think you'll have to approach this by first matching all the span tags (with the library you found) and then going through the results to find the ones you need.
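
As a rough illustration of that idea, here is a sketch that does not use XRegExp at all. It swaps in a hand-rolled stack: it scans every opening and closing span tag, tracks nesting, and keeps only the content of spans whose opening tag matches the color pattern. The function name innerOfMatchingSpans and its parameters are hypothetical, and the sketch assumes the input contains no span-like text inside comments or attribute values.

```javascript
// Hypothetical helper (not part of XRegExp): returns the inner HTML of every
// <span> whose opening tag matches `wanted`, using a stack to track nesting.
// `wanted` must not have the g flag, because test() is called on it repeatedly.
function innerOfMatchingSpans(html, wanted) {
  const anySpanTag = /<span\b[^>]*>|<\/span>/g; // every opening or closing span tag
  const stack = [];   // one entry per span that is currently open
  const results = [];
  let m;
  while ((m = anySpanTag.exec(html)) !== null) {
    if (!m[0].startsWith('</')) {
      // Opening tag: remember where its content starts and whether we want it.
      stack.push({ contentStart: anySpanTag.lastIndex, keep: wanted.test(m[0]) });
    } else {
      // Closing tag: it belongs to the most recently opened span.
      const open = stack.pop();
      if (!open) throw new Error('Unbalanced </span> at position ' + m.index);
      if (open.keep) results.push(html.slice(open.contentStart, m.index));
    }
  }
  if (stack.length > 0) throw new Error('Unclosed <span> tag(s)');
  return results;
}

// Test input from the question. String.raw keeps the backslashes before the quotes.
const html = String.raw`<p style=\"text-align:justify\"> <span style=\"font-size:12pt\"> <span style=\"color:Green\"> <span style=\"font-family:Verdana\">There is some content for a mm advertisment.There is some co</span> <span style=\"font-family:Times New Roman\">ntent for a mm advertisment.</span> </span> </span> </p>`;

// The color pattern from the question, anchored so it must match a whole opening tag.
const colorSpan = /^<span style=\\?"color:[a-zA-Z\s]*?\\?">$/;

const results = innerOfMatchingSpans(html, colorSpan);
console.log(results.length); // logs: 1
console.log(results[0]);     // the two font-family spans nested inside the color span
```

Because every span tag takes part in the nesting count, the inner </span> tags close their own spans and the color span is only closed by its real partner.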
