Regular expression to find the smallest possible match

I use the JavaScript /(<mos>[\s\S]*?<\/mos>)/g regular expression to search for XML blocks in a log file that looks something like this:

 Entry 1: <mos>...</mos> Entry 2: <mos>...</mos> 

However, sometimes the logging process encounters an error and does not finish writing the entry, in which case the file looks like this:

 Entry 1: <mos>Error! Entry 2: <mos>...</mos> 

When this happens, the regular expression matches everything from the opening <mos> in entry 1 to the closing </mos> in entry 2, which causes problems when the XML is processed later.
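
A minimal reproduction of the problem, assuming a shortened sample string in place of the real log (the variable names are illustrative only):

```javascript
// The lazy pattern from above, unchanged.
const lazyPattern = /(<mos>[\s\S]*?<\/mos>)/g;

// Shortened stand-in for the log: entry 1 was never closed.
const sample = "Entry 1: <mos>Error! Entry 2: <mos>...</mos>";

// With the g flag, match() returns every full match (or null if none).
const found = sample.match(lazyPattern) || [];
console.log(found);
// → [ '<mos>Error! Entry 2: <mos>...</mos>' ]
```

The lazy quantifier only shortens the match from the right, so it still starts at the first <mos> it finds.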

It seems that matching the closing tags first and then looking back for their respective opening tags would avoid this, but I don't know how to do that, or whether it is even possible with regular expressions.


Clarification: the ... inside the blocks delimited by the start and end tags may contain newline characters.

javascript regex
1 answer

This should fit your needs:

 <mos>((?:[\s\S](?!<mos>))+?)</mos> 


Remember to escape the slash in </mos> if you write this as a JavaScript regular expression literal.
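
As a rough sketch of how this could be used, the snippet below writes the pattern as a JavaScript literal with the slash escaped and runs it against a made-up log string modeled on the question. The variable names and the sample text are assumptions for illustration. The second pattern, `mosBlockStrict`, is not part of the answer above: it is a variant that moves the lookahead in front of each character, which should also reject a <mos> that directly follows the opening tag and allow empty blocks.

```javascript
// The answer's pattern as a JavaScript regex literal. The slash in the
// closing tag is escaped (<\/mos>) because "/" delimits the literal.
const mosBlock = /<mos>((?:[\s\S](?!<mos>))+?)<\/mos>/g;

// Hypothetical log text modeled on the question; Entry 1 was never closed.
const log =
  "Entry 1: <mos>Error! Entry 2: <mos>line one\nline two</mos> Entry 3: <mos>ok</mos>";

// With the g flag, match() returns every full match (or null if none).
const blocks = log.match(mosBlock) || [];
console.log(blocks);
// → [ '<mos>line one\nline two</mos>', '<mos>ok</mos>' ]

// Assumed variant, not from the answer: the lookahead runs before each
// character, so a <mos> directly after the opening tag is rejected too,
// and *? lets an empty <mos></mos> block match.
const mosBlockStrict = /<mos>((?:(?!<mos>)[\s\S])*?)<\/mos>/g;
const strictBlocks = "<mos><mos>x</mos>".match(mosBlockStrict) || [];
console.log(strictBlocks);
// → [ '<mos>x</mos>' ]
```

The unclosed block from Entry 1 is skipped because the match cannot extend past a character that is immediately followed by another <mos>.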
