How to parse this using a PEG grammar?

I am trying to write a parser using pegjs. I need to parse something like:

blah blah START Lorem ipsum dolor sit amet, consectetur adipiscing elit END foo bar etc. 

I am having trouble writing a rule that captures the text between "START" and "END".

1 answer

Use negative lookahead predicates:

    phrase
      = (!"START" .)* "START" result:(!"END" .)* "END" .*
        {
          // remove the empty element added by predicate matching
          for (var i = 0; i < result.length; ++i) {
            result[i] = result[i][1];
          }
          return result.join("");
        }

You need a negative predicate before END as well as before START, because repetition in pegjs is greedy: a bare .* would consume the rest of the input, and "END" could then never match.
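To make the role of the predicate concrete, here is a minimal hand-written sketch in plain JavaScript of what `(!"END" .)*` does. It is an illustration only, not the code pegjs generates, and the helper names `matchUntil` and `parsePhrase` are hypothetical.

```javascript
// Hand-written emulation of the PEG expression (!stop .)*
// Hypothetical helper for illustration; not part of pegjs.
function matchUntil(input, pos, stop) {
  var chars = [];
  // !stop : succeed only if `stop` does NOT begin at the current position
  // .     : then consume exactly one character
  while (pos < input.length && !input.startsWith(stop, pos)) {
    chars.push(input[pos]);
    pos += 1;
  }
  return { text: chars.join(""), pos: pos };
}

// Emulates: (!"START" .)* "START" result:(!"END" .)* "END" .*
function parsePhrase(input) {
  var before = matchUntil(input, 0, "START");
  if (!input.startsWith("START", before.pos)) {
    throw new Error('Expected "START"');
  }
  var body = matchUntil(input, before.pos + "START".length, "END");
  if (!input.startsWith("END", body.pos)) {
    throw new Error('Expected "END"');
  }
  // The trailing .* accepts whatever is left, so there is nothing more to check.
  return body.text;
}
```

Calling `parsePhrase` on the sample line from the question returns the text between the two markers, including the spaces next to them. If the lookahead check were dropped from the loop, it would run to the end of the input and the test for "END" would always fail, which is the greedy behavior described above.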

Alternatively, the action can be written as

 {return result.join("").split(',').join("");} 

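However, this relies on how join treats nested arrays: each subarray is converted to a string with its elements separated by commas, and the resulting strings are then concatenated. It also strips any literal commas from the matched text, such as the one after "amet" in the example.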
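The join behavior can be checked in plain JavaScript without pegjs. The sketch below builds by hand the kind of array that `(!"END" .)*` produces for the body `a,b`. It assumes the predicate slot holds `undefined`; some pegjs versions may use an empty string instead, which stringifies the same way.

```javascript
// One [predicateValue, char] pair per matched character of "a,b".
// Assumption: the predicate slot is undefined.
var result = [[undefined, "a"], [undefined, ","], [undefined, "b"]];

// Each pair is stringified with a comma separator; undefined becomes "".
var joined = result.join("");                // ",a,,,b"

// Removing every comma also removes the literal comma from the text.
var viaSplit = joined.split(",").join("");   // "ab"

// Picking element [1] of each pair keeps the literal comma.
var viaIndex = result
  .map(function (pair) { return pair[1]; })
  .join("");                                 // "a,b"
```

The split-and-join shortcut is therefore only safe when the captured text cannot contain commas. Indexing into each pair does not have that limitation.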

[UPDATE] A shorter way to handle empty elements is

    phrase
      = (!"START" .)* "START" result:(t:(!"END" .) { return t[1]; })* "END" .*
        { return result.join(""); }