Why use another regex engine (like PCRE) as a pragma?

I am curious when it is better to use a different regex engine in place of Perl's default one, and why the modules I have seen are pragmas rather than a more traditional OO/procedural interface.

I have seen several modules that replace Perl's regex engine with PCRE (re::engine::PCRE), TRE (re::engine::TRE) or RE2 (re::engine::RE2) within a given lexical scope. I cannot find any object-oriented modules for creating and compiling regular expressions that use these other engines. I am curious why someone decided to implement this functionality as a pragma rather than as a more typical module. Replacing Perl's regex engine seems much more complicated (depending on the complexity of the API Perl exposes) than writing an XS wrapper around the API that PCRE, TRE, and RE2 already provide.

1 answer

I am wondering ... why the modules I have seen are pragmas rather than a more traditional OO/procedural interface.

This is likely because the Perl regex API, documented in perldoc perlreapi and available since 5.9.5, lets an engine plug into Perl's own parser and regex operators, which gives you a lot of functionality for very little code.

If you use the API, you:

  • do not need to implement your own versions of split and the substitution operator s///
  • do not need to write your own code to parse regular expression modifiers (msixpn are passed as flags to your implementation's callback functions)
  • can take advantage of optimizations, such as constant regular expressions being compiled only once (at compile time) and regular expressions containing interpolated variables being recompiled only when the variables change
  • can use qr in your programs to quote regular expressions and easily interpolate them into other regular expressions
  • can easily set numbered and named capture variables, for example $1 and $+{foo}
  • do not force users of your engine to rewrite all of their code to use your API; they can just add the pragma
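To make the list above concrete, here is a minimal sketch that exercises those built-in features (qr, interpolation, split, s///, numbered and named captures). It runs on Perl's default engine as written. The commented-out `use` line is an assumption: it presumes re::engine::PCRE is installed from CPAN, and the source does not confirm that every construct below behaves identically under another engine. The variable names are made up for illustration.

```perl
use strict;
use warnings;

# Assumption: with re::engine::PCRE (or TRE / RE2) installed, uncommenting
# the next line would switch engines for this lexical scope, with no other
# change to the code below.
# use re::engine::PCRE;

my $word = qr/(?<word>\w+)/;             # qr// quoting
my $pair = qr/$word=(\d+)/;              # interpolation into another regex

my $line   = "alpha=1;beta=22";
my @fields = split /;/, $line;           # split uses the active engine

my ($name, $value);
if ($fields[1] =~ $pair) {
    ($name, $value) = ($+{word}, $2);    # named and numbered captures
}

(my $masked = $line) =~ s/\d+/N/g;       # s/// with the /g modifier

print "name=$name value=$value masked=$masked\n";
```

With an OO or procedural wrapper, each of these operations would need its own method call instead of the ordinary Perl syntax.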

There are most likely others that I have missed. The point is that you get a lot of code and functionality for free with the API. For example, if you look at the re::engine::PCRE implementation, it is actually quite short (<400 lines of XS code).

Alternatives

If you're just looking for an easier way to implement your own regex engine, check out re::engine::Plugin, which allows you to write your own implementations in Perl instead of C/XS. Note that it comes with a long list of caveats, including limitations around split and s///.

Alternatively, instead of implementing a fully custom engine, you can extend the built-in engine using overloaded constants, as described in perldoc perlre. This only works on the constant parts of regular expressions; you must explicitly convert variables before interpolating them into a regular expression.
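The overloaded-constant approach can be sketched as follows. This is a single-file adaptation of the idea in perldoc perlre, not a copy of it: the package name `MyRe` and the helper `convert` are hypothetical, and the made-up escape `\Y|` stands for a boundary between whitespace and non-whitespace. In real code the package would live in its own file and be loaded with `use MyRe;`.

```perl
use strict;
use warnings;

# Hypothetical helper package; normally this would be MyRe.pm.
package MyRe;
use overload ();

# Rewrite the made-up escape \Y| into a whitespace/non-whitespace boundary.
sub convert {
    my ($re) = @_;
    $re =~ s/\\Y\|/(?:(?=\\S)(?<!\\S)|(?!\\S)(?<=\\S))/g;
    return $re;
}

# Register convert() as the handler for constant pieces of regexes.
sub import { overload::constant('qr' => \&convert) }

package main;

BEGIN { MyRe->import }    # what "use MyRe;" would do for a separate file

# Constant patterns: the handler rewrites \Y| at compile time.
my $at_boundary = "foo bar" =~ /\Y|bar/ ? 1 : 0;
my $mid_word    = "foobar"  =~ /\Y|bar/ ? 1 : 0;

# Interpolated pattern: the variable must be converted by hand first.
my $raw       = '\Y|bar';
my $converted = MyRe::convert($raw);
my $via_var   = "foo bar" =~ /$converted/ ? 1 : 0;

print "constant, at boundary: $at_boundary\n";
print "constant, mid-word:    $mid_word\n";
print "converted variable:    $via_var\n";
```

The handler only ever sees literal pattern text, which is why `$raw` has to go through `convert` explicitly before it is interpolated.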
