I am wondering ... why the modules I saw are pragmas and not the more traditional OO / procedure interface.
This is most likely because the Perl regex API, documented in perldoc perlreapi and available since 5.9.5, lets you take advantage of Perl's own parser, which gives you a lot of useful functionality for very little code.
If you use the API, you:
- don't need to implement your own versions of split and the substitution operator s///
- don't need to write your own code to parse regular expression modifiers (msixpn are passed as flags to your implementation's callback functions)
- can take advantage of optimizations, such as constant regular expressions being compiled only once (at compile time) and regular expressions containing interpolated variables being recompiled only when the variables change
- can use qr in your programs to quote regular expressions and easily interpolate them into other regular expressions
- can easily set numbered and named capture variables, e.g. $1, $+{foo}
- don't force users of your engine to rewrite all of their code to use your API; they can just add the pragma
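To make the list above concrete, here is a small sketch of ordinary Perl code that uses split, s///, qr and capture variables. The idea is that a pluggable engine would be switched on by the single commented-out pragma line, with the rest of the code left as it is. The choice of re::engine::PCRE and the sample data are my assumptions for illustration; with that line commented out, the code runs on the built-in engine.

```perl
use strict;
use warnings;

# Assumption: re::engine::PCRE is installed from CPAN. Uncommenting this
# one line would switch the regex engine for the enclosing lexical scope.
# use re::engine::PCRE;

# qr// quotes a regex so it can be interpolated into a larger one.
my $word = qr/(?<key>\w+)/;
my $pair = qr/$word=(\d+)/;

my $line   = "alpha=1, beta=22";      # made-up sample input
my @fields = split /,\s*/, $line;     # split goes through the regex engine

my %seen;
for my $field (@fields) {
    if ($field =~ $pair) {
        # Named and numbered captures are populated for us.
        $seen{ $+{key} } = $2;
    }
}

# The substitution operator also goes through the engine.
(my $masked = $line) =~ s/\d+/N/g;

print "$_ => $seen{$_}\n" for sort keys %seen;
print "$masked\n";
```

None of the calling code mentions an engine-specific API, which is the point of the pragma approach.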
There are most likely others that I've missed. The point is that you get a lot of free code and free functionality with the API. For example, if you look at the re::engine::PCRE implementation, it is actually quite short (<400 lines of XS code).
Alternatives
If you're just looking for an easier way to implement your own regex engine, check out re::engine::Plugin, which lets you write your implementation in Perl instead of C / XS. Note that there is a long list of caveats, including missing support for split and s///.
Alternatively, instead of implementing a fully custom engine, you can extend the built-in engine using overloaded constants, as described in perldoc perlre. This only works on the constant parts of a regular expression; variables must be converted explicitly before you interpolate them into a regular expression.
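A minimal sketch of the overloaded-constants approach, in the spirit of the example in perldoc perlre. The package name MyRe and the \Y escape (defined here to mean "one ASCII vowel") are made up for illustration; overload::constant with a 'qr' handler is the real mechanism. In a real module you would put the package in its own file and load it with use; the BEGIN block just stands in for that in a one-file demo.

```perl
use strict;
use warnings;

package MyRe;    # hypothetical pragma-like package
use overload ();

# Rewrite the made-up escape \Y into a character class the
# built-in engine already understands.
sub convert {
    my ($re) = @_;
    $re =~ s/\\Y/[aeiouAEIOU]/g;
    return $re;
}

# Install the handler for the constant pieces of regexes
# in the scope that is currently being compiled.
sub import { overload::constant('qr' => \&convert) }

package main;

BEGIN { MyRe->import }    # stands in for "use MyRe;"

# Constant regex: the handler sees \Y at compile time.
my @hits = "regex engine" =~ /(\Y)/g;

# Interpolated variable: the handler never sees its contents,
# so it has to be converted explicitly first.
my $piece = MyRe::convert('\Y+');
my $ok    = "aei" =~ /\A$piece\z/ ? 1 : 0;

print "@hits\n";
print "$ok\n";
```

The second half shows the limitation mentioned above: without the explicit convert call, the \Y inside $piece would reach the built-in engine untouched.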