How to save / serialize a compiled regular expression (std::regex) to a file?

I am using <regex> from Visual Studio 2010. I understand that when I create a regular expression object, it is compiled. There is no explicit compile method like in other languages and libraries, but I think that is how it works, am I right?

I need to store a large number of these compiled regular expressions in a file, so that I can just load a block of memory and get a compiled regular expression back.

I can't figure out how to do this. I found that this is possible in PCRE, but it is a Linux library. There is a Windows version, but it is 3 years old, and I would like to use a higher-level approach (there is no C++ wrapper in the Windows version).

So is it possible to save a std::regex or boost::regex (are they the same?) as a block of memory, and then just reuse it later?

Or is there another simple library for Windows that allows this?

EDIT: Thanks for the great answers. I will just check if it is enough to just store the regular expression as a string, and then if it is slow, I will test and compare it with this old PCRE library.

c++ regex serialization visual-c++ boost-regex
3 answers

I do not think that this can be done without changing the boost library to support it.

I don't know how the Boost regex library is implemented, but most regex libraries compile the pattern into a binary blob, which is later interpreted as a sequence of instructions for a kind of limited virtual machine.

If the Boost regex library is implemented this way, serializing it would be relatively simple: just somehow get at the binary blob and dump it to disk. The existence of a POSIX regex API for the Boost library suggests to me that this is probably how it is implemented.
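To make the "dump the blob to disk" idea concrete, here is a toy sketch. The `Instr` layout, the opcodes, and the `dump_program` / `load_program` helpers are all hypothetical and have nothing to do with how Boost actually stores a regex; the point is only that a flat array of plain integers (indices instead of pointers) can be written and read back as raw bytes. Such a file would only be valid for the same platform and struct layout.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical instruction format for a toy regex VM (not Boost's layout).
struct Instr {
    std::uint8_t opcode;   // e.g. 0 = match char, 1 = jump, 2 = accept
    std::uint8_t literal;  // character operand, if any
    std::uint16_t target;  // jump target stored as an index, never a pointer
};

// The program holds only plain integers, so it can be written as raw bytes:
// a 32-bit instruction count followed by the instruction array.
bool dump_program(const std::string& path, const std::vector<Instr>& prog) {
    std::ofstream out(path.c_str(), std::ios::binary);
    const std::uint32_t n = static_cast<std::uint32_t>(prog.size());
    out.write(reinterpret_cast<const char*>(&n), sizeof n);
    if (n != 0) {
        out.write(reinterpret_cast<const char*>(&prog[0]), n * sizeof(Instr));
    }
    return out.good();
}

// Read the count, then the instructions. Returns an empty program if the
// file cannot be opened or the count cannot be read.
std::vector<Instr> load_program(const std::string& path) {
    std::ifstream in(path.c_str(), std::ios::binary);
    std::uint32_t n = 0;
    in.read(reinterpret_cast<char*>(&n), sizeof n);
    if (!in) {
        return std::vector<Instr>();
    }
    std::vector<Instr> prog(n);
    if (n != 0) {
        in.read(reinterpret_cast<char*>(&prog[0]), n * sizeof(Instr));
    }
    return prog;
}
```

A structure that links separately allocated objects through pointers cannot be written out like this, because the pointer values are meaningless once the process exits.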

OTOH, another (less common) way to implement it is to build something like an abstract syntax tree for the regular expression. This means that the individual parts of the regular expression are represented by their own objects, and these objects are linked together into a larger structure that represents the entire regular expression.

If boost does it this way, serialization will be very difficult.

This isn't possible in C++, but I really wish Boost could compile regular expressions from constant strings at compile time using template metaprogramming. The reason it isn't possible is that there is no way to iterate over the contents of a string (even a constant string) with a template.


You can use the regular expression strings themselves as the "serialized" regular expressions: just save them to a file, and when you want to restore the regex objects, pass the saved strings to the regex constructor.
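A minimal sketch of that round trip with `std::regex` might look like this. The helper names `save_patterns` and `load_patterns` are made up for illustration, and the one-pattern-per-line format assumes that no pattern contains a newline.

```cpp
#include <cstddef>
#include <fstream>
#include <regex>
#include <string>
#include <vector>

// Write one pattern per line. Assumes no pattern contains a newline.
void save_patterns(const std::string& path,
                   const std::vector<std::string>& patterns) {
    std::ofstream out(path.c_str());
    for (std::size_t i = 0; i < patterns.size(); ++i) {
        out << patterns[i] << '\n';
    }
}

// Read the pattern strings back and compile each one again.
// Constructing a std::regex throws std::regex_error on an invalid pattern.
std::vector<std::regex> load_patterns(const std::string& path) {
    std::vector<std::regex> compiled;
    std::ifstream in(path.c_str());
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty()) {
            compiled.push_back(std::regex(line));
        }
    }
    return compiled;
}
```

After `save_patterns("patterns.txt", patterns)`, a later run of the program can call `load_patterns("patterns.txt")` and use the returned objects with `std::regex_match` or `std::regex_search` as usual.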

The only drawbacks I can think of:

  • it may take somewhat longer to "restore" the regular expression database, but I really don't know how much (I suspect the time will be dominated by I/O anyway, so I'm not sure the difference would be significant; I don't know how much overhead compiling a regular expression has in the Boost implementation)
  • if you want the saved regular expressions to be obfuscated, you will have to do that yourself instead of relying on the compiled binary state being unreadable

Advantages of this approach:

  • it is 100% supported, so it is not fragile or brittle
  • it is portable across compiler versions and platforms (again, not fragile or brittle)

Is the time needed to compile the regular expression database (excluding I/O) significant enough to warrant persisting the compiled state?
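The simplest way to find out is probably to measure it. Here is a rough sketch using `std::chrono`; the function name `time_compilation` is just something I made up, and the numbers will depend heavily on the patterns, the library implementation, and the build settings, so measure with your real patterns in a release build.

```cpp
#include <chrono>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Compile every pattern once and return the elapsed wall-clock time in
// microseconds. The compiled objects are stored in `out` so that the work
// has an observable result and can be reused afterwards.
long long time_compilation(const std::vector<std::string>& patterns,
                           std::vector<std::regex>& out) {
    typedef std::chrono::steady_clock clock_type;
    const clock_type::time_point start = clock_type::now();
    for (std::size_t i = 0; i < patterns.size(); ++i) {
        out.push_back(std::regex(patterns[i]));
    }
    const clock_type::time_point stop = clock_type::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start)
        .count();
}
```

Comparing this figure with the time it takes to read the pattern file gives a first idea of whether compilation or I/O dominates.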


I'm not sure, but have you taken a look at boost::serialization, which can serialize C++ objects?
