Replacing regular expression matches in place with Boost

I have a huge paragraph of text stored in a std::string called 'text'. In this string, I replace certain patterns with a space using the Boost.Regex library. Here is my code.

    // Remove times of the form (00:33) and (1:33)
    boost::regex rgx("\\([0-9.:]*\\)");
    text = boost::regex_replace(text, rgx, " ");

    // Remove single word HTML tags
    rgx.set_expression("<[a-zA-Z/]*>");
    text = boost::regex_replace(text, rgx, " ");

    // Remove comments like [pause], [laugh]
    rgx.set_expression("\\[[a-zA-Z]* *[a-zA-Z]*\\]");
    text = boost::regex_replace(text, rgx, " ");

    // Remove comments of the form <...>
    rgx.set_expression("<.+?>");
    text = boost::regex_replace(text, rgx, " ");

    // Remove comments of the form {...}
    rgx.set_expression("\\{.+?\\}");
    text = boost::regex_replace(text, rgx, " ");

    // Remove comments of the form [...]
    rgx.set_expression("\\[.+?\\]");
    text = boost::regex_replace(text, rgx, " ");

As far as I understand, every call to regex_replace creates a new string and writes its output there. If I run regex_replace with N different patterns, it will allocate N new strings (freeing the old ones).

Since memory allocation is time consuming, is there a way to do the replacement in place, without allocating a new string?

2 answers

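regex_replace has two overloads: the one you are using right now, and another that takes iterators. You can pass an output iterator into the same range that you are reading from. This is safe only while the output never overtakes the input, that is, while each replacement is no longer than the text it replaces (true here: every match is at least two characters and the replacement is one). The call returns an iterator to the end of the written output, so erase the leftover tail afterwards.

    std::string::iterator new_end =
        boost::regex_replace(text.begin(), text.begin(), text.end(), rgx, " ");
    text.erase(new_end, text.end());  // drop the stale characters after the output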
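
Writing back into the range that is being read depends on the output never overtaking the input. A more conservative variant keeps the iterator overload but writes into a scratch buffer that is reused between passes, so at most two buffers exist no matter how many patterns are applied. The sketch below is only an illustration: it uses std::regex instead of Boost so that it builds without extra dependencies (boost::regex_replace offers an analogous iterator overload), and the names strip_patterns and scratch are hypothetical.

```cpp
#include <iterator>
#include <regex>
#include <string>
#include <vector>

// Applies each pattern in turn, replacing every match with a single space.
// `scratch` is reused across passes instead of allocating a fresh result
// string for every pattern.
void strip_patterns(std::string& text, const std::vector<std::regex>& patterns) {
    std::string scratch;
    scratch.reserve(text.size());  // enough as long as replacements do not grow the text
    for (const std::regex& rgx : patterns) {
        scratch.clear();  // drops the contents; in practice the capacity is kept
        std::regex_replace(std::back_inserter(scratch),
                           text.begin(), text.end(), rgx, " ");
        text.swap(scratch);  // the result becomes the next input, no copy made
    }
}
```

After the loop, text holds the final result and scratch holds the previous intermediate string, which is released when the function returns.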

Since none of your replacements depends on the output of a previous one, you can simply combine all of these patterns into one large regular expression, joined with alternation, and run it once.

You can even specify different replacement strings for each part of the regular expression, but this is not necessary here.

    boost::regex rgx("(\\([0-9.:]*\\))|"
                     "(<[a-zA-Z/]*>)|"
                     "(\\[[a-zA-Z]* *[a-zA-Z]*\\])|"
                     "(<.+?>)|"
                     "(\\{.+?\\})|"
                     "(\\[.+?\\])");
    text = boost::regex_replace(text, rgx, " ");
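
The remark about using a different replacement string for each part of the expression can be illustrated roughly as follows. This is a sketch under assumptions, not the answer's own code: it uses std::regex and walks the matches by hand, checking which capture group took part in each match. Boost can express the same idea with conditional format strings when the boost::format_all flag is passed. The function name replace_by_group, the shortened pattern list, and the mapping of groups to replacements are all made up for the example.

```cpp
#include <regex>
#include <string>

// One pass over `text`; the replacement depends on which alternative matched.
// The group-to-replacement mapping is purely illustrative.
std::string replace_by_group(const std::string& text) {
    static const std::regex rgx(
        "(\\([0-9.:]*\\))|"  // group 1: times such as (1:33)
        "(<.+?>)|"           // group 2: <...>
        "(\\[.+?\\])");      // group 3: [...]
    std::string out;
    out.reserve(text.size());
    auto tail = text.begin();  // start of the text not yet copied
    for (std::sregex_iterator it(text.begin(), text.end(), rgx), end;
         it != end; ++it) {
        const std::smatch& m = *it;
        out.append(tail, m[0].first);  // copy the text between matches
        // Pick a replacement based on the capture group that matched.
        const char* repl = m[1].matched ? "" : m[2].matched ? " " : "*";
        out += repl;
        tail = m[0].second;
    }
    out.append(tail, text.end());  // copy whatever follows the last match
    return out;
}
```

Because the whole string is scanned once and the output is built in a single buffer, this keeps the one-pass benefit of the combined expression while still treating each kind of match differently.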

Source: https://habr.com/ru/post/1213371/