Regex vs. manual comparison: which is faster?

When writing a scripting engine, I have functions like this (pseudo-code):

function is_whitespace?(char c){ return c==' ' || c=='\t' || c=='\r' || c=='\n'; } 

OK, my question is: which is faster in most languages? That, or using a regular expression, for example:

 function is_whitespace?(char c){ return regex_match('\s',c); } 

The main languages I deal with are C#, C, and Ruby, in case the answer is completely platform dependent.

+6
performance regex parsing
5 answers

Of course, four comparisons of small pieces of memory are significantly faster (and use almost no memory) compared with creating, running, and destroying a state machine.

+11

Manual comparison is faster to execute; the regular expression is faster to write.

Note that your two implementations are not equivalent if your system uses Unicode. In a Unicode-aware engine, the \s regular expression matches all Unicode whitespace, while the manual comparison handles only basic ASCII and does not even include the vertical tab and form feed characters, which are usually also considered whitespace.

If you are writing this in a high-level language, I would suggest using the is_whitespace() function already provided by your programming language. A basic function like this is almost always included.

So, in the end, the answer is "it depends." In some situations, the extra effort of writing procedural code is worth it. In many cases, a regular expression is fast enough and easier to maintain.
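To make the difference concrete, here is a minimal sketch in Python, chosen only for illustration since the question's own code is pseudo-code. The function names and the sample characters are assumptions, and what `\s` matches differs between regex engines, so the same check is worth repeating in your own language.

```python
import re

# Hypothetical helper mirroring the manual check from the question.
def is_whitespace_manual(c: str) -> bool:
    return c in (" ", "\t", "\r", "\n")

# Python's re module treats \s as Unicode whitespace for str patterns.
_WS = re.compile(r"\s")

def is_whitespace_regex(c: str) -> bool:
    return _WS.fullmatch(c) is not None

# Sample characters picked for illustration.
samples = {
    "space": " ",
    "vertical tab": "\x0b",
    "form feed": "\x0c",
    "no-break space": "\u00a0",
}

# Names of the characters on which the two checks disagree.
differences = [
    name for name, ch in samples.items()
    if is_whitespace_manual(ch) != is_whitespace_regex(ch)
]

if __name__ == "__main__":
    print(differences)
```

In this sketch the two checks agree on the plain space and disagree on the other three samples, which the regex accepts and the manual check rejects.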
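As an illustration of relying on a built-in, here is a small Python sketch; the wrapper name `is_whitespace` is an assumption that echoes the question. C offers `isspace` in `<ctype.h>` and C# offers `char.IsWhiteSpace` for the same purpose, though the exact set of characters each one accepts is worth checking against its documentation.

```python
def is_whitespace(c: str) -> bool:
    """Return True if c is a single whitespace character.

    Delegates to the built-in str.isspace(), which is Unicode-aware.
    """
    return len(c) == 1 and c.isspace()


if __name__ == "__main__":
    for ch in (" ", "\t", "\x0b", "\u00a0", "a"):
        print(repr(ch), is_whitespace(ch))
```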

+3

In most cases, a regular expression that looks for something like a whitespace character is very fast. Many eyes have looked at performance in the leading regular expression implementations, and there is probably other "low-hanging fruit" to optimize elsewhere in your code.

Poor regular expression performance usually comes from poorly written regular expressions. Tips: avoid unnecessary backtracking, grouping, and alternation as much as possible. Use something like RegexBuddy, or Perl with "use re 'debug'", to find out how many branches your regular expression takes.

The links below cover some regular expression performance issues.
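As a rough illustration of the backtracking tip, the Python sketch below times a failed match for two patterns that accept the same strings: one with a nested quantifier and one without. The patterns, the helper name `seconds_to_reject`, and the input sizes are assumptions picked for the example; actual timings depend on the engine and the machine.

```python
import re
import time

# Nested quantifier: the engine can split a run of a's in many ways,
# so a failed match forces a lot of backtracking.
NESTED = re.compile(r"^(a+)+$")
# Same accepted strings, no nesting, so far fewer paths to try.
FLAT = re.compile(r"^a+$")


def seconds_to_reject(pattern: re.Pattern, n: int) -> float:
    """Time one failed match against n 'a' characters followed by 'b'."""
    text = "a" * n + "b"
    start = time.perf_counter()
    result = pattern.search(text)
    elapsed = time.perf_counter() - start
    assert result is None  # neither pattern matches this input
    return elapsed


if __name__ == "__main__":
    for n in (10, 14, 18):
        print(n, seconds_to_reject(NESTED, n), seconds_to_reject(FLAT, n))
```

With a backtracking engine such as Python's `re`, the time for the nested pattern should grow much faster with `n` than the time for the flat one.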

When in doubt, do comparative timings ...

Coding Horror - Regex

Java Performance - Regex
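For the comparative timings suggested above, a minimal sketch using Python's `timeit` module could look like the following. The function names, the sample string, and the repeat count are assumptions; the numbers it prints are only meaningful on the machine and interpreter that ran it, and the same experiment would have to be redone in C, C#, or Ruby to answer the original question for those languages.

```python
import re
import timeit

_WS = re.compile(r"\s")  # compiled once, outside the timed code


def is_whitespace_manual(c: str) -> bool:
    return c == " " or c == "\t" or c == "\r" or c == "\n"


def is_whitespace_regex(c: str) -> bool:
    return _WS.match(c) is not None


def compare(chars: str = " \tab\n", repeats: int = 10_000) -> dict:
    """Return total seconds each check takes over `repeats` passes of `chars`."""
    def run(check):
        return timeit.timeit(lambda: [check(c) for c in chars], number=repeats)

    return {"manual": run(is_whitespace_manual), "regex": run(is_whitespace_regex)}


if __name__ == "__main__":
    for name, seconds in compare().items():
        print(f"{name}: {seconds:.4f} s")
```

Compiling the pattern once outside the timed function keeps the comparison about matching cost, not compilation cost.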

+1

After disk access, regular expressions are almost always my performance bottleneck when I profile my code, even for simple things like .split("").

+1

I can't speak for C# or C, but in Ruby I would not assume that the regular expression is the slower option.

0
