The most efficient regular expression to check if a string contains at least 3 alphanumeric characters

I have this regex:

(?:.*[a-zA-Z0-9].*){3} 

I use it to see if a string contains at least 3 alphanumeric characters in it. This seems to work.

Examples of strings that must match:

 'a3c' '_0_c_8_' ' 9 9d ' 

However, I need it to work faster. Is there a better way to use regex to match the same patterns?


Edit: I ended up using this regex for my purposes:

 (?:[^a-zA-Z0-9]*[a-zA-Z0-9]){3} 

(no modifiers required)

javascript regex
3 answers

The most efficient approach with regular expressions is to use the principle of contrast, that is, to place opposite character classes side by side. Here is a regular expression that checks whether a string contains at least 3 Latin letters or digits:

 ^(?:[^a-zA-Z0-9]*[a-zA-Z0-9]){3} 

See the demo.
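As a quick sanity check, the pattern can be tried against the sample strings from the question. This is a minimal JavaScript sketch, not part of the original answer: the names `atLeastThreeAlnum` and `hasThreeAlnum` and the non-matching inputs are made up for illustration.

```javascript
// Contrast pattern from the answer above: skip any non-alphanumerics,
// then consume one alphanumeric, repeated three times.
const atLeastThreeAlnum = /^(?:[^a-zA-Z0-9]*[a-zA-Z0-9]){3}/;

// Hypothetical helper name, chosen for this example only.
function hasThreeAlnum(str) {
  return atLeastThreeAlnum.test(str);
}

// Sample strings from the question; each contains 3 alphanumerics.
const shouldMatch = ['a3c', '_0_c_8_', ' 9 9d '];
// Assumed extra inputs with fewer than 3 alphanumerics.
const shouldFail = ['', 'ab', '_a_b_', '!!!'];

for (const s of shouldMatch) console.log(JSON.stringify(s), hasThreeAlnum(s));
for (const s of shouldFail) console.log(JSON.stringify(s), hasThreeAlnum(s));
```

For a plain yes/no check with `test()`, matching this prefix of the string is enough.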

If you need a full-string match, append .* (or .*$ if you want to be sure the match runs to the end of the string/line, although in my RegexHero tests plain .* performed better):

 ^(?:[^a-zA-Z0-9]*[a-zA-Z0-9]){3}.* 

In addition, much depends on the engine. PCRE has an automatic optimization called auto-possessification: it turns the * in (?:[^a-zA-Z0-9]* into the possessive *+, giving (?:[^a-zA-Z0-9]*+.

For more details, see the post on optimizing password validation.

 (?:.*?[a-zA-Z0-9]){3}.* 

You can use this. It is much faster and takes far fewer steps than yours. See the demo. You probably want to add the ^ and $ anchors to make sure there are no partial matches.

https://regex101.com/r/nS2lT4/32

Reason:

 (?:.*[a-zA-Z0-9].*){3}
    ^^

The greedy .* consumes the entire string first, and then the engine has to backtrack. The other regular expression avoids this.


Just something to consider. Regular expressions are powerful because they are expressive and very flexible (with features like lookahead, greedy consumption and backtracking). That flexibility almost always comes at a cost, however small.

If you need raw speed (and you are willing to give up the expressiveness), you may find it faster to bypass regular expressions and just evaluate the string directly, for example with the following pseudo-code:

 def hasThreeAlphaNums(str):
     alphanums = 0
     for pos = 0 to len(str) - 1:
         if str[pos] in set "[a-zA-Z0-9]":
             alphanums++
             if alphanums == 3:
                 return true
     return false

It is a parser (a very simple one in this case), a tool that can be even more powerful than regular expressions. For a more concrete example, consider the following C code:

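 #include <ctype.h>

 /* Returns 1 if str holds at least three alphanumeric characters, else 0. */
 int hasThreeAlphaNums (const char *str) {
     int count = 0;
     /* Read the current character on every pass, stopping at the terminator. */
     for (; *str != '\0'; str++)
         /* The cast keeps isalnum defined for negative char values. */
         if (isalnum ((unsigned char) *str))
             if (++count == 3)
                 return 1;
     return 0;
 }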
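Since the question is tagged javascript, the same loop might look like this there. This is a sketch rather than part of the original answer: it compares character codes against the ASCII ranges of [a-zA-Z0-9] directly, instead of using a locale-aware function such as isalnum.

```javascript
// Hypothetical JavaScript port of the loop above; returns true as soon
// as the third ASCII letter or digit is seen.
function hasThreeAlphaNums(str) {
  let count = 0;
  for (let i = 0; i < str.length; i++) {
    const code = str.charCodeAt(i);
    const isAlnum =
      (code >= 48 && code <= 57) ||  // '0'-'9'
      (code >= 65 && code <= 90) ||  // 'A'-'Z'
      (code >= 97 && code <= 122);   // 'a'-'z'
    if (isAlnum && ++count === 3) return true;
  }
  return false;
}

console.log(hasThreeAlphaNums('_0_c_8_'));
console.log(hasThreeAlphaNums('_a_b_'));
```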

Now, whether or not that is faster for this particular case depends on many factors, for example whether the language is interpreted or compiled, how efficient the regular expression engine is under the covers, and so on.

That is why the mantra of optimization is “Measure, don't guess!” You should evaluate the options in your target environment.
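A rough way to measure in JavaScript is sketched below. It assumes a runtime where `performance.now()` is available as a global (browsers, recent Node.js); the helper `timeRegex`, the test input and the iteration count are arbitrary choices made for illustration, not a rigorous benchmark.

```javascript
// Candidate patterns taken from the question and the answers above.
const candidates = {
  original: /(?:.*[a-zA-Z0-9].*){3}/,
  lazy: /(?:.*?[a-zA-Z0-9]){3}/,
  contrast: /^(?:[^a-zA-Z0-9]*[a-zA-Z0-9]){3}/,
};

// Hypothetical helper: run regex.test(input) `iterations` times and
// report the elapsed milliseconds plus the number of successful matches.
function timeRegex(regex, input, iterations) {
  const start = performance.now();
  let hits = 0;
  for (let i = 0; i < iterations; i++) {
    if (regex.test(input)) hits++;
  }
  return { ms: performance.now() - start, hits };
}

// Assumed input: only two alphanumerics, so every pattern must fail,
// which is typically where backtracking costs show up.
const input = '_'.repeat(30) + 'a' + '_'.repeat(30) + 'b';

for (const [name, regex] of Object.entries(candidates)) {
  const { ms, hits } = timeRegex(regex, input, 1000);
  console.log(name, ms.toFixed(2) + ' ms', 'matches: ' + hits);
}
```

Absolute timings vary by engine, warm-up and input, so it is worth repeating the run with strings that resemble your real data.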
