Performance matching

I have a generic database query function that performs the following checks every time an SQL query is issued:

  • if (preg_match('~^(?:UPDATE|DELETE)~i', $query) === 1)
  • if (preg_match('~^(?:UPDATE|DELETE)~iS', $query) === 1)
  • if ((stripos($query, 'UPDATE') === 0) || (stripos($query, 'DELETE') === 0))

I know that a single call to strpos() is faster than a call to preg_match(). However, since I am calling stripos() twice, I'm really not sure which option should perform better.

The S pattern modifier in the second option also confuses me. From the manual:

When a pattern is going to be used several times, it is worth spending more time analyzing it in order to speed up the time taken for matching. If this modifier is set, then this extra analysis is performed. At present, studying a pattern is useful only for non-anchored patterns that do not have a single fixed starting character.

In this case speed is not critical (otherwise I wouldn't be using this generic query function), but I would still like it to run as fast as possible while keeping it simple.

Which of the above options should I choose?


EDIT: I've run a simple benchmark, and I still can't decide which method works best.

Below are the results for 10,000 attempts (total time in seconds):

    Array
    (
        [match] => Array
            (
                [stripos] => 0.0965
                [preg_match] => 0.2445
                [preg_match?] => 0.1227
                [preg_match?S] => 0.0863
            )

        [no-match] => Array
            (
                [stripos] => 0.1165
                [preg_match] => 0.0812
                [preg_match?] => 0.0809
                [preg_match?S] => 0.0829
            )

    )

100,000 attempts:

    Array
    (
        [match] => Array
            (
                [stripos] => 1.2049
                [preg_match] => 1.5079
                [preg_match?] => 1.5564
                [preg_match?S] => 1.5857
            )

        [no-match] => Array
            (
                [stripos] => 1.4833
                [preg_match] => 0.8853
                [preg_match?] => 0.8645
                [preg_match?S] => 0.8986
            )

    )

1,000,000 attempts:

    Array
    (
        [match] => Array
            (
                [stripos] => 9.4555
                [preg_match] => 8.7634
                [preg_match?] => 9.0834
                [preg_match?S] => 9.1629
            )

        [no-match] => Array
            (
                [stripos] => 13.4344
                [preg_match] => 9.6041
                [preg_match?] => 10.5849
                [preg_match?S] => 8.8814
            )

    )

10,000,000 attempts:

    Array
    (
        [match] => Array
            (
                [stripos] => 86.3218
                [preg_match] => 93.6755
                [preg_match?] => 92.0910
                [preg_match?S] => 105.4128
            )

        [no-match] => Array
            (
                [stripos] => 150.9792
                [preg_match] => 111.2088
                [preg_match?] => 100.7903
                [preg_match?S] => 88.1984
            )

    )

As you can see, the results vary a lot, which makes me wonder whether this is the right way to run the test.
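One possible reason for such variation is that each variant is timed only once, so background load and warm-up effects end up in the totals. Below is a minimal sketch of a harness that warms up each check and keeps the best of several rounds. The function name timeCheck(), the sample queries, and the round and iteration counts are assumptions chosen for illustration; this is not the code that produced the figures above.

```php
<?php
// Hypothetical benchmark harness; all names and sample queries are illustrative.

/**
 * Run $check against $query $iterations times and return elapsed seconds.
 */
function timeCheck(callable $check, string $query, int $iterations): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $check($query);
    }
    return microtime(true) - $start;
}

// Two of the variants from the question, wrapped as closures.
$checks = [
    'stripos' => function (string $q): bool {
        return stripos($q, 'UPDATE') === 0 || stripos($q, 'DELETE') === 0;
    },
    'preg_match' => function (string $q): bool {
        return preg_match('~^(?:UPDATE|DELETE)~i', $q) === 1;
    },
];

// Assumed sample inputs: one that matches and one that does not.
$queries = [
    'match'    => 'UPDATE users SET name = ? WHERE id = ?',
    'no-match' => 'SELECT * FROM users WHERE id = ?',
];

$iterations = 10000;
$rounds = 5;
$results = [];

foreach ($queries as $label => $query) {
    foreach ($checks as $name => $check) {
        timeCheck($check, $query, 100); // warm-up run, result discarded
        $best = INF;
        for ($r = 0; $r < $rounds; $r++) {
            // Keep the fastest round to reduce the effect of background noise.
            $best = min($best, timeCheck($check, $query, $iterations));
        }
        $results[$label][$name] = $best;
    }
}

print_r($results);
```

Taking the minimum over several rounds is only one convention; reporting the median would be an equally reasonable choice.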

2 answers

I went with the following regular expressions, as they seem to be faster (on both matching and non-matching text):

  • if (preg_match('~^(?:INSERT|REPLACE)~i', $query) === 1)
  • else if (preg_match('~^(?:UPDATE|DELETE)~i', $query) === 1)
  • else if (preg_match('~^(?:SELECT|EXPLAIN)~i', $query) === 1)
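For illustration, the three checks above could be wrapped in a single helper. This is only a sketch: the function name classifyQuery() and the labels it returns are hypothetical and do not come from the original code.

```php
<?php
// Hypothetical helper: the function name and returned labels are illustrative.
function classifyQuery(string $query): string
{
    if (preg_match('~^(?:INSERT|REPLACE)~i', $query) === 1) {
        return 'insert';
    } elseif (preg_match('~^(?:UPDATE|DELETE)~i', $query) === 1) {
        return 'modify';
    } elseif (preg_match('~^(?:SELECT|EXPLAIN)~i', $query) === 1) {
        return 'read';
    }
    // Anything else (including queries with leading whitespace) falls through.
    return 'other';
}

echo classifyQuery('replace into t values (1)'), "\n";
echo classifyQuery('EXPLAIN SELECT 1'), "\n";
```

Because the patterns are anchored with ^ and the query is not trimmed, a query that starts with whitespace is classified as 'other' in this sketch.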

I probably wouldn't use any of them. I can't be sure without benchmarking, but I think substr() would be faster than stripos(), since it doesn't scan the whole string. Assuming UPDATE and DELETE always occur at the start of a query and, even better, are both exactly 6 characters long, you can do this with a single substr():

    // Upper-case the first 6 characters once, then compare against both keywords.
    $queryPrefix = strtoupper(substr($query, 0, 6));
    if ($queryPrefix === 'UPDATE' || $queryPrefix === 'DELETE') {
        // ...
    }

If you need to, you can add a trim() in there to handle any leading whitespace, but it's probably not necessary.
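A minimal sketch of that variant, using ltrim() since only leading whitespace matters for a prefix check. The function name isWriteQuery() is hypothetical and introduced only for this example.

```php
<?php
// Hypothetical helper: the name isWriteQuery() is illustrative.
function isWriteQuery(string $query): bool
{
    // Strip leading whitespace, take the first 6 characters, normalise case.
    $prefix = strtoupper(substr(ltrim($query), 0, 6));
    return $prefix === 'UPDATE' || $prefix === 'DELETE';
}

var_dump(isWriteQuery("  update users SET name = 'x'"));
var_dump(isWriteQuery('SELECT * FROM users'));
```

This only inspects the first six non-whitespace characters, so it relies on the same assumption as above: the keyword always appears at the very start of the query.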

If you are running nested queries or subqueries that contain UPDATE or DELETE, then obviously the method above won't work, and I would go the stripos() route. If you can avoid regular expressions in favor of plain string functions, the result will be faster and less complicated.

