There is always a discussion about which is faster, so I decided to run several tests using different methods.
Tests run:
- strpos
- preg_match with a foreach loop
- preg_match with regex OR (alternation)
- indexed search, with the index string exploded during the test
- indexed search on an array (index string already exploded)
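The regex OR variant folds all search terms into a single alternation pattern, so preg_match scans the text once instead of once per term. A minimal sketch of building such a pattern follows. The helper name buildAlternationPattern is hypothetical (it is not part of the tests), and the arrow function assumes PHP 7.4 or later.

```php
<?php
// Hypothetical helper: build one alternation pattern from several literal terms.
function buildAlternationPattern(array $terms): string
{
    // Quote each term on its own so the '|' separators keep their regex meaning.
    // Passing '/' as the delimiter makes preg_quote escape it as well.
    $quoted = array_map(fn(string $t): string => preg_quote($t, '/'), $terms);
    return '/' . implode('|', $quoted) . '/';
}

$pattern = buildAlternationPattern(['gazerbeam', 'legal', 'target']);
var_dump(preg_match($pattern, 'seek legal advice') === 1); // bool(true)
```

Quoting the terms before joining them matters: running preg_quote over the already joined string would escape the '|' characters and turn the pattern into one long literal.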
Two sets of tests were run: one on a large text document (114,350 words) and one on a small text document (120 words). Within each set, every test was run 100 times and the average was taken. The tests did not ignore case, which makes them faster than case-insensitive versions would be. For the index-search tests, the document was pre-indexed. I wrote the indexing code myself, and I'm sure it was inefficient; indexing took 17.92 seconds for the large file and 0.001 seconds for the small file.
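The indexing code itself is not shown in this answer. As a rough sketch of what such a step might look like, the function below reduces a text to its unique words in sorted order, which is the shape a binary search needs. The name buildWordIndex and the word-splitting rule are assumptions for illustration, not the code that produced the timings above.

```php
<?php
// Hypothetical index builder: returns the unique words of $text, sorted bytewise.
function buildWordIndex(string $text): array
{
    // Assumption: a "word" is a run of ASCII letters, digits, or apostrophes.
    $words = preg_split("/[^A-Za-z0-9']+/", $text, -1, PREG_SPLIT_NO_EMPTY);
    // Drop duplicates and reindex; case is kept, so the index is case-sensitive.
    $words = array_values(array_unique($words));
    sort($words, SORT_STRING); // plain string ordering
    return $words;
}

$index = buildWordIndex("the legal team met the other legal team");
echo implode(' ', $index), "\n"; // legal met other team the
```

Joining the result with single spaces and saving it with file_put_contents would give a file that explode(" ", ...) can turn back into the sorted array.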
Search terms: gazerbeam (not found in the document), legal (found in the document) and target (NOT found in the document).
Results in seconds to complete one test, sorted by speed:
Large file:
- 0.0000455808639526 (index without explode)
- 0.0009979915618897 (preg_match using regex OR)
- 0.0011657214164734 (strpos)
- 0.0023632574081421 (preg_match using foreach loop)
- 0.0051533532142639 (index with explode)
Small file:
- 0.000003724098205566 (strpos)
- 0.000005958080291748 (preg_match using regex OR)
- 0.000012607574462891 (preg_match using foreach loop)
- 0.000021204948425293 (index without explode)
- 0.000060625076293945 (index with explode)
Note that strpos is faster than preg_match (using regex or) for small files, but slower for large files. Of course, other factors will affect this, such as the number of searches.
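To see where that crossover falls for other inputs, a small timing helper can average each approach over many runs. This is only a sketch: averageSeconds is a hypothetical helper, and the generated sample text stands in for a real document, so the numbers it prints are not comparable to the results above.

```php
<?php
// Hypothetical helper: average wall-clock seconds of $fn over $runs calls.
function averageSeconds(callable $fn, int $runs = 100): float
{
    $total = 0.0;
    for ($i = 0; $i < $runs; $i++) {
        $t = microtime(true);
        $fn();
        $total += microtime(true) - $t;
    }
    return $total / $runs;
}

// Assumed sample data: filler text with one matching term at the very end.
$text  = str_repeat('lorem ipsum dolor sit amet ', 20000) . 'legal';
$terms = ['gazerbeam', 'legal', 'target'];

$viaStrpos = averageSeconds(function () use ($text, $terms) {
    foreach ($terms as $word) {
        if (strpos($text, $word) !== false) {
            break; // stop at the first term that is found
        }
    }
});
$viaRegex = averageSeconds(function () use ($text) {
    preg_match('/gazerbeam|legal|target/', $text);
});

printf("strpos: %.10f s\npreg_match (OR): %.10f s\n", $viaStrpos, $viaRegex);
```

Changing the length of $text or the number of entries in $terms is a quick way to check how both factors shift the comparison.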
Algorithms Used:
    // $search holds the search terms; each accumulator ($strpos, $pregmatch, ...)
    // starts at 0 and is summed over the 100 runs.

    // strpos
    $str = file_get_contents('text.txt');
    $t = microtime(true);
    foreach ($search as $word) {
        // !== false so that a match at offset 0 also counts
        if (strpos($str, $word) !== false) break;
    }
    $strpos += microtime(true) - $t;

    // preg_match
    $str = file_get_contents('text.txt');
    $t = microtime(true);
    foreach ($search as $word) {
        if (preg_match('/' . preg_quote($word) . '/', $str)) break;
    }
    $pregmatch += microtime(true) - $t;

    // preg_match (regex or)
    $str = file_get_contents('text.txt');
    // quote each term first; quoting the joined string would escape the '|'
    $orstr = implode('|', array_map('preg_quote', $search));
    $t = microtime(true);
    if (preg_match('/' . $orstr . '/', $str)) {}
    $pregmatchor += microtime(true) - $t;

    // index with explode
    $str = file_get_contents('textindex.txt');
    $found = 'false';
    $t = microtime(true);
    $ar = explode(" ", $str);
    foreach ($search as $word) {
        // binary search over the sorted index; $end is exclusive
        $start = 0;
        $end = count($ar);
        while ($start < $end) {
            $pos = intdiv($end - $start, 2) + $start;
            $temp = $ar[$pos];
            if ($word < $temp) {
                $end = $pos;
            } elseif ($word > $temp) {
                $start = $pos + 1;
            } else {
                $found = 'true';
                break;
            }
        }
    }
    $indexwith += microtime(true) - $t;

    // index without explode (already in array)
    $str = file_get_contents('textindex.txt');
    $found = 'false';
    $ar = explode(" ", $str);
    $t = microtime(true);
    foreach ($search as $word) {
        $start = 0;
        $end = count($ar);
        while ($start < $end) {
            $pos = intdiv($end - $start, 2) + $start;
            $temp = $ar[$pos];
            if ($word < $temp) {
                $end = $pos;
            } elseif ($word > $temp) {
                $start = $pos + 1;
            } else {
                $found = 'true';
                break;
            }
        }
    }
    $indexwithout += microtime(true) - $t;
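The two index tests above perform a binary search over a sorted word list. Pulled out into a standalone function, the same idea could look like the sketch below. The name indexContains is hypothetical, and strcmp is used here in place of the < and > operators so that numeric-looking words are still compared as strings; intdiv assumes PHP 7 or later.

```php
<?php
// Hypothetical helper: binary search for $word in a sorted array of words.
function indexContains(array $sortedWords, string $word): bool
{
    $start = 0;
    $end = count($sortedWords); // exclusive upper bound
    while ($start < $end) {
        $pos = intdiv($start + $end, 2);
        $cmp = strcmp($word, $sortedWords[$pos]);
        if ($cmp === 0) {
            return true;        // exact match
        }
        if ($cmp < 0) {
            $end = $pos;        // continue in the lower half
        } else {
            $start = $pos + 1;  // continue in the upper half
        }
    }
    return false;
}

$index = ['advice', 'legal', 'seek', 'team'];
var_dump(indexContains($index, 'legal'));     // bool(true)
var_dump(indexContains($index, 'gazerbeam')); // bool(false)
```

Each lookup halves the remaining range, which is why the pre-exploded index is the fastest option on the large file, while the cost of explode dominates when the array has to be built inside the timed section.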
James