PHP: the best way to split a string into alphabetical and numeric components

I have strings in formats like these:

AA11 AAAAAA1111111 AA1111111 

What is the best (most efficient) way to separate the alphabetical and numeric components of a string?

+1
5 answers

If every string is a run of letters followed by a run of digits, with no other characters, then sscanf() is probably more efficient than regular expressions:

    $example = 'AAA11111';
    // %[A-Z] reads a run of uppercase letters; %d reads the digits as an integer
    list($alpha, $numeric) = sscanf($example, "%[A-Z]%d");
    var_dump($alpha);
    var_dump($numeric);
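
If you also need mixed-case letters, leading zeros, or a clear signal when the input does not fit the pattern, the same idea can be wrapped in a small function. This is only a sketch: the name splitAlphaNumeric() is made up for illustration, and it assumes ASCII letters and digits. It reads the digits with %[0-9] instead of %d so they stay a string.

```php
<?php
// Hypothetical helper, not part of the answer above.
// Returns [letters, digits], or null when $s is not "letters followed by digits".
function splitAlphaNumeric(string $s): ?array
{
    // %[A-Za-z] reads a run of ASCII letters, %[0-9] a run of digits (kept as a string).
    $parts = sscanf($s, "%[A-Za-z]%[0-9]");

    // sscanf() may return a non-array (for example on empty input)
    // and leaves null in the slot of any field it could not match.
    if (!is_array($parts) || $parts[0] === null || $parts[1] === null) {
        return null;
    }

    // sscanf() stops at the first character that does not fit the format,
    // so reject input that has anything left over after the digits.
    if (strlen($parts[0]) + strlen($parts[1]) !== strlen($s)) {
        return null;
    }

    return $parts;
}

var_dump(splitAlphaNumeric('AA0011'));
var_dump(splitAlphaNumeric('111111'));
```

The length check is there because sscanf() quietly ignores trailing characters; drop it if trailing text is acceptable in your data.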
+5

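preg_split should do the job fine.

    // Split on runs of digits, keeping each run as its own element and dropping empty strings
    preg_split('/(\d+)/', $input, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);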

The PCRE functions are surprisingly efficient at string processing, so I would expect preg_split to be faster than anything you write by hand with more primitive string functions. But benchmark it and see for yourself.
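
If the strings are always exactly one run of letters followed by one run of digits, preg_match with two capture groups is another PCRE option that validates and splits in one step. A rough sketch, assuming ASCII input; the function name splitWithPregMatch() is hypothetical:

```php
<?php
// Hypothetical helper: returns [letters, digits], or null if $s does not match.
function splitWithPregMatch(string $s): ?array
{
    // \A and \z anchor the pattern to the whole string, so stray characters
    // (including a trailing newline) make the match fail.
    if (preg_match('/\A([A-Za-z]+)([0-9]+)\z/', $s, $m) === 1) {
        return [$m[1], $m[2]];
    }
    return null;
}

var_dump(splitWithPregMatch('AAAAAA1111111'));
```

Whether this is faster than preg_split or sscanf on your data is something only a benchmark can tell you.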

+1

Instead of going straight to a regex, you can first add one cheap check, for example:

    if (ctype_alpha($testcase)) {
        // Return the value: it contains only letters
    } elseif (ctype_digit($testcase)) {
        // Return the value: it contains only digits
    } else {
        // Use a regex to split the string into digits and letters
    }
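
Filled in, that check could look roughly like the sketch below. The function name splitRuns() and the choice to return a flat list of runs are my own assumptions, not something the question prescribes.

```php
<?php
// Hypothetical helper: returns the runs of digits and non-digits found in $s.
function splitRuns(string $s): array
{
    if ($s === '') {
        return [];
    }

    // Fast path: the whole string is a single run, so no regex is needed.
    if (ctype_alpha($s) || ctype_digit($s)) {
        return [$s];
    }

    // Mixed content: split on runs of digits, keep them, drop empty strings.
    $parts = preg_split('/([0-9]+)/', $s, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

    // preg_split() returns false on failure; fall back to an empty list.
    return $parts === false ? [] : $parts;
}

var_dump(splitRuns('AA1111111'));
```

The fast path only pays off if a good share of your inputs are pure letters or pure digits; otherwise it is two extra function calls per string.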

EDIT: My answer gave no evidence of which approach performs better, so I ran a test, which produced the following result:

  • preg_split took 5.3319189548492 seconds
  • sscanf took 3.4432129859924 seconds

So sscanf was the faster of the two.

Here is the code that gave the result:

    $string = "AAAAAAAAAA111111111111111";
    $count = 1000000;

    // Split on runs of letters, keeping them as captured delimiters
    function prSplit($string) {
        return preg_split('/([A-Za-z]+)/', $string, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
    }

    // Read a run of uppercase letters, then a run of digits
    function sScanfTest($string) {
        return sscanf($string, "%[A-Z]%[0-9]");
    }

    // Current Unix time in seconds, including microseconds, as a float
    function microtime_float() {
        list($usec, $sec) = explode(" ", microtime());
        return ((float)$usec + (float)$sec);
    }

    $startTime1 = microtime_float();
    for ($i = 0; $i < $count; ++$i) {
        prSplit($string);
    }
    $time1 = microtime_float() - $startTime1;
    echo '1. preg_split took '.$time1.' seconds<br />';

    $startTime2 = microtime_float();
    for ($i = 0; $i < $count; ++$i) {
        sScanfTest($string);
    }
    $time2 = microtime_float() - $startTime2;
    echo '2. sscanf took '.$time2.' seconds';
+1

Here is a working example using preg_split():

    $strs = array('AA11', 'AAAAAA1111111', 'AA1111111');
    foreach ($strs as $str) {
        foreach (preg_split('/([A-Za-z]+)/', $str, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY) as $temp) {
            var_dump($temp);
        }
    }

This outputs:

    string(2) "AA"
    string(2) "11"
    string(6) "AAAAAA"
    string(7) "1111111"
    string(2) "AA"
    string(7) "1111111"
+1

This seems to work, but it fails when you pass something like "111111".

In my application I expect several scenarios, and what seems to do the trick is:

    $referenceNumber = "AAA12132";
    // PREG_SPLIT_NO_EMPTY drops the empty strings produced when the input starts or ends with digits
    $splited = preg_split('/(\d+)/', $referenceNumber, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
    var_dump($splited);

Notes:

  1. An array of 2 elements means that, for a string that starts with letters, index 0 holds the letters and index 1 holds the digits.
  2. An array of just 1 element means that the string is a single run; in my data that is digits only, with no letters.
  3. An array of more than 2 elements means that the string has a format like "AAA1323SDC".

Therefore, given the above, you can play with it depending on your use case.
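
Rather than inferring the layout from the element count, you could label each piece directly. A minimal sketch, assuming the split uses PREG_SPLIT_NO_EMPTY; the function name labelParts() and the labels 'alpha', 'numeric' and 'other' are made up for illustration:

```php
<?php
// Hypothetical helper: splits $s on runs of digits and tags every piece.
function labelParts(string $s): array
{
    $parts = preg_split('/(\d+)/', $s, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
    if ($parts === false) {
        return [];
    }

    $labeled = [];
    foreach ($parts as $part) {
        if (ctype_digit($part)) {
            $label = 'numeric';
        } elseif (ctype_alpha($part)) {
            $label = 'alpha';
        } else {
            $label = 'other'; // spaces, punctuation, or mixed non-digit text
        }
        $labeled[] = [$label, $part];
    }
    return $labeled;
}

var_dump(labelParts('AAA1323SDC'));
```

This way "111111", "AAA" and "12AB" are all told apart without relying on how many elements came back.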

Cheers!

0
