Meaning of a simple preg_replace pattern (#\s+#)?

Sorry for such a basic question, but there is no easy way to search for a string like this here, on Google, or on SymbolHound. I also did not find the answer in the PHP manual (Pattern Syntax and preg_replace).

This code is inside a function that receives the parameters $content and $length.
What does the preg_replace call do?

 $the_string = preg_replace('#\s+#', ' ', $content);
 $words = explode(' ', $the_string);
 if( count($words) <= $length )

Also, would it be better to use str_word_count instead?

+4
3 answers

This pattern replaces each run of consecutive whitespace characters (note: not just spaces, but also line breaks and tabs) with a single regular space (' '). The \s+ part means "match a sequence of one or more whitespace characters".
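
To make the effect concrete, here is a minimal sketch that runs the same call on a made-up input string (the sample text is an assumption for illustration and is not from the question):

```php
<?php
// Sample input mixing spaces, a tab, and line breaks (made up for illustration).
$content = "Hello,   world!\tThis is\n\na   test.";

// Collapse every run of whitespace into one regular space.
$the_string = preg_replace('#\s+#', ' ', $content);

echo $the_string, PHP_EOL; // Hello, world! This is a test.

// Splitting on a single space now gives one array element per word.
$words = explode(' ', $the_string);
echo count($words), PHP_EOL; // 6
```

If the input can start or end with whitespace, the result keeps one leading or trailing space, and explode() then yields empty elements; calling trim() on the string first avoids that.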

The # characters are the delimiters of the pattern. Forward slashes are probably the more common choice. (The preg_* functions require delimiters. PHP's older POSIX ereg functions took patterns without them, but they handle patterns differently and were removed in PHP 7, which is beyond the scope of this question and answer.)

http://php.net/manual/en/regexp.reference.delimiters.php
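
As a quick check that the delimiter does not change what the pattern matches, the sketch below runs \s+ with three different delimiters on a made-up input (the variable names and sample strings are assumptions for illustration):

```php
<?php
$content = "one \t two\nthree"; // made-up sample input

// The same pattern, \s+, written with three different delimiters.
$with_hash  = preg_replace('#\s+#', ' ', $content);
$with_slash = preg_replace('/\s+/', ' ', $content);
$with_tilde = preg_replace('~\s+~', ' ', $content);

var_dump($with_hash === $with_slash && $with_slash === $with_tilde); // bool(true)
echo $with_hash, PHP_EOL; // one two three

// A practical reason to pick #: a pattern containing slashes needs no escaping.
var_dump(preg_match('#^https?://#', 'http://php.net/')); // int(1)
```

Whichever character is chosen as the delimiter has to be escaped with a backslash if it also appears inside the pattern, which is why # or ~ is convenient for patterns that contain URLs or paths.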

Relying on spaces to find the words in a string is usually not the best approach. Instead, we can use the \b word-boundary marker.

 $sentence = "Hello, there. How are you today? Hope you're OK!";
 preg_match_all('/\b[\w-]+\b/', $sentence, $words);

This says: grab all the substrings of the larger string that consist only of word characters (letters, digits, and underscores) or hyphens and are enclosed by word boundaries.

$words[0] is now an array of the words used in the sentence.
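
One detail worth spelling out: preg_match_all() fills its third argument with an array of arrays, so the full-pattern matches sit in element 0. The sketch below uses the same sentence and pattern; the $matches and $count names are only illustrative:

```php
<?php
$sentence = "Hello, there. How are you today? Hope you're OK!";

// preg_match_all() returns the number of matches and fills $matches.
$count = preg_match_all('/\b[\w-]+\b/', $sentence, $matches);

// $matches[0] holds the full-pattern matches, i.e. the words.
$words = $matches[0];

echo $count, PHP_EOL;               // 10
echo implode('|', $words), PHP_EOL; // Hello|there|How|are|you|today|Hope|you|re|OK
```

The apostrophe is not in the character class, so "you're" comes back as "you" and "re". Adding the apostrophe to the class would keep such contractions together.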

+3

\s+ matches one or more whitespace characters. You replace each such run with one space using preg_replace('#\s+#', ' ', $content);

str_word_count may be suitable, but you may need to specify additional characters that count as part of a word; otherwise the function reports incorrect values for UTF-8 characters.

 str_word_count($str, 1, characters_that_are_not_considered_word_boundaries); 

An example:

 print_r(str_word_count('holóeóó what',1)); 

returns

 Array ( [0] => hol [1] => e [2] => what ) 
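
If the text is UTF-8, one possible alternative (not part of the answer above, just a sketch) is to match runs of Unicode letters with preg_match_all and the /u modifier instead of calling str_word_count. It assumes the source file and the input string are both UTF-8 encoded:

```php
<?php
// Assumes this file and the input string are UTF-8 encoded.
$str = 'holóeóó what';

// \p{L} matches any Unicode letter; the /u modifier makes PCRE treat the
// pattern and the subject as UTF-8 instead of as single bytes.
preg_match_all('/\p{L}+/u', $str, $matches);
$words = $matches[0];

echo count($words), PHP_EOL;        // 2
echo implode('|', $words), PHP_EOL; // holóeóó|what
```

This pattern counts letters only, so digits, hyphens, and apostrophes end a word. Extend the character class if those should be part of a word in your data.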
+1

The # is the delimiter.

Commonly used delimiters are forward slashes (/), hash signs (#), and tildes (~). The pattern in your code is a valid delimited pattern:

 $the_string = preg_replace('#\s+#', ' ', $content); 

It replaces each run of one or more whitespace characters (\s+) with a single space.

+1