I wrote a function that works better than str_word_count, because that PHP function counts dashes and other characters as words.
My function also handles double spaces, which many of the functions written by other people do not take into account.
This function also handles HTML tags. If two tags sit right next to each other and you just use the strip_tags function, their text is joined and counted as one word when there are two. For example: <h1>Title</h1>Text or <h1>Title</h1><p>Text</p>
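To illustrate the point about adjacent tags, here is a small sketch of what strip_tags does on its own and what changes when each tag is padded with spaces first. The variable names are only for illustration, and preg_split is used here just to count the pieces; it is not part of the function in this answer.

```php
<?php
$html = '<h1>Title</h1><p>Text</p>';

// strip_tags() alone glues the text of the two elements together.
$plain = strip_tags($html);
echo $plain, "\n"; // prints "TitleText"

// Padding every tag with spaces first keeps the two words apart.
$padded = strip_tags(str_replace(['<', '>'], [' <', '> '], $html));

// Split on runs of whitespace and ignore empty pieces.
$words = preg_split('/\s+/', $padded, -1, PREG_SPLIT_NO_EMPTY);
echo count($words), "\n"; // prints 2
```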
In addition, I first strip out JavaScript; otherwise the code inside the <script> tags would be counted as words.
Finally, my function handles spaces at the beginning and end of a line, multiple spaces in a row, and line breaks, carriage returns and tab characters.
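function count_words($str) {
    // Remove <script> blocks so their code is not counted.
    $str = preg_replace('~<\s*\bscript\b[^>]*>(.*?)<\s*\/\s*script\s*>~is', '', $str);
    // Turn line breaks, carriage returns and tabs into spaces.
    $str = str_replace(array("\n", "\r", "\t"), ' ', $str);
    // Pad tags with spaces so adjacent elements do not merge, then strip them.
    $str = strip_tags(str_replace('<', ' <', str_replace('>', '> ', $str)));
    // Keep only letters, digits and spaces.
    $str = preg_replace("/[^A-Za-z0-9 ]/", "", $str);
    // Collapse double spaces into single spaces.
    while (substr_count($str, '  ') > 0) {
        $str = str_replace('  ', ' ', $str);
    }
    $str = trim($str, ' ');
    // An empty string has no words.
    return $str === '' ? 0 : substr_count($str, ' ') + 1;
}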
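For comparison, the same steps can probably be written more compactly by splitting on whitespace instead of collapsing spaces in a loop. This is only a sketch: the name count_words_split and the use of preg_split are my assumptions here, not part of the function above, and it returns 0 for an empty string.

```php
<?php
// Hypothetical variant; the name and the preg_split() approach are
// assumptions for illustration, not the original function.
function count_words_split(string $str): int
{
    // Drop <script> blocks so their code is not counted as words.
    $str = preg_replace('~<\s*script\b[^>]*>.*?<\s*/\s*script\s*>~is', ' ', $str);
    // Pad tags with spaces so adjacent elements do not merge, then strip them.
    $str = strip_tags(str_replace(['<', '>'], [' <', '> '], $str));
    // Keep only ASCII letters, digits and whitespace.
    $str = preg_replace('/[^A-Za-z0-9\s]/', '', $str);
    // Split on runs of whitespace and discard empty pieces.
    $words = preg_split('/\s+/', $str, -1, PREG_SPLIT_NO_EMPTY);
    return count($words);
}

echo count_words_split('<h1>Title</h1><p>Text</p>'), "\n";            // prints 2
echo count_words_split("  Hello   world \n\t again "), "\n";          // prints 3
echo count_words_split('<script>var a = 1;</script>One - two'), "\n"; // prints 2
echo count_words_split(''), "\n";                                     // prints 0
```

Like the original, this removes punctuation before counting, so a hyphenated term such as "well-known" is counted as a single word.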
Sean gallagher