Here is an example:
    $text = "A very nice únÌcÕdë text. Something nice to think about if you're into Unicode.";

    // $words = str_word_count($text, 1); // use this function if you only want ASCII
    $words = utf8_str_word_count($text, 1); // use this function if you care about i18n

    $frequency = array_count_values($words);
    arsort($frequency);

    echo '<pre>';
    print_r($frequency);
    echo '</pre>';
Output:
    Array
    (
        [nice] => 2
        [if] => 1
        [about] => 1
        [you're] => 1
        [into] => 1
        [Unicode] => 1
        [think] => 1
        [to] => 1
        [very] => 1
        [únÌcÕdë] => 1
        [text] => 1
        [Something] => 1
        [A] => 1
    )
And the utf8_str_word_count() function, if you need it:
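    function utf8_str_word_count($string, $format = 0, $charlist = null)
    {
        $result = array();

        // A "word" is a run of letters, combining marks, dashes, apostrophes
        // (' and U+2019), plus any extra characters passed in $charlist.
        // The (string) cast avoids passing null to preg_quote(), which is
        // deprecated as of PHP 8.1.
        $pattern = '~[\p{L}\p{Mn}\p{Pd}\'\x{2019}' . preg_quote((string) $charlist, '~') . ']+~u';

        if (preg_match_all($pattern, $string, $result) > 0) {
            if (array_key_exists(0, $result) === true) {
                $result = $result[0]; // keep only the full matches
            }
        }

        // $format = 0 returns the number of words; any other value returns the list of words.
        if ($format == 0) {
            $result = count($result);
        }

        return $result;
    }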
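Note that the count above is case-sensitive, so "Nice" and "nice" would be tallied separately. If that is not what you want, one possible variation is to fold case before counting. The following is only a sketch: `utf8_word_frequency()` is a hypothetical helper name that is not part of the answer above, it assumes the `mbstring` extension is available, and it uses the same character class as `utf8_str_word_count()` without a `$charlist`.

```php
<?php
// Hypothetical helper (an assumption, not from the original answer):
// case-insensitive word frequency for UTF-8 text. Requires ext-mbstring.
function utf8_word_frequency($text)
{
    $matches = array();

    // Same character class as utf8_str_word_count() with no $charlist.
    preg_match_all('~[\p{L}\p{Mn}\p{Pd}\'\x{2019}]+~u', $text, $matches);

    // Lowercase every word so that "Nice" and "nice" share one key.
    $words = array_map(function ($word) {
        return mb_strtolower($word, 'UTF-8');
    }, $matches[0]);

    $frequency = array_count_values($words);
    arsort($frequency); // highest counts first, keys preserved

    return $frequency;
}

$frequency = utf8_word_frequency('Nice text, nice únÌcÕdë TEXT.');
print_r($frequency);
```

With this input, "nice" and "text" are each counted twice and the accented word once, under its lowercased key.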
Alix axel