Get popular words in PHP + MySQL

How do I get the most popular words from multiple content tables in PHP / MySQL?

For example, I have a forum_post table holding forum posts; each row contains a subject and content. In addition, I have several other tables with different fields, which may also contain content to analyze.

I will probably fetch all the content myself, strip the HTML (possibly), explode the string on spaces, remove quotes, commas, etc., and just count the words that are not common, keeping an array of all words during the run.

My main question is: does anyone know of a method that would be simpler or faster?

I could not find any useful answers on this; I may have been using the wrong search terms.
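The plan above could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper name tallyWords, the tiny $commonWords list, and the sample $rows (standing in for text fetched from the various tables) are all hypothetical.

```php
<?php
// Hypothetical helper: adds the words of one text fragment to a running tally.
function tallyWords(string $html, array &$counts, array $commonWords): void
{
    // Drop HTML tags, decode entities such as &amp;, and lowercase (ASCII only).
    $text = strtolower(html_entity_decode(strip_tags($html), ENT_QUOTES, 'UTF-8'));

    // Split on anything that is not a letter, a digit or an apostrophe.
    $tokens = preg_split("/[^\\p{L}\\p{N}']+/u", $text, -1, PREG_SPLIT_NO_EMPTY);

    foreach ($tokens as $token) {
        $token = trim($token, "'"); // remove surrounding single quotes
        if ($token === '' || in_array($token, $commonWords, true)) {
            continue; // skip empty tokens and common words
        }
        $counts[$token] = ($counts[$token] ?? 0) + 1;
    }
}

// Assumption: a very small list of common words, for illustration only.
$commonWords = ['the', 'a', 'and', 'is', 'of', 'to'];

// Assumption: these strings stand in for rows read from several tables.
$rows = [
    '<p>The forum is about PHP, and PHP only.</p>',
    'MySQL and PHP: "a classic" pairing of tools.',
];

$counts = [];
foreach ($rows as $row) {
    tallyWords($row, $counts, $commonWords);
}

arsort($counts); // most frequent words first
print_r(array_slice($counts, 0, 3, true));
```

Because the tally array is passed by reference, the same function can be called for every column of every table, and the counts simply accumulate.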

php mysql
2 answers

Someone has already done this.

The magic you are looking for is a PHP function called str_word_count().

In my code example below, if you get a lot of extraneous words, you will need to write a custom function to remove them. In addition, you will want to strip all HTML tags and other non-word characters from the text.

I am using something similar to this to generate keywords (obviously that code is proprietary). In short, we take the provided text, count how often each word occurs, and sort the words in an array by frequency, so the most common words come out first. Words that occur only once are ignored.
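A minimal sketch of such a cleanup step, assuming a hypothetical helper named extractKeywords and an illustrative stop-word list (neither comes from the original code). It strips tags with strip_tags(), splits with str_word_count(), and drops stop words and very short words.

```php
<?php
// Hypothetical helper: returns the words of $html that are worth counting.
function extractKeywords(string $html, array $stopWords): array
{
    // Remove HTML tags, decode entities, and lowercase (ASCII only).
    $text = strtolower(html_entity_decode(strip_tags($html), ENT_QUOTES, 'UTF-8'));

    // Format 1 makes str_word_count() return an array of the words found.
    $words = str_word_count($text, 1);

    // Keep words longer than two characters that are not stop words;
    // array_values() reindexes the result from zero.
    return array_values(array_filter($words, function ($w) use ($stopWords) {
        return strlen($w) > 2 && !in_array($w, $stopWords, true);
    }));
}

// Assumption: an illustrative stop-word list; a real one would be much longer.
$stopWords = ['the', 'and', 'for', 'with'];

$keywords = extractKeywords('<b>The cat</b> and the dog play with the cat.', $stopWords);
$freq = array_count_values($keywords); // word => number of occurrences
arsort($freq);                         // most frequent first
print_r($freq);
```

The filtered list can be fed into the same frequency count as unfiltered text, so the cleanup stays separate from the counting.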

<?php
$text = "your text.";

// Set up the array for storing word counts
$freqData = array();

foreach (str_word_count($text, 1) as $word) {
    // For each word found, increment its entry in the frequency table by one
    if (array_key_exists($word, $freqData)) {
        $freqData[$word]++;
    } else {
        $freqData[$word] = 1;
    }
}

// Sort by count, highest first, keeping the word => count association
arsort($freqData);

$list = '';
foreach ($freqData as $word => $count) {
    // Skip words that occur only once (the original test was $count > 2,
    // which also dropped words occurring twice)
    if ($count > 1) {
        $list .= "$word ";
    }
}

if (empty($list)) {
    $list = "Not enough duplicate words for popularity contest.";
}

echo $list;
?>

I see that you have accepted an answer, but I want to give you an alternative that may be more flexible in some respects (decide for yourself :-)). I have not tested the code, but I think you get the picture. $dbh is the PDO connection object. What you then do with the resulting $words array is up to you.

<?php
// $dbh is an existing PDO connection object.
$words = array('word' => array(), 'wordcount' => array());

countWordsFromTable($dbh, $words, 'party');  // The name of the table
countWordsFromTable($dbh, $words, 'party2'); // The name of the table

// Example output array:
/*
$words['word'][0] = 'happy';          // Happy from table party
$words['wordcount'][0] = 5;
$words['word'][1] = 'bulldog';        // Bulldog from table party2
$words['wordcount'][1] = 15;
$words['word'][2] = 'pokerface';      // Pokerface from table party2
$words['wordcount'][2] = 2;
*/

if (!empty($words['wordcount'])) {
    // Get all indexes that hold the maximum value of the wordcount list
    $maxValues = array_keys($words['wordcount'], max($words['wordcount']));
    $popularIndex = $maxValues[0]; // Get only one value...
    $mostPopularWord = $words['word'][$popularIndex];
}

function countWordsFromTable(PDO $dbh, array &$words, $tableName)
{
    // Table and column names cannot be bound as parameters,
    // so they are quoted as identifiers and placed in the SQL directly.
    $table = '`' . str_replace('`', '``', $tableName) . '`';

    // Get all fields from the specific table
    $q = $dbh->query("DESCRIBE $table");
    $tableFields = $q->fetchAll(PDO::FETCH_COLUMN);

    // Go through all fields and store their content and word count in $words
    foreach ($tableFields as $dbCol) {
        $col = '`' . str_replace('`', '``', $dbCol) . '`';

        // Each row yields the column content and its number of
        // space-separated words (spaces + 1)
        $wordCountQuery = "SELECT $col AS word, "
            . "LENGTH($col) - LENGTH(REPLACE($col, ' ', '')) + 1 AS wordcount "
            . "FROM $table";
        $q = $dbh->query($wordCountQuery);

        // Add result to array $words
        foreach ($q->fetchAll(PDO::FETCH_ASSOC) as $w) {
            $words['word'][] = $w['word'];
            $words['wordcount'][] = $w['wordcount'];
        }
    }
}
?>