Phrase Separation Algorithm in PHP

I do not know how to explain it. Let me use an example. Say I want to split the sentence

"Today is a great day."

into

    today
    today is
    today is a
    today is a great
    today is a great day
    is
    is a
    is a great
    is a great day
    a
    a great
    a great day
    great
    great day
    day

The idea is to get every run of consecutive words in the sentence.

I think PHP is the best way to do this. Any ideas are welcome.

+6
php
4 answers

Here is an example:

    $sentence = 'Today is a great day.';

    // Only leave "word" characters and whitespace
    $sentence = preg_replace('/[^\w\s]+/', '', strtolower($sentence));

    // Tokenize
    $tokens = explode(' ', $sentence);

    for($i = 0; $i < count($tokens); $i++) {
        for($j = 1; $j <= count($tokens) - $i; $j++) {
            echo implode(' ', array_slice($tokens, $i, $j)) . "<br />";
        }
    }

Output:

    today
    today is
    today is a
    today is a great
    today is a great day
    is
    is a
    is a great
    is a great day
    a
    a great
    a great day
    great
    great day
    day
+10

Split the sentence into an array of words using PHP's explode function. Then use two nested loops. The outer loop (i) runs over the array indices (0..count(array)-1) and fixes the first word of each output line. The inner loop (j) runs from i + 1 to the length of the array. Inside the inner loop, output the words from i to j-1: take that sub-array of the word array with array_slice and join it with implode.
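The description above can be sketched roughly as follows. This is a minimal illustration, not the answerer's own code: the function name contiguousPhrases is hypothetical, and it assumes the words are separated by single spaces and that punctuation has already been removed.

```php
<?php
// Hypothetical helper name; the answer describes the loops but names no function.
function contiguousPhrases(string $sentence): array
{
    // Assumption: words are separated by single spaces.
    $words = explode(' ', $sentence);
    $count = count($words);
    $phrases = [];

    // Outer loop: $i is the index of the first word of each phrase.
    for ($i = 0; $i < $count; $i++) {
        // Inner loop: $j is one past the last word, so it runs from $i + 1 to $count.
        for ($j = $i + 1; $j <= $count; $j++) {
            // array_slice takes (array, offset, length); the length here is $j - $i.
            $phrases[] = implode(' ', array_slice($words, $i, $j - $i));
        }
    }

    return $phrases;
}

echo implode(PHP_EOL, contiguousPhrases('today is a great day')) . PHP_EOL;
```

Returning an array instead of echoing inside the loop is a design choice of this sketch; it leaves the caller free to print the phrases, deduplicate them, or store them.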

0

Recursive approach:

    function iterate($words) {
        if(($total = count($words)) > 0) {
            $str = '';
            for($i = 0; $i < $total; $i++ ) {
                $str .= ' ' . $words[$i];
                echo $str . PHP_EOL;
            }
            array_shift($words);
            iterate($words);
        }
    }

    $text = "Today is a great day.";
    $words = str_word_count($text, 1);
    iterate($words);

The code above only considers words: numbers and punctuation do not count as words, so they are dropped. It does not remove duplicates. With the five-word test sentence, the recursive approach is only negligibly faster than the array_slice solution. The gap, however, grows significantly with each additional word: in a quick test on my machine, a ten-word sentence finished in almost half the time.


Disclaimer: Isolated benchmarks depend on a number of factors and may produce different results on different machines. At best they give an indication of code performance (often in the realm of micro-optimization), but nothing more.
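For anyone who wants to repeat such a comparison, here is one possible timing harness. It is only a sketch under several assumptions: the function names slicePhrases and recursivePhrases are hypothetical, both variants collect phrases into an array instead of echoing them (which changes what is being measured), and the ten-word input and the repetition count are arbitrary. It is not the benchmark the answerer ran.

```php
<?php
// Hypothetical name: nested loops with array_slice, as in the first answer.
function slicePhrases(array $tokens): array
{
    $out = [];
    $n = count($tokens);
    for ($i = 0; $i < $n; $i++) {
        for ($j = 1; $j <= $n - $i; $j++) {
            $out[] = implode(' ', array_slice($tokens, $i, $j));
        }
    }
    return $out;
}

// Hypothetical name: recursive variant that appends to $out instead of echoing.
function recursivePhrases(array $words, array &$out): void
{
    if (count($words) === 0) {
        return;
    }
    $str = '';
    foreach ($words as $word) {
        // Build the phrase one word at a time, without a leading space.
        $str = $str === '' ? $word : $str . ' ' . $word;
        $out[] = $str;
    }
    array_shift($words);          // drop the first word (local copy only)
    recursivePhrases($words, $out);
}

// Assumed input: ten words.
$words = explode(' ', 'one two three four five six seven eight nine ten');
$runs = 10000; // arbitrary repetition count

$start = microtime(true);
for ($r = 0; $r < $runs; $r++) {
    slicePhrases($words);
}
$sliceSeconds = microtime(true) - $start;

$start = microtime(true);
for ($r = 0; $r < $runs; $r++) {
    $out = [];
    recursivePhrases($words, $out);
}
$recursiveSeconds = microtime(true) - $start;

printf("array_slice: %.4f s, recursive: %.4f s%s", $sliceSeconds, $recursiveSeconds, PHP_EOL);
```

The printed timings will vary with the PHP version and the machine, so they should be read as a rough indication only.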

0
    $phrase = 'Today is a great day';
    $pieces = explode(' ', strtolower($phrase));
    $sets = array();

    for ($i=0; $i<count($pieces);$i++) {
        for ($j=0; $j<count($pieces);$j++) {
            if ($i<=$j)
                $sets[$i][] = $pieces[$j];
        }
    }

    print "<ul>";
    foreach($sets as $set) {
        while(count($set) > 0) {
            print "<li>" . implode(' ', $set) . "</li>\n";
            array_pop($set);
        }
    }
    print "</ul>";

Result:

  • today is a great day
  • today is a great
  • today is a
  • today is
  • today
  • is a great day
  • is a great
  • is a
  • is
  • a great day
  • a great
  • a
  • great day
  • great
  • day
0
