I am writing something that takes a block of text and breaks it into possible database queries that can be used to find similar blocks of text (something like the list of similar questions that appears while I type this). Main process:
- Remove stop words from the text
- Remove special characters
- From the remaining text, create an array of unique "stems"
- Create an array of all possible combinations of the stems (this is where I got stuck ... sort of)
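To make the steps above concrete, here is a minimal sketch of the pipeline in JavaScript. It is not my exact code. `STOP_WORDS`, `naiveStem`, `uniqueStems` and `allCombinations` are hypothetical names made up for illustration, the stop-word list is a tiny placeholder, and the suffix-stripping "stemmer" is only a crude stand-in for a real one such as Porter.

```javascript
// Hypothetical stop-word list; a real one would be much longer.
const STOP_WORDS = new Set(["a", "an", "the", "is", "of", "to", "in", "and", "for", "on"]);

// Naive stand-in for a real stemmer: strips a few common English suffixes.
function naiveStem(word) {
  return word.replace(/(ing|ed|es|s)$/, "");
}

// Steps 1-3: lowercase, strip special characters, drop stop words, keep unique stems.
function uniqueStems(text) {
  const words = text
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, " ")
    .split(/\s+/)
    .filter((w) => w.length > 0 && !STOP_WORDS.has(w));
  const stems = words.map(naiveStem).filter((s) => s.length > 0);
  return [...new Set(stems)]; // a Set keeps first-seen order and removes duplicates
}

// Step 4: every non-empty combination of the stems.
// For n stems this builds 2^n - 1 arrays, so it is only usable for small n.
function allCombinations(stems) {
  let result = [[]];
  for (const stem of stems) {
    // Each existing combination appears once without and once with the new stem.
    result = result.concat(result.map((combo) => [...combo, stem]));
  }
  return result.slice(1); // drop the empty combination
}
```

Because `allCombinations` doubles the result array for every stem, 25 unique stems already mean more than 33 million arrays held in memory at once.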
Here is what I have so far:
This works, but on blocks of text longer than 25 words it crashes my browser. I understand that mathematically the number of possible combinations can be huge. I'd like to know:
- Is there a more efficient way to do this?
- How can I set a minimum / maximum length for the resulting arrays?
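To show what I mean by both questions, here is a rough sketch of one possible direction, assuming it is acceptable to only consider combinations of between `minSize` and `maxSize` stems. The names `choose`, `countCombinations` and `boundedCombinations` are hypothetical. The idea is to count the combinations up front and to produce them lazily with a generator instead of building one huge array.

```javascript
// Number of ways to choose k items from n (binomial coefficient).
function choose(n, k) {
  if (k < 0 || k > n) return 0;
  let result = 1;
  for (let i = 1; i <= k; i++) {
    result = (result * (n - k + i)) / i; // stays an integer at every step
  }
  return Math.round(result);
}

// How many combinations contain between minSize and maxSize of the n stems.
function countCombinations(n, minSize, maxSize) {
  let total = 0;
  for (let k = minSize; k <= Math.min(maxSize, n); k++) {
    total += choose(n, k);
  }
  return total;
}

// Lazily yield combinations of size minSize..maxSize (assumes minSize >= 1).
// Only one combination is held in memory at a time.
function* boundedCombinations(stems, minSize, maxSize) {
  function* pick(start, combo) {
    if (combo.length >= minSize) yield combo.slice();
    if (combo.length === maxSize) return; // do not grow past the upper bound
    for (let i = start; i < stems.length; i++) {
      combo.push(stems[i]);
      yield* pick(i + 1, combo);
      combo.pop();
    }
  }
  yield* pick(0, []);
}

// Usage: all combinations of 1 or 2 stems out of 3.
for (const combo of boundedCombinations(["a", "b", "c"], 1, 2)) {
  console.log(combo.join(" "));
}
```

With 25 stems, `countCombinations(25, 1, 25)` is 33,554,431, while `countCombinations(25, 1, 3)` is only 2,625, so capping the combination size is what makes the difference, not the way the arrays are built.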
Hartyetech