An algorithm for creating all possible combinations of letters of a given string up to two letters

An algorithm for creating all possible combinations of letters of a given string up to two letters

Trying to create an Anagram solver in AS3, such as this one found here:

http://homepage.ntlworld.com/adam.bozon/anagramsolver.htm

I had a problem wrapping my brain around creating all possible letter combinations for different line lengths. If I only generated permutations for a fixed length, that would not be such a problem for me ... but I am looking to reduce the length of the string and get all possible permutations from the original set of letters for a string with a maximum length less than the original string. For example, let's say I want the string length to be 2, but I have a 3-line string "abc", the output will look like this: ab ac ba bc ca cb.

Ideally, the algorithm would create a complete list of possible combinations, starting from the original line length, up to the smallest line length 2. I have a feeling that for this there is probably a small recursive algorithm, but cannot wrap my brain around it. I work in AS3.

Thanks!

+7
algorithm actionscript-3 permutation
source share
5 answers

To write an anagram solution of the type you are binding, the algorithm you request is not required. It is also VERY expensive.

Let's look at a 6-letter word, for example, MONKEY . All 6 letters of the word are different, so you should create:

  • 6 * 5 * 4 * 3 * 2 * 1 different 6-letter words
  • 6 * 5 * 4 * 3 * 2 different 5-letter words
  • 6 * 5 * 4 * 3 different 4-letter words
  • 6 * 5 * 4 different three-letter words
  • 6 * 5 different two-word words
  • A total of 1950 words

Now, apparently, you are not trying to spit out all the words of 1950 (for example, "OEYKMN") in the form of anagrams (as they are, but most of them are also gibberish). I assume that you have a dictionary of legal English words, and you just want to check if any of these words are anagrams of the query word, with the possibility not to use all the letters.

If so, then the problem is simple.

To determine if 2 words are anagrams of each other, all you have to do is calculate how many times each letter is used and compare these numbers!

We restrict ourselves to only 26 letters, case insensitive. You need to write a countLetters function that takes a word and returns an array of 26 numbers. The first number in the array corresponds to the count of the letter A in the word, the second number corresponds to the count of B , etc.

Then the two words W1 and W2 are exact anagrams if countLetters(W1)[i] == countLetters(W2)[i] for every i ! That is, every word uses every letter exactly the same number of times!

For what I would call subanabrams ( MONEY is a subanagram of MONKEY ), W1 is a subanagram of W2 if countLetters(W1)[i] <= countLetters(W2)[i] for every i ! That is, a subanagram can use less than certain letters, but no more!

(note: MONKEY also a subanagram of MONKEY ).


This should give you a reasonably fast algorithm in which the query string is set, all you have to do is read the dictionary once, comparing the array of letter counters of each word with the array of letter counters of the query word. You can do a little optimization, but that should be good enough.

Alternatively, if you want to get maximum performance, you can pre-process the dictionary (which is known in advance) and create an oriented acyclic graph of the relationship between the anagrams.

Here is part of such a graph to illustrate:

  D=1,G=1,O=1 ----------> D=1,O=1 {dog,god} \ {do,od} \ \-------> G=1,O=1 {go} 

Basically, each node is a bucket for all words that have the same array of letter counts (i.e. they are exact anagrams). Then there is a node from N1 to N2 if the array N2 is <= (as defined above) N1 array (you can perform a transient reduction to store the least number of edges).

Then, to list all the meanings of the word, you need to find the node corresponding to its letter counting array and recursively examine all the nodes available from this node. All of their buckets contain subanagrams.

+7
source share

The following js code will find all possible β€œwords” in the n letter word. Of course, this does not mean that they are real words, but it gives you all the combinations. On my machine, it takes about 0.4 seconds for a 7-letter word and 15 seconds for a 9-letter word (up to almost a million possibilities if there are no duplicate letters). However, in those days you can find a dictionary in the dictionary and find what are real words.

 var getWordsNew=function(masterword){ var result={} var a,i,l; function nextLetter(a,l,key,used){ var i; var j; if(key.length==l){ return; } for(i=0;i<l;i++){ if(used.indexOf(""+i)<0){ result[key+a[i]]=""; nextLetter(a,l,key+a[i],used+i); } } } a=masterword.split(""); l=a.length; for (i = 0; i < a.length; i++) { result[a[i]] = ""; nextLetter(a, l, a[i], "" + i) } return result; } 

Complete code

Code for finding words in words

+3
source share

You need some kind of arrangement. If you are familiar with the permutation algorithm, you know that you have a check to see when you have created enough numbers. Just change this limit:

I do not know AS3, but here is the pseudo code:

 st = an array Arrangements(LettersInYourWord, MinimumLettersInArrangement, k = 1) if ( k > MinimumLettersInArrangements ) { print st; } if ( k > LettersInYourWord ) return; for ( each position i in your word that hasn't been used before ) st[k] = YourWord[i]; Arrangements(<same>, <same>, k + 1); 

for "abc" and "Ordering" (3, 2, 1); this will print:

 ab abc ac acb ... 

If you want the first three and then the two, consider this:

 st = an array Arrangements(LettersInYourWord, DesiredLettersInArrangement, k = 1) if ( k > DesiredLettersInArrangements ) { print st; return } for ( each position i in your word that hasn't been used before ) st[k] = YourWord[i]; Arrangements(<same>, <same>, k + 1); 

Then to call "abc" Arrangements(3, 3, 1); and then Arrangements(3, 2, 1);

0
source share

You can generate all the words in the alphabet by finding all the paths in the full letter schedule. You can find all the paths in this graph by doing the first depth search from each letter and returning the current path at each point.

0
source share

There is a simple O (N), where n is the size of the dictionary. Just sort the letters in each word in the dictionary or better, create a binary mask, and then compare the white letters that you have.

0
source share

All Articles