Creating substring combinations from a string

I am trying to create all possible syllable combinations for a given word. The process of determining that a syllable is not relevant here, but it generates all the combinations that give me the problem. I think it is possible to do this recursively in several lines that I think (although any other way is fine), but I am having problems with its operation. Can anyone help?

    // how to test a syllable, just for the purpose of this example
    bool IsSyllable(string possibleSyllable) 
    {
        return Regex.IsMatch(possibleSyllable, "^(mis|und|un|der|er|stand)$");
    }

    List<string> BreakIntoSyllables(string word)
    {
       // the code here is what I'm trying to write 
       // if 'word' is "misunderstand" , I'd like this to return
       //  => {"mis","und","er","stand"},{ "mis","un","der","stand"}
       // and for any other combinations to be not included
    }
+5
source share
3 answers

Try to start with this:

var word = "misunderstand";

Func<string, bool> isSyllable =
    t => Regex.IsMatch(t, "^(mis|und|un|der|er|stand)$");

var query =
    from i in Enumerable.Range(0, word.Length)
    from l in Enumerable.Range(1, word.Length - i)
    let part = word.Substring(i, l)
    where isSyllable(part)
    select part;

This returns:

misunderstand-results

Did it help from the start?


EDIT: I thought a little more about this problem, came up with this pair of queries:

Func<string, IEnumerable<string[]>> splitter = null;
splitter =
    t =>
        from n in Enumerable.Range(1, t.Length - 1)
        let s = t.Substring(0, n)
        let e = t.Substring(n)
        from g in (new [] { new [] { e } }).Concat(splitter(e))
        select (new [] { s }).Concat(g).ToArray();

var query =
    from split in (new [] { new [] { word } }).Concat(splitter(word))
    where split.All(part => isSyllable(part))
    select split;

Now queryreturn this:

misunderstanding-results2

, .

+4

Tries. Trie trie # ( , ).

var trie = new Trie(new[] { "un", "que", "stio", "na", "ble", "qu", "es", "ti", "onable", "o", "nable" });
//var trie = new Trie(new[] { "u", "n", "q", "u", "e", "s", "t", "i", "o", "n", "a", "b", "l", "e", "un", "qu", "es", "ti", "on", "ab", "le", "nq", "ue", "st", "io", "na", "bl", "unq", "ues", "tio", "nab", "nqu", "est", "ion", "abl", "que", "stio", "nab" });

var word = "unquestionable";

var parts = new List<List<string>>();

Split(word, 0, trie, trie.Root, new List<string>(), parts);

//

public static void Split(string word, int index, Trie trie, TrieNode node, List<string> currentParts, List<List<string>> parts)
{   
    // Found a syllable. We have to split: one way we take that syllable and continue from it (and it done in this if).
    // Another way we ignore this possible syllable and we continue searching for a longer word (done after the if)
    if (node.IsTerminal)
    {
        // Add the syllable to the current list of syllables
        currentParts.Add(node.Word);

        // "covered" the word with syllables
        if (index == word.Length)
        {
            // Here we make a copy of the parts of the word. This because the currentParts list is a "working" list and is modified every time.
            parts.Add(new List<string>(currentParts));
        }
        else
        {
            // There are remaining letters in the word. We restart the scan for more syllables, restarting from the root.
            Split(word, index, trie, trie.Root, currentParts, parts);
        }

        // Remove the syllable from the current list of syllables
        currentParts.RemoveAt(currentParts.Count - 1);
    }

    // We have covered all the word with letters. No more work to do in this subiteration
    if (index == word.Length)
    {
        return;
    }

    // Here we try to find the edge corresponding to the current character

    TrieNode nextNode;

    if (!node.Edges.TryGetValue(word[index], out nextNode))
    {
        return;
    }

    Split(word, index + 1, trie, nextNode, currentParts, parts);
}

public class Trie
{
    public readonly TrieNode Root = new TrieNode();

    public Trie()
    {
    }

    public Trie(IEnumerable<string> words)
    {
        this.AddRange(words);
    }

    public void Add(string word)
    {
        var currentNode = this.Root;

        foreach (char ch in word)
        {
            TrieNode nextNode;

            if (!currentNode.Edges.TryGetValue(ch, out nextNode))
            {
                nextNode = new TrieNode();
                currentNode.Edges[ch] = nextNode;
            }

            currentNode = nextNode;
        }

        currentNode.Word = word;
    }

    public void AddRange(IEnumerable<string> words)
    {
        foreach (var word in words)
        {
            this.Add(word);
        }
    }
}

public class TrieNode
{
    public readonly Dictionary<char, TrieNode> Edges = new Dictionary<char, TrieNode>();
    public string Word { get; set; }

    public bool IsTerminal
    {
        get
        {
            return this.Word != null;
        }
    }
}

word - , parts (, List<string[]>, . parts.Add(new List<string>(currentParts)); parts.Add(currentParts.ToArray()); List<List<string>> List<string[]>.

Enigmativity, , , , , . , +1, . , . "" Trie (s): -)

Func<string, bool> isSyllable = t => Regex.IsMatch(t, "^(un|que|stio|na|ble|qu|es|ti|onable|o|nable)$");

Func<string, IEnumerable<string[]>> splitter = null;
splitter =
    t =>
        (
        from n in Enumerable.Range(1, t.Length - 1)
        let s = t.Substring(0, n)
        where isSyllable(s)
        let e = t.Substring(n)
        let f = splitter(e)
        from g in f
        select (new[] { s }).Concat(g).ToArray()
        )
        .Concat(isSyllable(t) ? new[] { new string[] { t } } : new string[0][]);

var parts = splitter(word).ToList();

:

        from n in Enumerable.Range(1, t.Length - 1)
        let s = t.Substring(0, n)
        where isSyllable(s)

, 1 - 1 , . . .

        let e = t.Substring(n)
        let f = splitter(e)

        from g in f
        select (new[] { s }).Concat(g).ToArray()

"" . , . , IEnumerable<IEnumerable<string>>, ToArray.

( , let,

        from g in splitter(t.Substring(n))
        select (new[] { s }).Concat(g).ToArray()

)

"" .

        .Concat(isSyllable(t) ? new[] { new string[] { t } } : new string[0][]);

, Concat , ( - isSyllable(t) ? new[] { new string[] { t } }.Concat(oldLambdaFunction) : oldLambdaFunction)

, , . Concat ( no Concat)

+3

, , , , , " ?". "syllable detection" 0 (n * n) , n = ( , , !), , ..

I know that you said that your process for determining whether or not a syllable is irrelevant, but allows you to change it so that it works more like autocomplete, i.e. convey the syllable at the beginning, and let it tell you that all syllables that are possible from now on will be much more scalable. Look at replacing it with trie if performance goes out of control.

0
source

All Articles