Building an in-memory text index in C#

I have a performance-sensitive task and am considering keeping all of my objects, about 100,000 items, in memory. (They are persisted in MS SQL, but copied to memory to speed up complex searches.)

Key lookups are fast enough, but text search (for example, Contains) is relatively slow. It takes about 30 ms per request:

IEnumerable<Product> result =
   products.Where(p =>
   p.Title.Contains(itemnames[rnd.Next(itemnames.Length)]));

I already tried the db4o in-memory database, but its performance is even worse: about 1.5 seconds to search 100,000 items.

What are the options for making this faster without scanning every Title?

Which in-memory database could I use to solve this problem?

4 answers

Do you have the opportunity to change the data structure in which your products are stored? One way to speed up the Contains search is to store every possible substring of Product.Title in a Dictionary<string, List<Product>>. A search then becomes a single hash lookup, roughly O(1) in the number of products, instead of an O(n) scan over all of them.

You can generate each substring as follows:

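// Extension methods must be declared inside a static, non-generic class.
public static IEnumerable<string> AllSubstrings(this string value)
{
    // Yield every substring: each start position combined with each length.
    // The same substring can be yielded more than once (e.g. "a" in "aa").
    for(int start = 0; start < value.Length; start++)
    {
        for(int length = 1; length <= value.Length - start; length++)
        {
            yield return value.Substring(start, length);
        }
    }
}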

Then you can fill out your dictionary as follows:

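var titleIndex = new Dictionary<string, List<Product>>();

foreach(Product product in products)
{
    // Distinct() (from System.Linq) keeps a product from being added twice
    // when the same substring occurs more than once in its title.
    foreach(string substring in product.Title.AllSubstrings().Distinct())
    {
        if(titleIndex.ContainsKey(substring))
        {
            titleIndex[substring].Add(product);
        }
        else
        {
            titleIndex[substring] = new List<Product> { product };
        }
    }
}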

And finally, you do your search like this:

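string searchString = itemnames[rnd.Next(itemnames.Length)];

// Declared outside the if so the results stay in scope; empty when nothing matches.
List<Product> searchResults;
if(!titleIndex.TryGetValue(searchString, out searchResults))
{
    searchResults = new List<Product>();
}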

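Note: as you might have guessed, building and storing this index costs more processor time and more memory. A title of length n has up to n(n+1)/2 substrings, so the index grows quickly with title length.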
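To get a feel for that cost before committing to the approach, the index can be built over a small sample and its size measured. The following is only a rough sketch: the sample titles are made up, and integer positions stand in for Product objects, so none of these names come from the original code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sample titles, standing in for Product.Title values.
string[] titles = { "Red Apple", "Green Apple", "Apple Pie" };

// Maps each substring to the positions of the titles that contain it.
var index = new Dictionary<string, List<int>>();
long keyChars = 0;

for (int id = 0; id < titles.Length; id++)
{
    string title = titles[id];

    // Collect each distinct substring of this title once.
    var seen = new HashSet<string>();
    for (int start = 0; start < title.Length; start++)
    {
        for (int length = 1; length <= title.Length - start; length++)
        {
            seen.Add(title.Substring(start, length));
        }
    }

    foreach (string s in seen)
    {
        if (!index.TryGetValue(s, out var ids))
        {
            ids = new List<int>();
            index[s] = ids;
            keyChars += s.Length; // rough measure of memory spent on keys
        }
        ids.Add(id);
    }
}

Console.WriteLine($"titles: {titles.Length}, keys: {index.Count}, key characters: {keyChars}");

// Case-sensitive lookup, like string.Contains: every sample title contains "pple".
if (index.TryGetValue("pple", out var hits))
{
    Console.WriteLine(string.Join(", ", hits.ConvertAll(i => titles[i])));
}
```

Running the same measurement on a few thousand real titles should show whether the key count stays within the memory you can afford for 100,000 items.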



SQLite.


SQLite, with full-text search (FTS3).
