Which collection should I use?

Question

I have about 10,000 entries. Each record has 2 fields: one field contains a string up to 300 characters long, and another field contains a decimal. It is like a product catalog with product names and the price of each product.

What I need to do is let the user enter any word and display a list of all products whose names contain that word, along with their prices. That's all.

What type of collection is best for this scenario?
If I need to sort by product name or price, does the choice stay the same?

I am using an XML file now, but I thought it would be easier to use a collection so that I could put all the values directly in the code. Thanks for your suggestions.

collections c# xml

user763554 Dec 24 '11 at 8:08

2 answers

10K records is not a lot.

A Dictionary<string, decimal> would fit the bill. You can sort by key or by value using LINQ, and search as well.

This assumes that product names are unique.
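
As a rough illustration of the suggestion above, here is a minimal sketch of searching and sorting a Dictionary<string, decimal> with LINQ. The sample products, the `Search` helper, and the case-insensitive matching are assumptions made for the example, not part of the original answer. It is written with C# 9 top-level statements to stay short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data; the real catalog would hold about 10,000 entries.
var catalog = new Dictionary<string, decimal>
{
    ["Apple Juice 1L"] = 2.49m,
    ["Green Apple"] = 0.59m,
    ["Banana"] = 0.25m,
};

// Case-insensitive substring search over the product names (a linear scan).
List<KeyValuePair<string, decimal>> Search(string word) =>
    catalog.Where(p => p.Key.IndexOf(word, StringComparison.OrdinalIgnoreCase) >= 0)
           .ToList();

var hits = Search("apple");

// The same result set can be ordered by name or by price.
var byName = hits.OrderBy(p => p.Key, StringComparer.OrdinalIgnoreCase).ToList();
var byPrice = hits.OrderBy(p => p.Value).ToList();

foreach (var product in byPrice)
    Console.WriteLine($"{product.Key}: {product.Value}");
```

With 10K entries a linear scan like this is usually fast enough, so sorting by name or price does not change the choice of collection.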

Odded Dec 24 '11 at 8:12

Tim medora · Accepted Answer · 2011-12-24T08:24:53+0000

A dictionary will do the job. However, if you need fast partial matches (for example, searching as the user types), you can get better performance by creating multiple keys that point to the same item. For example, the word “Apple” could be found under “Ap,” “App,” “Appl,” and “Apple.”
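
To make the idea of multiple keys concrete, here is a small sketch that generates the prefix keys for one word. The `PrefixKeys` helper, its default length limits, and the lower-casing are assumptions chosen for illustration; the answer does not prescribe them.

```csharp
using System;
using System.Collections.Generic;

// Yields every prefix of `word` from minLength up to maxLength characters,
// lower-cased so that later lookups can ignore case.
IEnumerable<string> PrefixKeys(string word, int minLength = 2, int maxLength = 6)
{
    var normalized = word.ToLowerInvariant();
    var longest = Math.Min(normalized.Length, maxLength);
    for (var length = minLength; length <= longest; length++)
        yield return normalized.Substring(0, length);
}

Console.WriteLine(string.Join(", ", PrefixKeys("Apple")));
// prints: ap, app, appl, apple
```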

I used this approach for a similar number of records with very good results. I turned 10K source items into 50K unique keys. Each of these dictionary entries points to a list containing references to all matches for that term. You can then search this much smaller list more efficiently. Despite the large number of lists this creates, the memory footprint is quite reasonable.

You can also create custom keys if you want to redirect common misspellings or point to related items. This also removes most problems with unique keys, since each key points to a list. Each item can be classified by each of the words in its name; this is extremely useful if you have long product names with several words in them. When classifying your items, each word in the name can be mapped to one or more keys.
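
One possible way to add such custom keys is an alias table whose entries reuse an existing list. The sketch below assumes a simplified index of product names and a hypothetical `aliases` dictionary; both are made up for the example.

```csharp
using System;
using System.Collections.Generic;

// Simplified index: key -> names of the products filed under that key.
var index = new Dictionary<string, List<string>>
{
    ["apple"] = new List<string> { "Apple Juice 1L", "Green Apple" },
};

// Hypothetical alias table: misspelling or related term -> existing key.
var aliases = new Dictionary<string, string>
{
    ["aple"] = "apple",
    ["appel"] = "apple",
};

foreach (var alias in aliases)
{
    // Share the same list instance rather than copying it,
    // so the alias costs one extra dictionary entry and no extra list.
    if (index.TryGetValue(alias.Value, out var bucket) && !index.ContainsKey(alias.Key))
        index[alias.Key] = bucket;
}

Console.WriteLine(string.Join(", ", index["aple"]));
```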

I should also point out that creating and classifying 10K items should not take long if done correctly (a couple of hundred milliseconds is reasonable). The results can be cached for as long as you like using Application, Cache, or static members.

To summarize, the resulting structure is a Dictionary<string, List<T>>, where the string is a short (2-6 characters works well) but unique key. Each key points to a List<T> (or another collection, if you are so inclined) of items that match that key. When a search is performed, you find the key that matches the term provided by the user. Depending on the length of your keys, you may need to truncate the user's search term to the maximum key length. Once you have located the right child collection, you search that collection for a full or partial match using whatever method you like.
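
The structure summarized above could be sketched roughly as follows. This is an illustration under stated assumptions, not the answer's actual code: the sample products, the `MinKey`/`MaxKey` limits, the `Search` helper, and the choice to match on word prefixes are all made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

const int MinKey = 2, MaxKey = 6; // assumed key lengths (2-6 characters)

// Hypothetical sample data standing in for the real catalog.
var products = new List<(string Name, decimal Price)>
{
    ("Apple Juice 1L", 2.49m),
    ("Green Apple", 0.59m),
    ("Apricot Jam", 3.10m),
};

// Build step: every word of every name contributes its prefixes as keys.
var index = new Dictionary<string, List<(string Name, decimal Price)>>();
foreach (var product in products)
{
    var words = product.Name.ToLowerInvariant()
        .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    foreach (var word in words)
    {
        for (var len = MinKey; len <= Math.Min(word.Length, MaxKey); len++)
        {
            var key = word.Substring(0, len);
            if (!index.TryGetValue(key, out var bucket))
                index[key] = bucket = new List<(string Name, decimal Price)>();
            if (!bucket.Contains(product)) // avoid duplicates for repeated words
                bucket.Add(product);
        }
    }
}

List<(string Name, decimal Price)> Search(string term)
{
    var normalized = term.Trim().ToLowerInvariant();
    if (normalized.Length < MinKey)
        return new List<(string Name, decimal Price)>();

    // Truncate the term to the longest key the index stores.
    var key = normalized.Substring(0, Math.Min(normalized.Length, MaxKey));
    if (!index.TryGetValue(key, out var bucket))
        return new List<(string Name, decimal Price)>();

    // Refine inside the small bucket: keep products that have
    // a word starting with the full search term.
    return bucket
        .Where(p => p.Name.ToLowerInvariant()
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Any(w => w.StartsWith(normalized, StringComparison.Ordinal)))
        .ToList();
}

foreach (var hit in Search("app").OrderBy(p => p.Price))
    Console.WriteLine($"{hit.Name}: {hit.Price}");
```

The dictionary lookup narrows 10K items down to one short list, and only that list is scanned for the final match, so the refinement step can be as simple or as fuzzy as needed.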

Finally, you can create a lightweight structure for each item in the list so that you can store additional information about the item. For example, you can create a small product class that stores the name, price, department, and popularity of a product. This can help you refine the results that you show to the user.

All in all, you can perform intelligent, detailed, fuzzy queries in real time.

The structures above should provide functionality roughly equivalent to a trie.
