I am building a logical search system over a large number of documents. I built a dictionary whose keys are terms and whose values are hash sets containing the documents in which each term was found. To search for a single word, I look the word up in the dictionary and print the corresponding hash set. I also want to search for sentences. In that case I split the query into separate words and look up each word in the dictionary, so the number of hash sets returned depends on the number of words in the query. I then want to take the intersection of these hash sets, so that I can return the IDs of the documents that contain all the words in the query. My question is: what is the best way to intersect these hash sets?
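For illustration, the dictionary described above could be filled roughly as follows. This is only a sketch under assumptions: the class name InvertedIndexSketch, the method Build, and the idea that each document arrives as an (ID, text) pair are hypothetical and not part of my actual code.

```csharp
using System;
using System.Collections.Generic;

public static class InvertedIndexSketch
{
    // Hypothetical helper: maps each term to the set of IDs of the documents that contain it.
    // Assumption: each input pair is (document ID, document text).
    public static Dictionary<string, HashSet<string>> Build(
        IEnumerable<KeyValuePair<string, string>> documents)
    {
        var index = new Dictionary<string, HashSet<string>>();
        char[] separators = { ' ', ',', '.', ':', '?', '!', '\t' };

        foreach (KeyValuePair<string, string> doc in documents)
        {
            // Split the text into terms, skipping empty entries between adjacent separators.
            string[] terms = doc.Value.Split(separators, StringSplitOptions.RemoveEmptyEntries);
            foreach (string term in terms)
            {
                HashSet<string> ids;
                if (!index.TryGetValue(term, out ids))
                {
                    // First time this term is seen: create its posting set.
                    ids = new HashSet<string>();
                    index[term] = ids;
                }
                ids.Add(doc.Key);
            }
        }
        return index;
    }
}
```

A single-word search is then one TryGetValue call on the returned dictionary.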
Currently I put the hash sets in a list and intersect them two at a time: first the first two sets, then that result with the third, and so on.
This is the code:
// Requires: using System; using System.Collections.Generic; using System.Linq;
Dictionary<string, HashSet<string>> dt = new Dictionary<string, HashSet<string>>(); // assume it is filled with data...

while (true)
{
    Console.WriteLine("\n\n\nEnter the query you want to search");
    string inp = Console.ReadLine();
    // Split the query into words, dropping empty entries produced by adjacent separators.
    string[] words = inp.Split(new Char[] { ' ', ',', '.', ':', '?', '!', '\t' }, StringSplitOptions.RemoveEmptyEntries);
    List<HashSet<string>> outparr = new List<HashSet<string>>();

    foreach (string w in words)
    {
        HashSet<string> outp;
        if (dt.TryGetValue(w, out outp))
        {
            outparr.Add(outp);
            Console.WriteLine("Found {0} documents.", outp.Count);
            foreach (string s in outp)
            {
                Console.WriteLine(s);
            }
        }
    }

    // First() throws on an empty list, so stop here if no query word was found.
    if (outparr.Count == 0)
    {
        Console.WriteLine("Found 0 documents.");
        continue;
    }

    // Intersect the sets one at a time, starting from the first.
    HashSet<string> temp = outparr.First();
    foreach (HashSet<string> hs in outparr.Skip(1))
    {
        temp = new HashSet<string>(temp.Intersect(hs));
    }

    Console.WriteLine("Output After Intersection:");
    Console.WriteLine("Found {0} documents: ", temp.Count);
    foreach (string s in temp)
    {
        Console.WriteLine(s);
    }
}
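For comparison, here is a minimal sketch of one possible variant of the same pairwise intersection, not a definitive answer. It starts from the smallest set and uses HashSet<T>.IntersectWith to shrink one working copy in place instead of allocating a new set at every step. The names SetIntersectionSketch and IntersectAll are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class SetIntersectionSketch
{
    // Hypothetical helper: returns the elements common to all the given sets.
    // Returns an empty set when no sets are supplied.
    public static HashSet<string> IntersectAll(IEnumerable<HashSet<string>> sets)
    {
        // Process the smallest set first so the working set is as small as possible.
        List<HashSet<string>> ordered = sets.OrderBy(s => s.Count).ToList();
        if (ordered.Count == 0)
        {
            return new HashSet<string>();
        }

        // Copy the smallest set so the sets stored in the dictionary are never modified.
        var result = new HashSet<string>(ordered[0]);
        foreach (HashSet<string> next in ordered.Skip(1))
        {
            // Removes from result every element that is not also in next.
            result.IntersectWith(next);
            if (result.Count == 0)
            {
                break; // Nothing left to intersect.
            }
        }
        return result;
    }
}
```

In the loop above it would be called as HashSet<string> temp = SetIntersectionSketch.IntersectAll(outparr);. Since the result can never be larger than the smallest input set, starting from that set keeps every later step working on few elements, and the early exit skips the remaining sets once the result is empty.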
user2603796