Can my code be improved by using LINQ?

I have this code that works fine, but it is slow on large datasets.

I would like to hear from experts whether this code could use LINQ or another method to speed it up, and if so, how.

    Dim array_of_strings As String()
    ' Now I add strings to my array; these come from external file(s).
    ' This does not take long.

    ' Throughout the execution of my program, I need to validate millions
    ' of other strings.
    Dim search_string As String
    Dim indx As Integer

    ' So we get millions of situations like this, where I need to find out
    ' where in the array I can find a duplicate of this exact string.
    search_string = "the_string_search_for"
    indx = array_of_strings.ToList().IndexOf(search_string)

Each of the strings in my array is unique, without duplicates.

This gives correct results but, as I said, it is too slow for large datasets. I run this lookup millions of times. It currently takes about 1 minute for a million lookups, which is too slow for me.

+6
3 answers

There is no need to use LINQ. If you used an indexed data structure such as a dictionary, each search would be O(1) on average for a hash-based Dictionary (O(log n) for a sorted structure such as SortedSet<T>), at the cost of a slightly longer process of filling the structure. But you fill it once and then do a million searches, so you will come out ahead.

See the Dictionary documentation: https://msdn.microsoft.com/en-us/library/7y3x785f(v=vs.110).aspx

Since (I think) you are talking about a collection whose elements are their own keys, you can save some memory by using SortedSet<T>: https://msdn.microsoft.com/en-us/library/dd412070(v=vs.110).aspx
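Because the question asks for the position of the string in the array and not just whether it is present, one possible sketch is a Dictionary(Of String, Integer) that maps each string to its array index. The helper names BuildIndex and FindIndex, the sample data, and the choice of StringComparer.Ordinal (exact, case-sensitive matching) are illustrative assumptions, not part of the original code. Note also that the original line calls ToList() on every lookup, which copies the whole array each time before the linear IndexOf scan.

```vb
Imports System
Imports System.Collections.Generic

Module IndexLookupSketch

    ' Build the lookup once: each string maps to its position in the array.
    ' Assumes the strings are unique, as stated in the question.
    Function BuildIndex(items As String()) As Dictionary(Of String, Integer)
        Dim index As New Dictionary(Of String, Integer)(items.Length, StringComparer.Ordinal)
        For i As Integer = 0 To items.Length - 1
            index.Add(items(i), i) ' throws ArgumentException if a duplicate sneaks in
        Next
        Return index
    End Function

    ' Returns the array position of search_string, or -1 if it is absent
    ' (the same convention IndexOf uses).
    Function FindIndex(index As Dictionary(Of String, Integer), search_string As String) As Integer
        Dim position As Integer
        If index.TryGetValue(search_string, position) Then
            Return position
        End If
        Return -1
    End Function

    Sub Main()
        ' Hypothetical sample data standing in for the strings read from file(s).
        Dim array_of_strings As String() = {"alpha", "beta", "the_string_search_for"}
        Dim index As Dictionary(Of String, Integer) = BuildIndex(array_of_strings)

        Console.WriteLine(FindIndex(index, "the_string_search_for")) ' prints 2
        Console.WriteLine(FindIndex(index, "missing"))               ' prints -1
    End Sub

End Module
```

The dictionary is filled once after the files are loaded; each of the millions of later lookups is then a single hash probe on average instead of a scan of the whole array. If the array changes afterwards, the dictionary has to be updated or rebuilt to stay in sync.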

+5

No, I don’t think this could benefit from LINQ. LINQ queries tend to be slow. However, you could try multithreading.

0

You can try using a DataTable, which seems very fast:

    // Written for LINQPad: Dump() is a LINQPad extension method.
    // Needs System.Data, System.Diagnostics and System.Linq.
    void Main()
    {
        var dt = new DataTable();
        dt.Columns.Add("foo", typeof(string));
        dt.Columns.Add("bar", typeof(string));
        dt.Columns[0].Unique = true;

        dt.Rows.Add("baz", "baaz");
        dt.Rows.Add("qux", "quux");

        // add one million rows
        for (var i = 0; i < 1000000; i++)
        {
            dt.Rows.Add((i * 2).ToString(), i);
        }

        var sw = Stopwatch.StartNew();

        // select some arbitrary value
        var results = dt.Select("foo = '513916'");

        // get its index
        dt.Rows.IndexOf(results.First()).Dump("Row index");

        sw.Stop();
        sw.Dump("Elapsed");
    }


(I cannot translate it to VB.)

0
