Why is the Skip() function in LINQ not optimized?

var res = new int[1000000].Skip(999999).First(); 

It would be great if this query simply used the indexer instead of iterating over 999,999 elements.

I looked at System.Core.dll and noticed that, unlike Skip(), the Count() extension method is optimized: if the IEnumerable also implements ICollection, it just returns the Count property.
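For illustration, the kind of fast path described above looks roughly like this. It is a sketch of the idea, not the actual System.Core source, and FastCount is a hypothetical name chosen so it does not clash with Enumerable.Count:

```csharp
using System;
using System.Collections.Generic;

public static class CountSketch
{
    // Hypothetical helper: illustrates the ICollection<T> fast path only.
    public static int FastCount<T>(this IEnumerable<T> source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));

        // Fast path: a collection already knows its size, so no iteration is needed.
        if (source is ICollection<T> collection)
            return collection.Count;

        // Slow path: walk the whole sequence and count the elements.
        int count = 0;
        using (var e = source.GetEnumerator())
        {
            while (e.MoveNext())
                checked { count++; }
        }
        return count;
    }
}
```

Because the check and the result both happen inside one eager call, the sequence has no opportunity to change between the two.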

2 answers

If you look at my answer to a similar question, it seems it would be fairly easy to provide a non-naive Skip optimization (i.e. one that still throws the correct exceptions) for any IList:

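    public static IEnumerable<T> Skip<T>(this IList<T> source, int count)
    {
        // Enumerable.Skip treats a negative count as zero; do the same here.
        if (count < 0) count = 0;

        // The enumerator is advanced only so that it throws if the list is
        // modified during iteration; the elements themselves come from the indexer.
        using (var e = source.GetEnumerator())
            while (count < source.Count && e.MoveNext())
                yield return source[count++];
    }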

Of course, your example uses an array. Since arrays do not throw exceptions during iteration, even something as elaborate as my function would be unnecessary there. So we can conclude that MS did not optimize it either because they did not think of it, or because they did not consider it a common enough case to be worth optimizing.
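If all you need is the single element after the skipped ones, as in the question, you can sidestep the issue with an eager helper. The following is only a sketch: FirstAfterSkip is a hypothetical name, not part of LINQ. For the exact query in the question, the built-in ElementAt(999999) or plain indexing of the array gives the same result.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SkipWorkaroundSketch
{
    // Hypothetical helper: eager equivalent of source.Skip(count).First().
    public static T FirstAfterSkip<T>(this IEnumerable<T> source, int count)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (count < 0) count = 0; // Skip treats a negative count as zero

        // Fast path: lists and arrays can be indexed directly.
        if (source is IList<T> list)
        {
            if (count >= list.Count)
                throw new InvalidOperationException("Sequence contains no elements");
            return list[count];
        }

        // Fallback: ordinary LINQ for everything else.
        return source.Skip(count).First();
    }
}
```

With this in scope, new int[1000000].FirstAfterSkip(999999) reads one element through the indexer instead of walking the array.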


I will let Jon Skeet answer this:

If our sequence is a list, we can just skip straight to the right part of it and yield the items one at a time. That sounds great, but what if the list changes (or is even truncated!) while we are iterating over it? An implementation working with a simple iterator would usually throw an exception at that point, because the change would invalidate the iterator. This is definitely a behavioral change. When I first wrote about Skip, I included this as a "possible" optimization and actually included it in the Edulinq source code. I now regard that as a mistake, and have removed it completely.

...

The problem with both of these "optimizations" is arguably that they apply list-based optimizations within an iterator block used for deferred execution. Optimizing for lists either upfront, at the point of the initial method call, or within an immediate execution operator (Count, ToList, etc.) is fine, because we assume the sequence will not change during the execution of the method. We cannot make that assumption within an iterator block, because the flow of the code is very different: our code is visited repeatedly, based on the caller's use of MoveNext().

https://msmvps.com/blogs/jon_skeet/archive/2011/01/26/reimplementing-linq-to-objects-part-40-optimization.aspx
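To make the behavioral change from the quote concrete, here is a small sketch that compares two hand-written skip implementations. Both SkipByEnumerator and SkipByIndex are hypothetical names written for this illustration. Neither is the real Enumerable.Skip, whose behavior may differ between framework versions.

```csharp
using System;
using System.Collections.Generic;

public static class SkipBehaviorSketch
{
    // Skips by advancing the source's own enumerator.
    public static IEnumerable<T> SkipByEnumerator<T>(IEnumerable<T> source, int count)
    {
        using (var e = source.GetEnumerator())
        {
            while (count > 0 && e.MoveNext()) count--;
            if (count <= 0)
                while (e.MoveNext()) yield return e.Current;
        }
    }

    // "Optimized" skip: jumps straight to the index and never touches an enumerator.
    public static IEnumerable<T> SkipByIndex<T>(IList<T> source, int count)
    {
        for (int i = Math.Max(count, 0); i < source.Count; i++)
            yield return source[i];
    }

    // Returns true if truncating the list in the middle of iteration throws.
    public static bool ThrowsWhenListChanges(Func<List<int>, IEnumerable<int>> skip)
    {
        var list = new List<int> { 1, 2, 3, 4, 5 };
        try
        {
            foreach (var item in skip(list))
                list.RemoveAt(list.Count - 1); // truncate while iterating
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }
}
```

Passing l => SkipBehaviorSketch.SkipByEnumerator(l, 2) to ThrowsWhenListChanges returns true, because the List<T> enumerator detects the modification on its next MoveNext(). Passing l => SkipBehaviorSketch.SkipByIndex(l, 2) returns false: the index-based version silently carries on with the truncated list, which is exactly the difference the quote warns about.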
