How Enumerable.OrderBy uses keySelector

Using the current code, I managed to create a set of numbers and shuffle the position of the elements in the array:

var randomNumbers = Enumerable.Range(0, 100) .OrderBy(x => Guid.NewGuid()); 

Everything works fine, but in his work he tried to throw a wrench, trying to understand Enumerable.OrderBy . Take the following code, for example:

 var pupils = new[] { new Person() { Name = "Alex", Age = 17 }, new Person() { Name = "Jack", Age = 21 } }; var query = pupils.OrderBy(x => x.Age); 

I understand that I am passing the property that I want to sort, and I believe that LINQ will use Comparer<T>.Default to determine the order of the collection if no explicit IComparer is specified for the second overload. I really don't understand how any of these reasonable logics can be applied to shuffle an array in this way. So how does LINQ allow me to shuffle such an array?

+6
source share
3 answers

How does Enumerable.OrderBy use keySelector?

Enumerable.OrderBy<T> lazily returns - keySelector is not called directly. The result is an IOrderedEnumerable<T> , which will do the ordering on the enumeration.

When enumerating, keySelector is called once for each element. The order of the keys defines the new order of elements.

Here's a great sample implementation .


So how does LINQ allow me to shuffle such an array?

 var randomNumbers = Enumerable .Range(0, 100) .OrderBy(x => Guid.NewGuid()); 

Guid.NewGuid is called for each item. A call to the second item may generate a value above or below the call to the first item.

randomNumbers is an IOrderedEnumerable<int> that produces a different order with each enumeration. KeySelector is called once for each element each time randomNumbers listed.

+5
source

You are very close to understanding how such a shuffle works. In your second case

 pupils.OrderBy(x => x.Age); 

Comparer<int>.Default (faces are sorted by their Age , simple).

In your first case, Comparer<Guid>.Default .

How it works?

Each time you do Guid.NewGuid() (presumably), a different / original / non-duplicated Guid . Now when you do

 var randomNumbers = Enumerable.Range(0, 100).OrderBy(x => Guid.NewGuid()); 

numbers are sorted based on the created Guides.

Now what are guides?

They are 128-bit integers represented in hexadecimal. Since 2 ^ 128 is such a large number, the possibilities of generating two Guides are very rare / almost impossible. Because the Guides exhibit some kind of randomness, the ordering will also be random.

How do two guides compare in order?

You can confirm this based on a trivial experiment. At:

 var guids = Enumerable.Range(0, 10).Select((x, i) => { Guid guid = Guid.NewGuid(); return new { Guid = guid, NumberRepresentation = new BigInteger(guid.ToByteArray()), OriginalIndex = i }; }).ToArray(); var guidsOrderedByTheirNumberRepresentation = guids.OrderBy(x => x.NumberRepresentation).ToArray(); var guidsOrderedAsString = guids.OrderBy(x => x.Guid.ToString()).ToArray(); var randomNumbers = Enumerable.Range(0, 10).OrderBy(x => guids[x].Guid).ToArray(); //print randomNumbers.SequenceEqual(guidsOrderedByTheirNumberRepresentation.Select(x => x.OriginalIndex)) => false //print randomNumbers.SequenceEqual(guidsOrderedAsString.Select(x => x.OriginalIndex)) => true 

So, Comparer<Guid>.Default based on the string representation of guid.


Besides

You must use Fisher-Yates when shuffling speed. May be

 public static IEnumerable<T> Shuffle<T>(this IList<T> lst) { Random rnd = new Random(); for (int i = lst.Count - 1; i >= 0; i--) { int j = rnd.Next(i + 1); yield return lst[j]; lst[j] = lst[i]; } } 

Or for brevity it could be simple (which could be even faster than the Guid approach)

 public static IEnumerable<T> Shuffle<T>(this IEnumerable<T> lst) { Random rnd = new Random(); return lst.OrderBy(x => rnd.Next()); } 
+3
source

So how does it work?

The following query uses Comparer<Guid>.Default for comparison.

  .OrderBy(x => Guid.NewGuid()) 

Since each generated GUID is almost unique (since you generate OrderBy in the expression itself), you assume that you are getting a random order (which is a misunderstanding). If you run the query again, you will again see the (supposedly) shuffled result when a new set of GUIDs is created.

If you use the predefined GUIDs, you will see the order.

Example randomNumbers1 and randomNumbers2 have the same values โ€‹โ€‹below.

 var randomGuids = Enumerable.Range(0,10).Select (x => Guid.NewGuid()).ToArray(); var randomNumbers1 = Enumerable.Range(0, 10).OrderBy(x => randomGuids[x]); var randomNumbers2 = Enumerable.Range(0, 10).OrderBy(x => randomGuids[x]); 

I really don't understand how any of these reasonable logics can be used to shuffle an array this way.

You can shuffle because there is no order between the elements (the GUID in your example). If you use ordered elements, you get ordered output instead of shuffled.

+2
source

All Articles