Why does this method lead to an infinite loop?

One of my colleagues came to me with a question about this method, which leads to an infinite loop. The actual code is too confusing to post here, but essentially the problem boils down to the following:

private IEnumerable<int> GoNuts(IEnumerable<int> items) { items = items.Select(item => items.First(i => i == item)); return items; } 

This should (you think) just be a very inefficient way to create a copy of the list. I called it with:

 var foo = GoNuts(new[]{1,2,3,4,5,6}); 

The result is an endless loop. It’s strange.

I think the parameter change is stylistically bad, so I changed the code a bit:

 var foo = items.Select(item => items.First(i => i == item)); return foo; 

It worked. That is, the program is completed; not an exception.

Other experiments have shown that this also works:

 items = items.Select(item => items.First(i => i == item)).ToList(); return items; 

As simple

 return items.Select(item => .....); 

Curious.

It is clear that the problem is related to the reassignment of the parameter, but only if the evaluation is postponed beyond this statement. If I add ToList() , it will work.

I have a general, vague idea of ​​what is going wrong. It seems like Select is iterating over its own result. This is a bit strange in itself, because usually IEnumerable will throw if its collection iteration changes.

What I do not understand, because I am not very familiar with the internal functions of how this material works, is explained by why re-assigning the parameter causes this infinite loop.

Does anyone have more knowledge about the insides that would like to explain why there is an endless cycle?

+64
c # stack-overflow linq infinite-loop
Aug 13 '15 at 16:13
source share
3 answers

The key to answering this is deferred execution. When you do it

 items = items.Select(item => items.First(i => i == item)); 

you do not iterate over the items array passed to the method. Instead, you assign it a new IEnumerable<int> , which refers to itself and starts the iteration only when the caller begins to list the results.

This is why all your other fixes have raised the issue: all you have to do is stop feeding IEnumerable<int> back:

  • Using var foo breaks up self-esteem with another variable,
  • Using return items.Select... breaks self-esteem without using intermediate variables at all,
  • Using ToList() breaks self-consistency, avoiding delayed execution: by the time items reassigned to the old items , the iteration is complete, so you get the usual List<int> memory.

But if he eats by himself, how does he get anything at all?

That's right, nothing works! The moment you try to iterate through items and request it for the first item, the pending sequence requests the sequence that was submitted to it for the first item to process, which means the sequence requests itself for the first item to process. At the moment, he has completely omitted the turtle , because in order to return the first element for processing the sequence, you first need to get the first element for processing from yourself.

+63
Aug 13 '15 at 16:19
source share

It seems like Select iterates over its own output

You're right. You return a query that iterates on its own.

The key is that you reference items in lambda. The items link is not allowed ("closed") until the request is repeated, at which point items now refers to the request instead of the original collection. This is where self-esteem takes place.

Draw a deck of cards with a sign in front of it labeled items . Now depict a person standing next to a deck of cards whose purpose is to repeat a collection called items . But then you move the sign from the deck to the person. When you ask a person for the first "item" - he is looking for a collection with the inscription "items" - which is now his! Therefore, he asks himself about the first element in which the circular link occurs.

When you assign the result to a new variable, you then receive a query that iterates over another collection, and therefore does not lead to an infinite loop.

When you call ToList , you remove the request in a new collection, and also do not get an infinite loop.

Other things that would break the round link:

  • Moisturizing elements in lambda causing ToList
  • Assign items another variable and reference it in lambda.
+20
Aug 13 '15 at 16:17
source share

After studying the two answers that were given and a little bored, I came up with a small program that better illustrates the problem.

  private int GetFirst(IEnumerable<int> items, int foo) { Console.WriteLine("GetFirst {0}", foo); var rslt = items.First(i => i == foo); Console.WriteLine("GetFirst returns {0}", rslt); return rslt; } private IEnumerable<int> GoNuts(IEnumerable<int> items) { items = items.Select(item => { Console.WriteLine("Select item = {0}", item); return GetFirst(items, item); }); return items; } 

If you call this with:

 var newList = GoNuts(new[]{1, 2, 3, 4, 5, 6}); 

You will receive this output several times until you get a StackOverflowException .

 Select item = 1 GetFirst 1 Select item = 1 GetFirst 1 Select item = 1 GetFirst 1 ... 

What this shows is exactly what dasblinkenlight made clear in its updated answer: the request goes into an infinite loop, trying to get the first element.

Let's write GoNuts little differently:

  private IEnumerable<int> GoNuts(IEnumerable<int> items) { var originalItems = items; items = items.Select(item => { Console.WriteLine("Select item = {0}", item); return GetFirst(originalItems, item); }); return items; } 

If you run this, it will be successful. What for? Since in this case it is clear that the GetFirst call passes a reference to the source elements that were passed to the method. In the first case, GetFirst passes a link to a new collection of items that is not yet implemented. In turn, GetFirst says: "Hey, I need to list this collection." And so the first recursive call begins, which ultimately leads to a StackOverflowException .

Interestingly, I was right and wrong when I said that he was consuming his own result. Select consumes the original input, as you would expect. First tries to use the output.

There are many lessons here. For me, the most important thing is "do not change the value of the input parameters."

Thanks to dasblinkenlight, D Stanley and Lucas Trzesniewski for their help.

+5
Aug 13 '15 at 18:53
source share



All Articles