Why is Stream.filter running out of memory?

These two expressions must mean the same thing:

Stream.from(1).filter(_ < 0).head Stream.from(1).find(_ < 0) 

The line should loop until Int.MinValue returns. And this is exactly what the version does with filter , but with find a OutOfMemoryError . However, looking at their implementations, I cannot understand that both versions do not raise an OutOfMemoryError .

Here is the implementation of Stream.filter :

 override def filter(p: A => Boolean): Stream[A] = { // optimization: drop leading prefix of elems for which f returns false // var rest = this dropWhile (!p(_)) - forget DRY principle - GC can't collect otherwise var rest = this while (!rest.isEmpty && !p(rest.head)) rest = rest.tail // private utility func to avoid `this` on stack (would be needed for the lazy arg) if (rest.nonEmpty) Stream.filteredTail(rest, p) else Stream.Empty } 

find inherits from LinearSeqOptimized , with this definition:

 override /*IterableLike*/ def find(p: A => Boolean): Option[A] = { var these = this while (!these.isEmpty) { if (p(these.head)) return Some(these.head) these = these.tail } None } 

Both of them have a while loop that discards Stream elements that do not satisfy the predicate. Since this must maintain a reference to the beginning of the Stream , all of these created elements must accumulate in memory until we finish the space. If I really don't understand what is happening here, Stream.filter somehow removes this from its stack frame before it enters the while loop. The comment in Stream.filter about why dropWhile not used looks like a hint, but I have no idea what this refers to.

The next step will be to learn to parse and read the JVM bytecode, but I really hope someone knows what is going on here.

+6
source share
1 answer

This is a combination of the HotSpot features and the Scala path.

If I disable HotSpot with -Xint , Stream.filter will also die with OutOfMemoryException . In the generated bytecode, this and the rest and these variables are stored in different memory cells, but since this is only used to initialize these variables, I believe that HotSpot is smart enough to just reuse the memory location for this . This explains why Stream.filter not exhausted.

Optimization of HotSpot to Stream.filter should also be applied to LinearSeqOptimized.find , however, due to the way the signs are implemented, a link to this maintained. When a method is implemented inside a trait, Scala compiles this method into a static method. When a class inherits this trait, Scala creates a small stub method that calls the static method. Thus, although HotSpot optimizes the static method for LinearSeqOptimized.find , the stack stack of the stub method still has a link to this .

+3
source

All Articles