First, I am not an expert, but I experimented a bit, and here is what I found:
I compiled the code with scalac's -print option (the JavaDecompiler did not work for some reason), which prints the program with Scala-specific features removed. There I saw:
test.this.anRDD().filter({ (new anonymous class anonfun$1(): Function1) }).flatMap({ (new anonymous class anonfun$2(): Function1) }, ClassTag.apply(classOf[scala.Tuple2]));
Notice the filter call. I therefore looked at anonfun$1:
    public final boolean apply(Tuple2<String, List<Object>> check$ifrefutable$1) {
      Tuple2 localTuple2 = check$ifrefutable$1;
      boolean bool;
      if (localTuple2 != null) {
        bool = true;
      } else {
        bool = false;
      }
      return bool;
    }
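So, putting it all together, it seems the filter comes from the for comprehension itself: the pattern on the left of <- is turned into a predicate that drops every element that does not match the Tuple2 pattern (in the decompiled code above, this reduces to a null check).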
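To make that concrete, here is a minimal sketch of what that predicate amounts to, written by hand and applied to a plain List. The names RefutabilityCheck, Row, and matchesPattern are mine, not compiler output, and the null element is only there to exercise the false branch.

```scala
object RefutabilityCheck {
  // Hypothetical alias for the element type used in the answer
  type Row = (String, List[Int])

  // Hand-written stand-in for anonfun$1: true if the element matches
  // the tuple pattern, false otherwise (a tuple pattern never matches null)
  val matchesPattern: Row => Boolean = {
    case (_, _) => true
    case _      => false
  }

  // Sample data (an assumption for illustration), with one null element
  val rows: List[Row] = List(("a", List(1, 2, 3)), null, ("b", List(4)))

  // Only the elements that match the pattern survive the filter
  val kept: List[Row] = rows.filter(matchesPattern)

  def main(args: Array[String]): Unit = {
    println(kept)
  }
}
```

The compiler may warn that the second case is only reachable for null, which is exactly the point of the check.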
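withFilter is preferable to filter, most likely because it is lazy: it does not build an intermediate collection, it only restricts the elements seen by the following map, flatMap, or foreach. You can see the compiler use it by decompiling the same comprehension over a regular List instead of an RDD: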
    object test {
      val regList = List(
        ("a", List(1, 2, 3)),
        ("b", List(4)),
        ("c", List(5, 6))
      )

      val foo = for {
        (someString, listOfInts) <- regList
        someInt <- listOfInts
      } yield (someString, someInt)
    }
This decompiles to:
test.this.regList().withFilter({ (new anonymous class anonfun$1(): Function1) }).flatMap({ (new anonymous class anonfun$2(): Function1) }, immutable.this.List.canBuildFrom()).$asInstanceOf[List]();
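So the translation is the same in both cases, except that the compiler uses withFilter where it is available. RDD does not define withFilter, so there the compiler falls back to filter.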
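For comparison, here is a rough hand-written version of that translation for the List case. It is only a sketch of the shape shown in the -print output, not the exact code the compiler emits, and the names DesugarSketch, viaFor, and byHand are made up for this example.

```scala
object DesugarSketch {
  val regList = List(
    ("a", List(1, 2, 3)),
    ("b", List(4)),
    ("c", List(5, 6))
  )

  // The for comprehension from the answer
  val viaFor = for {
    (someString, listOfInts) <- regList
    someInt <- listOfInts
  } yield (someString, someInt)

  // Approximate manual translation: the pattern check becomes withFilter,
  // the outer generator becomes flatMap, the inner one becomes map
  val byHand = regList
    .withFilter {
      case (_, _) => true
      case _      => false
    }
    .flatMap { case (someString, listOfInts) =>
      listOfInts.map(someInt => (someString, someInt))
    }

  def main(args: Array[String]): Unit = {
    // Both versions produce the same pairs
    assert(viaFor == byHand)
    println(byHand) // List((a,1), (a,2), (a,3), (b,4), (c,5), (c,6))
  }
}
```

Swapping withFilter for filter in byHand gives the same result on a List; the difference is only that filter builds the intermediate list first.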