I am trying to write a function that works with RDD[Seq[String]] objects, for example:
def foo(rdd: RDD[Seq[String]]) = { println("hi") }
This function cannot be called with objects of type RDD[Array[String]]:
val testRdd: RDD[Array[String]] = sc.textFile("somefile").map(_.split("\\|", -1))
foo(testRdd)

error: type mismatch;
 found   : org.apache.spark.rdd.RDD[Array[String]]
 required: org.apache.spark.rdd.RDD[Seq[String]]
I assume this is because RDD is not covariant.
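As a side note, the mismatch is actually twofold: Array[String] is not a subtype of Seq[String] in Scala (it is only implicitly convertible), and even for a genuine subtype such as List[String], an invariant type constructor rejects the call. This can be reproduced without Spark using a hypothetical invariant Box class standing in for RDD (Box and the values below are illustrative names, not Spark API):

```scala
// Box is invariant in T, just as RDD is.
case class Box[T](value: T)

def foo(box: Box[Seq[String]]): Unit = println("hi")

val listBox: Box[List[String]] = Box(List("a", "b"))
// foo(listBox)  // does not compile: Box[List[String]] is not a Box[Seq[String]]

// Widening the element type at construction makes the call compile:
val seqBox: Box[Seq[String]] = Box(List("a", "b"))
foo(seqBox) // prints "hi"
```
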
I tried a bunch of foo definitions to get around this. The closest I came to was:
def foo2[T[String] <: Seq[String]](rdd: RDD[T[String]]) = { println("hi") }
But it still fails:

foo2(testRdd)

<console>:101: error: inferred type arguments [Array] do not conform to method foo2 type parameter bounds [T[String] <: Seq[String]]
       foo2(testRdd)
       ^
<console>:101: error: type mismatch;
 found   : org.apache.spark.rdd.RDD[Array[String]]
 required: org.apache.spark.rdd.RDD[T[String]]
Any idea how I can get around this? All this happens in the Spark shell.
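One possible workaround (a sketch, not necessarily the only approach) is to keep foo's signature as RDD[Seq[String]] and widen at the call site, converting each Array to a Seq before calling, i.e. foo(testRdd.map(_.toSeq)) in Spark. The same idea without a Spark cluster, again with a hypothetical invariant Box standing in for RDD:

```scala
// Box stands in for the invariant RDD (illustrative, not Spark API).
case class Box[T](value: T)

def foo(box: Box[Seq[String]]): Int = { println("hi"); box.value.length }

val arr: Array[String] = "a|b|c".split("\\|")

// Widen at the call site: .toSeq converts the Array to a Seq,
// so the Box is constructed as Box[Seq[String]] directly.
val widened: Box[Seq[String]] = Box(arr.toSeq)
println(foo(widened)) // prints "hi" then 3
```

An alternative is to ask for implicit conversion evidence, e.g. def foo2[T](rdd: RDD[T])(implicit ev: T => Seq[String]), which the standard library satisfies for arrays via Predef's array wrapping; the .toSeq approach above is the simpler of the two.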
user3666020