In Spark SQL 1.2.0, queries returned a JavaRDD; in Spark SQL 1.3.0 they return a DataFrame. Converting a DataFrame to a JavaRDD with DataFrame.toJavaRDD seems to take quite a while, so I tried DataFrame.map() instead and ran into a puzzling problem:
    DataFrame df = sqlSC.sql(sql);
    RDD<String> rdd = df.map(new AbstractFunction1<Row, String>() {
        @Override
        public String apply(Row t1) {
            return t1.getString(0);
        }
    }, ?);
"?" should be scala.reflect.ClassTag. I used ClassManifestFactory.fromClass (String.class) and it did not work. What should I put on "?".
By the way, the Java example in the "Interoperating with RDDs" section of http://spark.apache.org/docs/1.3.0/sql-programming-guide.html appears to be broken: it uses "map(new Function() {", but "Function" is not accepted here; it has to be "Function1".