After scratching my head for some time over a PySpark job stuck at "No tasks have started", I isolated the problem to the following:
Works:
    ssc = HiveContext(sc)
    sqlRdd = ssc.sql(someSql)
    sqlRdd.collect()
Adding repartition() makes it hang at "No tasks have started":
    ssc = HiveContext(sc)
    sqlRdd = ssc.sql(someSql).repartition(2)
    sqlRdd.collect()
This is on Spark 1.2.0 as bundled with CDH 5.
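For reference, here is a minimal end-to-end script reproducing both cases. The table name and query are placeholders (any Hive table that returns rows should do), and it assumes a Hive-enabled Spark 1.2.0 build:

    # Minimal repro sketch; "some_table" is a placeholder Hive table.
    from pyspark import SparkContext
    from pyspark.sql import HiveContext

    sc = SparkContext(appName="repartition-repro")
    ssc = HiveContext(sc)
    someSql = "SELECT * FROM some_table"

    # Case 1: collecting the SQL result directly works.
    print(ssc.sql(someSql).collect())

    # Case 2: adding repartition() hangs at "No tasks have started".
    print(ssc.sql(someSql).repartition(2).collect())
    sc.stop()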