Spark: java.io.IOException: No space left on device

I'm learning how to use Spark. I have a piece of code that inverts a matrix, and it works when the order of the matrix is small, say 100. But when the order of the matrix is large, say 2000, I get an exception like this:

15/05/10 20:31:00 ERROR DiskBlockObjectWriter: Uncaught exception while reverting partial writes to file /tmp/spark-local-20150510200122-effa/28/temp_shuffle_6ba230c3-afed-489b-87aa-91c046cadb22

java.io.IOException: No space left on device

In my program, I have many lines like this:

val result1=matrix.map(...).reduce(...)
val result2=result1.map(...).reduce(...)
val result3=matrix.map(...)

(Sorry, there are too many such lines in the code to list them all here.)

So I think that each of these statements makes Spark create new RDDs, and that my program creates too many of them, which causes the exception. I'm not sure whether my reasoning is correct.

How can I remove the RDDs I no longer need, such as result1 and result2?

I tried rdd.unpersist(), but it didn't help.

2 Answers

By default, Spark writes its shuffle temp files to /tmp. If the partition holding /tmp is small, it fills up and you get exactly this error. You can change the directory Spark uses for scratch space.

For example, add the following to spark-env.sh:

SPARK_JAVA_OPTS+=" -Dspark.local.dir=/mnt/spark,/mnt2/spark -Dhadoop.tmp.dir=/mnt/ephemeral-hdfs"

export SPARK_JAVA_OPTS
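
If you construct the SparkContext yourself, the same property can be set programmatically. A minimal sketch, assuming Scala and placeholder paths (/mnt/spark, /mnt2/spark); note that on cluster managers such as YARN the cluster's own local-dir settings take precedence over this property:

import org.apache.spark.{SparkConf, SparkContext}

// Point Spark's scratch space (shuffle files, disk spills) at disks
// with enough free space. The paths here are placeholders.
val conf = new SparkConf()
  .setAppName("matrix-inversion")
  .set("spark.local.dir", "/mnt/spark,/mnt2/spark")
val sc = new SparkContext(conf)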

Judging by the error message, the problem is not how many RDDs you create. RDD transformations are cheap by themselves, but the wide operations in your pipeline (the shuffles behind your map/reduce chains) write intermediate data to local disk, and that is what filled it up.

Free up space under /tmp, or move Spark's scratch directory to a larger disk, as the other answer describes.
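
As for removing result1 and result2: rdd.unpersist() only drops blocks that were explicitly cached with persist()/cache(), which is probably why it seemed to do nothing; it does not delete shuffle files. Those are only reclaimed when the RDD becomes unreachable and Spark's ContextCleaner garbage-collects it, or when the application exits. A minimal sketch of the usual pattern, assuming Scala (the sample computation is illustrative, not the question's actual matrix code):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

val sc = new SparkContext(
  new SparkConf().setAppName("cleanup-sketch").setMaster("local[*]"))

val matrix = sc.parallelize(1L to 1000000L)

// Persist only what is reused, so unpersist() has something to free.
val intermediate = matrix.map(x => x * x).persist(StorageLevel.MEMORY_AND_DISK)
val result1 = intermediate.reduce(_ + _)

// Frees the cached blocks; shuffle files on disk are not touched here.
intermediate.unpersist(blocking = true)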

