I am currently running a Java Spark application on tomcat and get the following exception:
Caused by: java.io.IOException: Mkdirs failed to create file:/opt/folder/tmp/file.json/_temporary/0/_temporary/attempt_201603031703_0001_m_000000_5
on the line
text.saveAsTextFile("/opt/folder/tmp/file.json") //where text is a JavaRDD<String>
The problem is that / opt / folder / tmp / already exists and successfully creates before /opt/folder/tmp/file.json/_temporary/0/, and then it comes across what looks like a permission problem with the rest of the path _temporary/attempt_201603031703_0001_m_000000_5 , but I gave the permissions of the tomcat user ( chown -R tomcat:tomcat tmp/ and chmod -R 755 tmp/ ) to the tmp / directory. Does anyone know what could happen?
thanks
Edit for @javadba:
[ root@ip tmp]
Exception in catalina.out
Caused by: java.io.IOException: Mkdirs failed to create file:/opt/folder/tmp/file.json/_temporary/0/_temporary/attempt_201603072001_0001_m_000000_5 at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:438) at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:424) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:906) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:799) at org.apache.hadoop.mapred.TextOutputFormat.getRecordWriter(TextOutputFormat.java:123) at org.apache.spark.SparkHadoopWriter.open(SparkHadoopWriter.scala:91) at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1193) at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1185) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66) at org.apache.spark.scheduler.Task.run(Task.scala:89) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ... 1 more
java tomcat apache-spark spark-dataframe
DeeVu
source share