Randomly missing shuffle files

I am getting random occurrences of shuffle files that appear not to have been written in my Spark job.

15/12/29 17:30:26 ERROR server.TransportRequestHandler: Error sending result ChunkFetchSuccess{streamChunkId=StreamChunkId{streamId=347837678000, chunkIndex=0}, buffer=FileSegmentManagedBuffer{file=/data/24/hadoop/yarn/local/usercache/root/appcache/application_1451416375261_0032/blockmgr-c2e951bb-856d-487f-a5be-2b3194fdfba6/1a/shuffle_0_35_0.data, offset=1088736267, length=8082368}} to /10.7.230.74:42318; closing connection
java.io.FileNotFoundException: /data/24/hadoop/yarn/local/usercache/root/appcache/application_1451416375261_0032/blockmgr-c2e951bb-856d-487f-a5be-2b3194fdfba6/1a/shuffle_0_35_0.data (No such file or directory)
    at java.io.FileInputStream.open0(Native Method)
    ...

It seems that most of the shuffle files were written successfully, but not all of them.

This happens during the shuffle read phase.

At first, all executors can read the files. Eventually, and inevitably, one of the executors throws the exception above and dies. The others then start to fail because they cannot fetch those shuffle files from it.

I have 40 GB of RAM on each executor, and I have 8 executors. The listing above is from the remote executor, after the failure. My data is large, but I don't see any out-of-memory problems.
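For context, the resource sizing amounts to something like the following; this is only a minimal sketch, and the application name and job body are placeholders, not my real code:

    // Sketch of the executor sizing described above (names are placeholders).
    import org.apache.spark.{SparkConf, SparkContext}

    object ShuffleJobSketch {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf()
          .setAppName("shuffle-job-sketch")        // hypothetical app name
          .set("spark.executor.instances", "8")    // 8 executors, as described
          .set("spark.executor.memory", "40g")     // 40 GB RAM per executor
        val sc = new SparkContext(conf)
        // ... actual job elided ...
        sc.stop()
      }
    }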

Any thoughts?


I have changed my repartition call from 1,000 partitions to 100,000 partitions, and now I get a new stack trace:

Job aborted due to stage failure: Task 71 in stage 9.0 failed 4 times, most recent failure: Lost task 71.3 in stage 9.0 (TID 2831, dev-node1): java.io.IOException: FAILED_TO_UNCOMPRESS(5)
    at org.xerial.snappy.SnappyNative.throw_error(SnappyNative.java:84)
    at org.xerial.snappy.SnappyNative.rawUncompress(Native Method)
    at org.xerial.snappy.Snappy.rawUncompress(Snappy.java:444)
    at org.xerial.snappy.Snappy.uncompress(Snappy.java:480)
    at org.xerial.snappy.SnappyInputStream.readFully(SnappyInputStream.java:135)
    at org.xerial.snappy.SnappyInputStream.readHeader(SnappyInputStream.java:92)
    at org.xerial.snappy.SnappyInputStream.<init>(SnappyInputStream.java:58)
    at org.apache.spark.io.SnappyCompressionCodec.compressedInputStream(CompressionCodec.scala:159)
    at org.apache.spark.storage.BlockManager.wrapForCompression(BlockManager.scala:1179)
    at org.apache.spark.shuffle.hash.HashShuffleReader$$anonfun$3.apply(HashShuffleReader.scala:53)
    at org.apache.spark.shuffle.hash.HashShuffleReader$$anonfun$3.apply(HashShuffleReader.scala:52)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
    at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:32)
    at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
    at org.apache.spark.sql.execution.UnsafeExternalRowSorter.sort(UnsafeExternalRowSorter.java:173)
    at org.apache.spark.sql.execution.TungstenSort.org$apache$spark$sql$execution$TungstenSort$$executePartition$1(sort.scala:160)
    at org.apache.spark.sql.execution.TungstenSort$$anonfun$doExecute$4.apply(sort.scala:169)
    at org.apache.spark.sql.execution.TungstenSort$$anonfun$doExecute$4.apply(sort.scala:169)
    at org.apache.spark.rdd.MapPartitionsWithPreparationRDD.compute(MapPartitionsWithPreparationRDD.scala:64)
    ...
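For reference, the change amounts to something like this; `records` is a placeholder for the actual RDD being shuffled, and only the partition counts come from my job:

    // Sketch of the repartition change described above.
    val before = records.repartition(1000)    // original call: 1,000 partitions
    val after  = records.repartition(100000)  // new call: 100,000 partitions

Either call forces a full shuffle of `records`, so both stack traces above occur while the shuffle blocks produced by that stage are being fetched.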
yarn apache-spark
