A common problem seems to be a strange "cannot cast" exception from BytesWritable to NullWritable. Another common problem is BytesWritable's getBytes, which is badly misleading: it does not return just your bytes. What getBytes actually returns is the backing buffer, i.e. your bytes with a ton of zeros added at the end! Only the first getLength bytes are valid, so you must use copyBytes instead:
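    import org.apache.hadoop.io.{BytesWritable, NullWritable}
    import org.apache.spark.rdd.RDD

    // sc is the SparkContext; codecOpt is an Option of a compression codec class (may be None)
    val rdd: RDD[Array[Byte]] = ???

    // To write
    rdd.map(bytesArray => (NullWritable.get(), new BytesWritable(bytesArray)))
      .saveAsSequenceFile("/output/path", codecOpt)

    // To read
    val rdd: RDD[Array[Byte]] = sc.sequenceFile[NullWritable, BytesWritable]("/input/path")
      .map(_._2.copyBytes())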
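To see the getBytes padding described above without running a Spark job, here is a minimal sketch that uses only Hadoop's BytesWritable. The object name GetBytesVsCopyBytes and the three-byte payload are made up for illustration. It assumes that filling a Writable through set() mimics how a record reader reuses the same instance. How much padding you get depends on Hadoop's internal buffer growth, so the sketch prints the lengths rather than promising specific values.

```scala
import java.util.Arrays
import org.apache.hadoop.io.BytesWritable

// Hypothetical demo object; not part of the original answer.
object GetBytesVsCopyBytes {
  val payload: Array[Byte] = Array[Byte](1, 2, 3)

  // Fill an empty Writable via set(), as a reused instance would be filled.
  // set() may grow the backing buffer beyond the payload size.
  val reused = new BytesWritable()
  reused.set(payload, 0, payload.length)

  // getBytes exposes the whole backing buffer; only the first getLength bytes are valid.
  val backing: Array[Byte] = reused.getBytes
  val validLength: Int = reused.getLength

  // copyBytes returns a fresh array holding exactly the valid bytes.
  val exact: Array[Byte] = reused.copyBytes()

  // Manual equivalent: trim the backing buffer to the valid length.
  val trimmed: Array[Byte] = Arrays.copyOfRange(backing, 0, validLength)

  def main(args: Array[String]): Unit = {
    println(s"payload length:        ${payload.length}")
    println(s"getLength:             $validLength")
    println(s"getBytes array length: ${backing.length}")
    println(s"copyBytes length:      ${exact.length}")
  }
}
```

If copyBytes is not an option, trimming getBytes to getLength as in the trimmed value should give the same result.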
samthebest