Spark RDD saveAsTextFile with gzip compression

Is it possible to save a Spark RDD as gzip-compressed text files?

Can I change a call such as combPrdGrp3.repartition(10).saveAsTextFile("Combined") so that the output part files are written as gzip?

apache-spark
1 answer

Use the saveAsTextFile overload that takes a compression codec class:

    import org.apache.hadoop.io.compress.GzipCodec

    combPrdGrp3.repartition(10).saveAsTextFile("Combined", classOf[GzipCodec])

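Or set the codec on the Hadoop configuration, so that a plain saveAsTextFile("Combined") call compresses its output. The codec setting is only consulted when output compression is enabled, so set FileOutputFormat.COMPRESS as well. Here FileOutputFormat is org.apache.hadoop.mapreduce.lib.output.FileOutputFormat and CompressionCodec is org.apache.hadoop.io.compress.CompressionCodec:

    sc.hadoopConfiguration.set(FileOutputFormat.COMPRESS, "true")
    sc.hadoopConfiguration.setClass(FileOutputFormat.COMPRESS_CODEC, classOf[GzipCodec], classOf[CompressionCodec])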
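For reference, here is a minimal end-to-end sketch of the first option (passing the codec class to saveAsTextFile), assuming Spark is on the classpath and running in local mode. The object name GzipSaveSketch, the helper roundTrip, the app name, the sample data and the temporary output path are all made up for illustration; only saveAsTextFile, GzipCodec and textFile come from Spark and Hadoop.

```scala
import java.nio.file.Files

import org.apache.hadoop.io.compress.GzipCodec
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example object; not part of the original question or answer.
object GzipSaveSketch {

  // Writes `lines` as gzip-compressed part files under `dir`,
  // then reads them back and returns the lines in sorted order.
  def roundTrip(sc: SparkContext, lines: Seq[String], dir: String): Seq[String] = {
    val rdd = sc.parallelize(lines, numSlices = 2)
    // Each partition becomes one part-NNNNN.gz file inside `dir`.
    rdd.saveAsTextFile(dir, classOf[GzipCodec])
    // textFile picks a decompression codec from the .gz file extension.
    sc.textFile(dir).collect().toSeq.sorted
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("gzip-save-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // saveAsTextFile fails if the target directory already exists,
      // so write to a child path that has not been created yet.
      val out = Files.createTempDirectory("gzip-sketch").resolve("Combined").toString
      val back = roundTrip(sc, Seq("a", "b", "c"), out)
      println(back.mkString(","))
    } finally {
      sc.stop()
    }
  }
}
```

Gzip files are not splittable, so each .gz part file is read back by a single task. The repartition(10) in the question therefore also fixes how many tasks a later read of that output can use.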