Spark automatically treats a directory of part files as a single set of files, so you can read them in one call:
val data = sc.textFile("/path/to/my/file")
Then just add a header and write it back out:
val header = sc.parallelize(Seq("...header..."))
val withHeader = header ++ data
withHeader.saveAsTextFile("/path/to/my/modified-file")
Note that because this has to read and write all of the data, it will be quite a bit slower than you might intuitively expect. (After all, you are only adding one new line!) For this reason, it may be better not to add the header at all and instead store the metadata (the list of columns) separately from the data.
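As a rough illustration of keeping the column list apart from the data, here is a minimal sketch in plain Scala that writes the column names to a small sidecar file and reads them back. The object name SidecarSchema, its two helpers, and the column names are all hypothetical, and the sketch assumes a local filesystem path; on HDFS you would need a different way to write the file.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}

// Hypothetical helper: stores column names in a small text file,
// one name per line, separate from the data files themselves.
object SidecarSchema {
  // Write the column names, in order, to the given path.
  def writeColumns(path: Path, columns: Seq[String]): Unit = {
    val bytes = columns.mkString("\n").getBytes(StandardCharsets.UTF_8)
    Files.write(path, bytes)
  }

  // Read the column names back in the same order, skipping empty lines.
  def readColumns(path: Path): Seq[String] = {
    val text = new String(Files.readAllBytes(path), StandardCharsets.UTF_8)
    text.split("\n").toSeq.filter(_.nonEmpty)
  }
}
```

With something like this, the data files stay untouched, and writing the metadata costs only a few bytes instead of a full pass over the dataset.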