How can I merge all the files in a directory on HDFS, which I know are all compressed, into a single compressed file without copying the data through the local machine? Pig would be one option, but the solution does not have to use it.
For example, I have a folder /data/input containing the files part-m-00000.gz and part-m-00001.gz. I want to merge them into a single file /data/output/foo.gz.
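One property that seems relevant here: the gzip format allows several compressed members to sit back to back in one file, so joining the .gz files byte for byte should already give a valid .gz file, with no decompression or recompression. The sketch below only checks that property locally with Python's standard gzip module. It does not touch HDFS, and the helper name concat_gzip_members and the sample payloads are made up for illustration.

```python
import gzip


def concat_gzip_members(parts):
    """Join already-compressed gzip payloads byte for byte, without recompressing."""
    return b"".join(parts)


# Hypothetical stand-ins for the contents of part-m-00000.gz and part-m-00001.gz.
part0 = gzip.compress(b"line from part 0\n")
part1 = gzip.compress(b"line from part 1\n")

# The merged bytes are what a file like foo.gz would contain.
merged = concat_gzip_members([part0, part1])

# Decompressing the merged bytes yields the members' contents in order.
restored = gzip.decompress(merged)
print(restored.decode())
```

If that holds, any tool that can append the raw bytes of the input files to one output file inside the cluster would be enough. Which tool can do that without routing the bytes through the client is the open part of the question.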