I am trying to write a Pig script to compact small files that hold data in Parquet format. The lines below load the small files from a directory and then store them again. The files have complex nested structures that are nullable and contain a lot of NULLs.
LOGS = LOAD '/dt=20150307/hr=2015030700/*' USING parquet.pig.ParquetLoader();
STORE LOGS INTO '/user/compaction_output' USING parquet.pig.ParquetStorer();
I get the following error:
2015-04-29 17:00:45,883 [main] ERROR org.apache.pig.tools.pigstats.SimplePigStats - ERROR 2118: Cannot build an empty group
My suspicion is that this is due to null values in the input files. Can anyone help?