How to use the header (first line) as field names in Pig

Given a CSV file whose first line can be treated as a header, how can I dynamically load the field names in Pig from that header? For example:

id,year,total
1,1999,190
2,1998,20

a = LOAD '/path/to/file.csv' USING PigStorage(',') AS --use first row as field names
> describe a;
> id:bytearray,year:bytearray,total:bytearray
1 answer

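Since this is a CSV file and you want to treat the first line as a header, you can use piggybank's CSVLoader() for it. The field names still come from the AS clause rather than from the file, and as far as I know CSVLoader does not skip the header row by itself, so you may need to filter that row out (or use piggybank's CSVExcelStorage with its SKIP_INPUT_HEADER option). Your script will look like this: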

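-- Register the piggybank jar (adjust the path to where the jar lives)
REGISTER piggybank.jar;
-- Short alias for the piggybank CSV loader
DEFINE CSVLoader org.apache.pig.piggybank.storage.CSVLoader();

-- Field names and types are declared explicitly in the AS clause
A = LOAD '/path/to/file.csv' USING CSVLoader AS (id:int, year:chararray, total:int);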
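
If you really need the names to come from the file itself, one option is to generate the LOAD statement outside Pig before running the script. Below is a minimal Python sketch of that idea, not a definitive implementation: the helper name pig_load_statement is made up for illustration, every field is declared as chararray because no type inference is attempted, and the header is assumed to contain valid Pig identifiers.

```python
import csv
import io


def pig_load_statement(csv_text, path, alias="A"):
    """Build a Pig LOAD statement whose AS clause uses the CSV header names.

    Hypothetical helper for illustration only; it is not part of Pig.
    """
    # Read only the first row; the csv module copes with quoted header cells.
    header = next(csv.reader(io.StringIO(csv_text)))
    # Assumption: no type inference, so every field is declared as chararray.
    fields = ", ".join(f"{name.strip()}:chararray" for name in header)
    return f"{alias} = LOAD '{path}' USING CSVLoader AS ({fields});"


# Sample data taken from the question.
sample = "id,year,total\n1,1999,190\n2,1998,20\n"
statement = pig_load_statement(sample, "/path/to/file.csv")
print(statement)
```

The generated line can be pasted into the script after the REGISTER and DEFINE statements, or the schema part can be passed in through Pig's parameter substitution.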