How to store grouped records in multiple files using Pig?

After loading and grouping the records, how can I store these grouped records in several files, one per group (= userid)?

records = LOAD 'input' AS (userid:int, ...);
grouped_records = GROUP records BY userid;

I am using Apache Pig version 0.8.1-cdh3u3 (rexported)

2 answers
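REGISTER piggybank.jar;  -- assumption: adjust the path to your piggybank.jar
A = LOAD 'mydata' USING PigStorage() AS (a, b, c);
STORE A INTO '/my/home/output' USING org.apache.pig.piggybank.storage.MultiStorage('/my/home/output', '0', 'bz2', '\\t');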

Parameters:

  • parentPathStr - parent output directory path
  • splitFieldIndex - index of the key field to split on ('0' is the first field)
  • compression - 'bz2', 'bz', 'gz' or 'none'
  • fieldDel - field separator for the output records

Link: GrepCode


You can use MultiStorage from Piggybank and split on the userid field (index "0"):

STORE records INTO 'output' USING org.apache.pig.piggybank.storage.MultiStorage('output', '0', 'none', ',');
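
For completeness, here is a rough end-to-end sketch of the same approach. The jar path and the `payload` field are assumptions for illustration only: the question elides every field after userid, and the location of piggybank.jar depends on your installation. Note that it stores `records` directly, so the GROUP step is not needed for the split.

```pig
-- Assumption: the path to piggybank.jar varies by installation.
REGISTER /path/to/piggybank.jar;

-- Hypothetical schema: only userid is given in the question.
records = LOAD 'input' USING PigStorage(',') AS (userid:int, payload:chararray);

-- MultiStorage splits on field 0 (userid) when storing,
-- so the ungrouped relation is stored as-is.
STORE records INTO 'output'
    USING org.apache.pig.piggybank.storage.MultiStorage('output', '0', 'none', ',');
```

If I remember the Piggybank documentation correctly, each distinct userid then gets its own subdirectory under 'output' holding that user's part files, but it is worth checking the layout against your Pig version.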