Projecting Grouped Tuples in Pig

I have a set of tuples of the form (t, a, b) that I want to group by b in Pig. After grouping, I want to drop b from the tuples in each group and produce a bag of the remaining (t, a) tuples for each group.

As an example, suppose we have (1,2,1) (2,0,1) (3,4,2) (4,1,2) (5,2,3)

The Pig script should produce {(1,2), (2,0)} {(3,4), (4,1)} {(5,2)}

How can I achieve this result? I'm used to seeing examples where an aggregation follows a GROUP. It's less clear to me how to drop a field from the grouped tuples and return them in a bag. Thank you for your help!

1 answer

It turns out that what I was looking for is the syntax for nested projection in Pig.

If you have tuples of the form (t, a, b) in a relation named tups and want to discard b after grouping, you can do it as follows. Note that the bag inside each group carries the name of the grouped relation (tups).

grouped = GROUP tups BY b;
result = FOREACH grouped GENERATE tups.(t,a);

See the Nested Projection section on the PigLatin page. http://wiki.apache.org/pig/PigLatin
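
For completeness, here is a minimal end-to-end sketch of the same idea. The input file name 'tuples.csv', the comma delimiter, and the int field types are assumptions made for illustration; they are not part of the original question.

```pig
-- Hypothetical input: a comma-separated file with one (t, a, b) row per line,
-- e.g. "1,2,1". The path and the int types are assumptions.
tups = LOAD 'tuples.csv' USING PigStorage(',') AS (t:int, a:int, b:int);

-- GROUP yields one row per distinct b: (group, tups: {(t, a, b)}).
grouped = GROUP tups BY b;

-- Nested projection: keep only t and a from every tuple in the bag.
result = FOREACH grouped GENERATE tups.(t, a);

DUMP result;
```

With the sample data from the question, each output row should hold one bag of (t, a) pairs per value of b. Pig does not guarantee the order of tuples inside a bag, so the pairs may appear in a different order than shown in the question.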
