IMHO, there is no trivial way to do this (nothing like COUNT(DISTINCT a) in MySQL), so you need to split the table into two relations, compute one distinct count per key in each, and then join the results.
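x = LOAD 'testdata' using PigStorage('^A') as (a,b,c,d);
-- project the key together with each value whose distinct occurrences are counted
w1 = FOREACH x GENERATE a, CONCAT(b,c) AS bc;
w2 = FOREACH x GENERATE a, d;
-- drop duplicate (key, value) pairs
v1 = DISTINCT w1;
v2 = DISTINCT w2;
-- group by key and count the remaining values
u1 = GROUP v1 BY a;
u2 = GROUP v2 BY a;
t1 = FOREACH u1 GENERATE group AS a, COUNT(v1.bc);
t2 = FOREACH u2 GENERATE group AS a, COUNT(v2.d);
-- combine both counts per key
s = JOIN t1 BY a, t2 BY a;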
A UDF can greatly simplify this.
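As a rough sketch of what such a UDF might look like, here is a distinct-count function written for Pig's Jython UDF support. The function name `distinct_count`, the file name `distinct_udf.py`, and the alias `myudf` are made up for illustration. It assumes the fields are declared as chararray so that the tuples are hashable, and that Pig supplies the `outputSchema` decorator when the script is registered (the fallback only exists so the file also runs in plain Python).

```python
# Hypothetical Pig UDF (Jython): count the distinct tuples in a bag.
# Assumption: Pig provides `outputSchema` when the script is registered;
# the fallback below is a no-op so the file can be tested outside Pig.
try:
    outputSchema
except NameError:
    def outputSchema(schema):
        return lambda func: func


@outputSchema("n:long")
def distinct_count(bag):
    """Return the number of distinct tuples in a bag (a list of tuples)."""
    if bag is None:
        return 0
    # A set keeps one copy of each tuple, so its size is the distinct count.
    return len(set(tuple(t) for t in bag))


# Possible use from Pig (names are illustrative, not tested here):
#   REGISTER 'distinct_udf.py' USING jython AS myudf;
#   g = GROUP x BY a;
#   r = FOREACH g GENERATE group AS a,
#           myudf.distinct_count(x.(b,c)), myudf.distinct_count(x.d);
```

Note that this counts distinct (b,c) pairs rather than distinct concatenated strings, which can differ slightly from the CONCAT version (for example, 'ab'+'c' and 'a'+'bc' concatenate to the same string but are different pairs).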