Say I have timestamped values for specific users in text files, for example:
#userid; unix-timestamp; value
1; 2010-01-01 00:00:00; 10
2; 2010-01-01 00:00:00; 20
1; 2010-01-01 01:00:00; 11
2; 2010-01-01 01:00:00; 21
1; 2010-01-02 00:00:00; 12
2; 2010-01-02 00:00:00; 22
I have a custom class "SessionSummary" that implements the readFields and write methods of WritableComparable. Its purpose is to sum up all the values for each user for each calendar day.
So the mapper maps the lines to each user, and the reducer sums all the values per day per user and outputs a SessionSummary through TextOutputFormat (using SessionSummary's toString, which produces delimiter-separated UTF-8 lines):
1; 2010-01-01; 21
2; 2010-01-01; 41
1; 2010-01-02; 12
2; 2010-01-02; 22
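To make the intended aggregation concrete, here is a minimal plain-Java sketch of the per-day, per-user sum, with no Hadoop involved. The class name `DailySumSketch` and the method `summarize` are made up for illustration, and it assumes every data line uses `;` as the field separator and a `yyyy-MM-dd HH:mm:ss` timestamp as in the sample above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Hadoop or of the original job.
class DailySumSketch {
    // Sums values per (calendar day, user) from lines like "1; 2010-01-01 00:00:00; 10".
    static List<String> summarize(List<String> lines) {
        // Key is "day; user" so entries come out ordered by day, then by user (as strings).
        Map<String, Long> sums = new TreeMap<>();
        for (String line : lines) {
            if (line.trim().isEmpty() || line.startsWith("#")) {
                continue; // skip blank lines and the header comment
            }
            String[] fields = line.split(";");
            String user = fields[0].trim();
            String day = fields[1].trim().substring(0, 10); // keep only "yyyy-MM-dd"
            long value = Long.parseLong(fields[2].trim());
            sums.merge(day + "; " + user, value, Long::sum);
        }
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Long> e : sums.entrySet()) {
            String[] key = e.getKey().split("; ");
            // Emit in the same "userid; day; sum" layout as the summary lines.
            out.add(key[1] + "; " + key[0] + "; " + e.getValue());
        }
        return out;
    }
}
```

In the real job the grouping is done by the shuffle phase rather than by a map in memory; the sketch only shows which fields form the key and what gets summed.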
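If I want to read these summary lines back in another Map/Reduce job, can I reuse the readFields and write methods (of the WritableComparable implementation) by somehow using the text String as a DataInput? Something like this: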
public void map(...) {
    SessionSummary ssw = new SessionSummary();
    ssw.readFields(new DataInputStream(new ByteArrayInputStream(value.getBytes("UTF-8"))));
}
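For comparison, here is a rough sketch of one way to keep the binary methods and the text form separate. The field layout (`userId`, `day`, `sum`), the class name `SessionSummarySketch`, and the `fromText` helper are all assumptions, since the real SessionSummary is not shown. To stay runnable without Hadoop on the classpath, the class does not declare `implements WritableComparable`; it only mirrors the `write(DataOutput)` and `readFields(DataInput)` signatures.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for SessionSummary; the field layout is assumed.
class SessionSummarySketch {
    int userId;
    String day;
    long sum;

    // Binary form: what write/readFields exchange between M/R stages.
    public void write(DataOutput out) throws IOException {
        out.writeInt(userId);
        out.writeUTF(day);
        out.writeLong(sum);
    }

    public void readFields(DataInput in) throws IOException {
        userId = in.readInt();
        day = in.readUTF();
        sum = in.readLong();
    }

    // Text form: what TextOutputFormat writes via toString().
    @Override
    public String toString() {
        return userId + "; " + day + "; " + sum;
    }

    // Inverse of toString(): parses one "userid; day; sum" line.
    static SessionSummarySketch fromText(String line) {
        String[] fields = line.split(";");
        if (fields.length != 3) {
            throw new IllegalArgumentException("expected 3 fields: " + line);
        }
        SessionSummarySketch s = new SessionSummarySketch();
        s.userId = Integer.parseInt(fields[0].trim());
        s.day = fields[1].trim();
        s.sum = Long.parseLong(fields[2].trim());
        return s;
    }

    // Serializes with write() and restores with readFields().
    static SessionSummarySketch binaryRoundTrip(SessionSummarySketch s) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            s.write(new DataOutputStream(buf));
            SessionSummarySketch copy = new SessionSummarySketch();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a layout like this, readFields expects the bytes produced by write, so handing it the UTF-8 bytes of a text line would be misread as binary fields. Under these assumptions, a mapper reading the text output would call something like `SessionSummarySketch.fromText(value.toString())` instead.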
In general: is there a recommended way to do this in Hadoop M/R?
(Hadoop version is 0.20.2/CDH3u3)