How can I access the Mapper / Reducer counters at the output stage?

I have several counters that I created in my Mapper class:

(example written using appengine-mapreduce v.0.5 Java library)

    @Override
    public void map(Entity entity) {
        getContext().incrementCounter("analyzed");
        if (isSpecial(entity)) {
            getContext().incrementCounter("special");
        }
    }

(The isSpecial method simply returns true or false depending on the state of the entity.)

I want to read these counters once the whole job has finished processing, in the finish method of the Output class:

    @Override
    public Summary finish(Collection<? extends OutputWriter<Entity>> writers) {
        // get the counters and save/return the summary
        int analyzed = 0; // getCounter("analyzed");
        int special = 0; // getCounter("special");
        Summary summary = new Summary(analyzed, special);
        save(summary);
        return summary;
    }

... but the getCounter method is only available from the MapperContext class, which is only reachable through the Mapper's / Reducer's getContext() method.

How can I access my counters at the output stage?

Side note: I cannot pass the counter values along with my processed objects, because the whole Map / Reduce converts one set of objects into another set (in other words, the counters are not the main purpose of the Map / Reduce). The counters are only for monitoring; I compute them here rather than creating another process just to do the counting.

Thanks.

1 answer

Today there is no way to do this from inside the Output class. But feel free to request it here: https://code.google.com/p/appengine-mapreduce/issues/list

However, you can chain a job that runs after your MapReduce completes and receives its output and counters. Here is an example of this: https://code.google.com/p/appengine-mapreduce/source/browse/trunk/java/example/src/com/google/appengine/demos/mapreduce/entitycount/ChainedMapReduceJob.java

The example above runs 3 MapReduce jobs in a row. Note that the chained job does not have to be a MapReduce job; you can create your own class that extends Job and has a run method that creates your Summary object.
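To make the idea concrete, here is a minimal, self-contained sketch of the pattern: a follow-up step that receives the finished job's result and builds the Summary from its counters. It deliberately does not use the appengine-mapreduce or Pipeline classes. JobResult, its getCounter method, and the Summary fields are hypothetical stand-ins, so check the linked ChainedMapReduceJob example for the actual types and method names.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Plain-Java model of the "chained job" idea. None of these types come from
// appengine-mapreduce; they are hypothetical stand-ins for illustration only.
class ChainedSummarySketch {

    /** Hypothetical stand-in for what a finished MapReduce hands to the next job. */
    static final class JobResult<R> {
        final R output;
        final Map<String, Long> counters;

        JobResult(R output, Map<String, Long> counters) {
            this.output = output;
            this.counters = Collections.unmodifiableMap(new HashMap<>(counters));
        }

        /** Returns the counter value, or 0 if the counter was never incremented. */
        long getCounter(String name) {
            return counters.getOrDefault(name, 0L);
        }
    }

    /** Mirrors the Summary class from the question; the fields are assumed. */
    static final class Summary {
        final long analyzed;
        final long special;

        Summary(long analyzed, long special) {
            this.analyzed = analyzed;
            this.special = special;
        }
    }

    /** The follow-up step: runs after the MapReduce and reads its counters. */
    static Summary summarize(JobResult<?> result) {
        return new Summary(result.getCounter("analyzed"), result.getCounter("special"));
    }

    public static void main(String[] args) {
        // Pretend these counters were collected by the mappers.
        Map<String, Long> counters = new HashMap<>();
        counters.put("analyzed", 120L);
        counters.put("special", 7L);

        Summary summary = summarize(new JobResult<>("entities written", counters));
        System.out.println(summary.analyzed + " analyzed, " + summary.special + " special");
    }
}
```

In the real library, the summarize step would be the body of your Job subclass's run method, and saving the Summary would happen there instead of in Output.finish.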
