Difference between combiner and in-mapper combiner in MapReduce?

I am new to Hadoop and MapReduce. Can someone clarify the difference between the combiner and the in-mapper combiner, or are they the same thing?

1 answer

You may already know that a combiner is a process that runs locally on each Mapper to pre-aggregate data before it is shuffled across the network to the various reducers in the cluster.
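To make that pre-aggregation concrete, here is a minimal, Hadoop-free sketch in plain Java (the class and method names are illustrative, not part of any Hadoop API): the combiner step collapses a mapper's repeated keys locally, so far fewer records cross the network during the shuffle.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CombinerSketch {
    // Local pre-aggregation: count occurrences per key on a single
    // mapper's output, just as a combiner would before the shuffle.
    static Map<String, Integer> combine(List<String> mapperOutputKeys) {
        Map<String, Integer> combined = new LinkedHashMap<>();
        for (String key : mapperOutputKeys) {
            combined.merge(key, 1, Integer::sum);
        }
        return combined;
    }

    public static void main(String[] args) {
        // One mapper emits 6 records; after combining, only 3 are shuffled.
        List<String> emitted = List.of("a", "b", "a", "c", "a", "b");
        Map<String, Integer> combined = combine(emitted);
        System.out.println(combined);        // {a=3, b=2, c=1}
        System.out.println(combined.size()); // 3 records instead of 6
    }
}
```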

The in-mapper combiner takes this optimization a step further: the aggregates are never even written to local disk; they are held in memory inside the Mapper itself.

The in-mapper combiner does this using the setup() and cleanup() methods of org.apache.hadoop.mapreduce.Mapper:

private Map<LongWritable, Text> inmemMap;

protected void setup(Mapper.Context context) throws IOException, InterruptedException {
    inmemMap = new HashMap<LongWritable, Text>();
}

In your map() method, instead of calling context.write() for every record, you update this in-memory map. Then, at the end of the task's lifecycle, Map/Reduce calls cleanup():

protected void cleanup(Mapper.Context context) throws IOException, InterruptedException {
    for (LongWritable key : inmemMap.keySet()) {
        // do some aggregation on the values held in the in-memory map
        Text myAggregatedText = doAggregation(inmemMap.get(key));
        context.write(key, myAggregatedText);
    }
}

Notice that context.write() is no longer called once per input record; instead, cleanup() calls context.write() once per aggregated key, at the end of the map task's lifecycle. The result is (much) less data written to disk and shuffled over the network.

The trade-off is memory: the per-mapper aggregates must fit in RAM, so in-mapper combining only works when the number of distinct keys seen by a single mapper is bounded.
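Putting the pieces together, here is a self-contained sketch of the in-mapper combining lifecycle in plain Java (no Hadoop dependency; the Context stand-in and the word-count aggregation are assumptions for illustration only): setup() allocates the map, map() only updates memory, and cleanup() emits one aggregated pair per key.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class InMapperCombinerSketch {
    // Stand-in for Hadoop's Mapper.Context: records what gets written out.
    static class Context {
        final List<Map.Entry<String, Integer>> written = new ArrayList<>();
        void write(String key, Integer value) {
            written.add(Map.entry(key, value));
        }
    }

    private Map<String, Integer> inmemMap;

    // setup(): called once before any input; allocate the in-memory map.
    void setup(Context context) {
        inmemMap = new HashMap<>();
    }

    // map(): called once per input record; update memory, write nothing.
    void map(String word, Context context) {
        inmemMap.merge(word, 1, Integer::sum);
    }

    // cleanup(): called once at the end; emit one aggregated pair per key.
    void cleanup(Context context) {
        for (Map.Entry<String, Integer> e : inmemMap.entrySet()) {
            context.write(e.getKey(), e.getValue());
        }
    }

    public static void main(String[] args) {
        InMapperCombinerSketch mapper = new InMapperCombinerSketch();
        Context ctx = new Context();
        mapper.setup(ctx);
        for (String w : new String[] {"a", "b", "a", "a"}) {
            mapper.map(w, ctx);
        }
        mapper.cleanup(ctx);
        System.out.println(ctx.written.size()); // 2 pairs written, not 4
    }
}
```

Four input records produce only two written pairs, because all of the per-record work happened in memory.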

