There are 3 steps in the βDecreaseβ phase:
1) copy (data for gearboxes)
2) sort (or, more precisely, merging)
3) reduce (execution of the Reduce () function).
Gearboxes can start copying data from Mapper when this Mapper completes its execution.
By default, schedulers wait until 5% of the map tasks in the task are completed, planning reduces tasks for the same job. For large tasks, this can cause problems with the use of the cluster, as they take up a reduction in slots, waiting for the map tasks to complete. Setting mapred.reduce.slowstart.completed.maps to a larger value, such as 0.80 (80%), can help increase throughput.
source share