Do map tasks have to finish before reduce tasks start?

My university lecturer said that in Hadoop, reduce operations can only begin after all map operations have completed.

This contrasts with the output of a MapReduce job, which sometimes shows:

map 80% reduce 13%
map 80% reduce 27%

and then

map 100% reduce 27%
.
.
map 100% reduce 100%

(I have a three-node MapReduce cluster at home, and I have completed several streaming jobs on it.)

What does this output mean, assuming my lecturer knows what he is talking about? What state is the job in when the reduce has started but the map has not finished?

1 answer

There are 3 steps in the "Reduce" phase:

1) copy (fetching the map output data to the reducers)

2) sort (or, more precisely, merge)

3) reduce (running the reduce() function).

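Reducers can start copying data from a mapper as soon as that mapper completes its execution. The copy step can therefore overlap with the map phase, which is why the reduce percentage rises while the map is still at 80%. The reduce() function itself runs only after all map tasks have completed, so your lecturer is right about that step.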
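To make the progress lines from the question easier to read, here is a minimal Python sketch. It assumes that the three steps each account for roughly one third of the reported reduce percentage (copy up to about 33%, sort up to about 67%, reduce beyond that). The function names reduce_phase and explain are hypothetical helpers for illustration, not part of Hadoop.

```python
def reduce_phase(reduce_pct):
    """Guess which reduce-side step a reported reduce percentage reflects.

    Assumption: copy, sort and reduce each contribute one third of the figure.
    """
    if not 0 <= reduce_pct <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if reduce_pct <= 100 / 3:
        return "copy"
    if reduce_pct <= 200 / 3:
        return "sort"
    return "reduce"


def explain(map_pct, reduce_pct):
    """Describe one progress line such as 'map 80% reduce 27%'."""
    phase = reduce_phase(reduce_pct)
    return f"map {map_pct}% reduce {reduce_pct}%: reducers are in the {phase} step"


if __name__ == "__main__":
    # The progress values quoted in the question.
    for map_pct, reduce_pct in [(80, 13), (80, 27), (100, 27), (100, 100)]:
        print(explain(map_pct, reduce_pct))
```

Under that assumption, every line in the question with the map below 100% shows a reduce figure under 33%, which is consistent with the reducers only copying map output at that point.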

By default, the scheduler waits until 5% of the map tasks in a job have completed before scheduling reduce tasks for the same job. For large jobs, this can cause problems with cluster utilization, because the reduce tasks occupy reduce slots while waiting for the map tasks to complete. Setting mapred.reduce.slowstart.completed.maps to a higher value, such as 0.80 (80%), can help improve throughput.
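As a rough illustration of what the setting means for a given job, the sketch below works out how many map tasks would have to complete before reduce tasks are scheduled. It assumes the threshold is the fraction times the number of map tasks, rounded up. The helper names maps_before_reduce and slowstart_option are hypothetical. The second one only builds a -D generic option of the kind a streaming job accepts on the command line, it does not run anything.

```python
import math
from fractions import Fraction

# Property name taken from the answer above.
SLOWSTART_KEY = "mapred.reduce.slowstart.completed.maps"


def maps_before_reduce(total_maps, slowstart):
    """Number of completed map tasks needed before reducers are scheduled.

    Assumption: threshold = ceil(slowstart * total_maps).
    Fraction avoids binary floating-point surprises in the product.
    """
    if not 0 <= slowstart <= 1:
        raise ValueError("slowstart must be a fraction between 0 and 1")
    return math.ceil(Fraction(str(slowstart)) * total_maps)


def slowstart_option(slowstart):
    """Build the generic -D option that sets the slow-start fraction."""
    return ["-D", f"{SLOWSTART_KEY}={slowstart}"]


if __name__ == "__main__":
    for fraction in (0.05, 0.80):
        needed = maps_before_reduce(100, fraction)
        print(f"slowstart={fraction}: {needed} of 100 maps before reducers start")
    print(" ".join(slowstart_option(0.80)))
```

With 100 map tasks, the default of 0.05 lets reducers take slots after 5 completed maps, while 0.80 holds them back until 80 have completed.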
