I started reading about Big Data and Hadoop, so this question may seem very silly to you.
This is what I know.
Each Mapper processes a small chunk of the input data and produces intermediate key/value output. After that we have a shuffle and sort step.
Now Shuffle = moving intermediate output to the corresponding Reducers, each of which handles a specific key (or set of keys).
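To make sure I understand the phases, here is a toy simulation of them in plain Python (this is just my mental model of word count, not actual Hadoop code):

```python
from collections import defaultdict

def mapper(line):
    # Emit (word, 1) for every word -- the "intermediate output".
    for word in line.split():
        yield (word, 1)

def shuffle_and_sort(mapped):
    # Shuffle: group all values for the same key together, so each
    # reducer sees a key along with every value emitted for it.
    groups = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)
    # Sort: keys are presented to the reducers in sorted order.
    return sorted(groups.items())

def reducer(key, values):
    # Combine all values for one key into a single result.
    return (key, sum(values))

lines = ["big data", "big hadoop"]
mapped = [pair for line in lines for pair in mapper(line)]
result = [reducer(k, vs) for k, vs in shuffle_and_sort(mapped)]
print(result)  # [('big', 2), ('data', 1), ('hadoop', 1)]
```

Is this roughly what happens across the cluster, just distributed over nodes?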
So, can one DataNode run both Mapper and Reducer code, or do we need different DataNodes for each?