I want to understand the best practices for handling exceptions in a Mapper / Reducer.
Option 1: Don't use try / catch at all and let the task fail. MapReduce will retry the task, and if it keeps failing it will eventually be marked as failed. The mapreduce.map.maxattempts / mapreduce.reduce.maxattempts properties play a role here.
Option 2: Catch the exception and use counters in the catch block to record the number of failures. Then, based on some threshold for these errors, either fail the task or simply use the counters to report the number of failed records.
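To make Option 2 concrete, here is a minimal sketch in plain Java with no Hadoop dependency. The class name TolerantRecordProcessor, the maxFailures threshold, and the integer-parsing "work" are all hypothetical stand-ins. In a real Mapper the plain long fields would presumably be Hadoop counters, incremented with something like context.getCounter(MyCounters.BAD_RECORDS).increment(1), where MyCounters is an enum you define yourself.

```java
import java.util.List;

// Hypothetical stand-in for the body of a Mapper's map() method.
// It counts bad records and only fails once a threshold is exceeded.
class TolerantRecordProcessor {
    private final long maxFailures; // assumed threshold, e.g. read from job configuration
    private long failedRecords = 0; // in Hadoop this would be a counter
    private long goodRecords = 0;

    TolerantRecordProcessor(long maxFailures) {
        this.maxFailures = maxFailures;
    }

    // Processes one record; a malformed record is counted instead of failing the task.
    void process(String record) {
        try {
            Integer.parseInt(record.trim()); // placeholder for the real per-record work
            goodRecords++;
        } catch (NumberFormatException e) {
            failedRecords++;
            if (failedRecords > maxFailures) {
                // Rethrowing lets the framework fail (and possibly retry) the task.
                throw new IllegalStateException(
                        "Too many bad records: " + failedRecords, e);
            }
        }
    }

    long getFailedRecords() { return failedRecords; }

    long getGoodRecords() { return goodRecords; }

    public static void main(String[] args) {
        TolerantRecordProcessor processor = new TolerantRecordProcessor(1);
        for (String record : List.of("1", "2", "oops", "4")) {
            processor.process(record);
        }
        // prints "good=3 failed=1"
        System.out.println("good=" + processor.getGoodRecords()
                + " failed=" + processor.getFailedRecords());
    }
}
```

With a threshold of 1, a single bad record is tolerated and only shows up in the count; a second bad record would throw and fail the task, which falls back to the retry behaviour described in Option 1.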
Are there any other common / standard ways of handling exceptions in MapReduce?