"Too many fetch-failures" when using Hive

I am running a Hive query on a 3-node Hadoop cluster, and I get the error "Too many fetch-failures". My Hive query:

  insert overwrite table tablename1 partition(namep)
  select id,name,substring(name,5,2) as namep from tablename2;

That is the query I am trying to run. All I want to do is copy the data from tablename2 into tablename1. Any help is appreciated.

1 answer

This can be caused by various Hadoop configuration problems. Here are a couple to look for in particular:

  • DNS problem: check /etc/hosts
  • Not enough HTTP threads on the map side to serve map output to the reducers.

Some suggested fixes (from the Cloudera troubleshooting slides):

  • mapred.reduce.slowstart.completed.maps = 0.80
  • tasktracker.http.threads = 80
  • mapred.reduce.parallel.copies = sqrt(node count), but in any case >= 10
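
As a rough sketch, the two job-level settings above could be tried from the Hive session before rerunning the query. This assumes the classic MRv1 property names listed above and that your cluster allows per-job overrides. The value 10 for parallel copies is simply the floor from the rule above, since sqrt(3) is below 10 on a 3-node cluster. As far as I know, tasktracker.http.threads is read by the TaskTracker daemon rather than per job, so it would go in mapred-site.xml on each node (followed by a TaskTracker restart) instead of a SET statement.

```sql
-- Assumed job-level overrides, set in the Hive session before the query.
-- Start reducers only after 80% of the map tasks have completed.
SET mapred.reduce.slowstart.completed.maps=0.80;
-- sqrt(3 nodes) is about 1.7, so the floor of 10 applies here.
SET mapred.reduce.parallel.copies=10;

-- The query from the question, unchanged.
insert overwrite table tablename1 partition(namep)
select id,name,substring(name,5,2) as namep from tablename2;
```

If the error persists with these values, the DNS / /etc/hosts check is the next thing to look at, since these settings only affect how the shuffle fetch is scheduled.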


http://www.slideshare.net/cloudera/hadoop-troubleshooting-101-kate-ting-cloudera
