Hive query execution error, return code 3 from MapredLocalTask

I get this error when doing a simple join between two tables. I run the query from the Hive command line. Call the tables a and b: table a is an internal Hive table and b is an external table (stored in Cassandra). Table a has only 1610 rows and table b has ~8 million rows. In actual production scenarios, table a could grow to as many as 100 thousand rows. Below is my join, with table b as the last table in the join.

SELECT a.col1, a.col2, b.col3, b.col4 FROM a JOIN b ON (a.col1 = b.col1 AND a.col2 = b.col2);

Below is the error

Total MapReduce jobs = 1
Execution log at: /tmp/pricadmn/.log
2014-04-09 07:15:36 Starting to launch local task to process map join; maximum memory = 932184064
2014-04-09 07:16:41 Processing rows: 200000 Hashtable size: 199999 Memory usage: 197529208 percentage: 0.212
2014-04-09 07:17:12 Processing rows: 300000 Hashtable size: 299999 Memory usage: 163894528 percentage: 0.176
2014-04-09 07:17:43 Processing rows: 400000 Hashtable size: 399999 Memory usage: 347109936 percentage: 0.372
...
...
...

2014-04-09 07:24:29 Processing rows: 1600000 Hashtable size: 1599999 Memory usage: 714454400 percentage: 0.766
2014-04-09 07:25:03 Processing rows: 1700000 Hashtable size: 1699999 Memory usage: 901427928 percentage: 0.977
Execution failed with exit status: 3
Obtaining error information


Task failed!
Task ID:
Stage-5

Logs:

/u/applic/pricadmn/dse-4.0.1/logs/hive/hive.log
FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask

I am using DSE 4.0.1. Below are some of my settings that may be of interest:

mapred.map.child.java.opts = -Xmx512M
mapred.reduce.child.java.opts = -Xmx512m
mapred.reduce.parallel.copies = 20
hive.auto.convert.join = true

I increased mapred.map.child.java.opts to 1G and got a few more rows processed before it failed again, which does not seem like a good solution. I also changed the order of the tables in the join, but that didn't help. I saw the question "Hive Map join: out of memory exception", but it did not solve my problem.

It looks to me like Hive is trying to fit the larger table into memory during the local task phase, and that is where it fails. In my understanding, the last table in the join (in my case, table b) should be streamed rather than loaded into memory. Correct me if I am wrong. Any help in solving this problem is much appreciated.
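For reference, below is a sketch of the settings I believe govern the automatic map-join conversion and the local hash-table build (the property names are standard Hive settings; the values shown are the documented defaults, not necessarily what my cluster uses):

-- let Hive convert a common join into a map join automatically
set hive.auto.convert.join = true;
-- maximum size (bytes) of a table Hive considers small enough
-- to hash into the local task's memory
set hive.mapjoin.smalltable.filesize = 25000000;
-- fraction of the local task heap the hash table may use before
-- the local task aborts (the abort surfaces as return code 3)
set hive.mapjoin.localtask.max.memory.usage = 0.90;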

+8
hive hiveql datastax-enterprise
3 answers

set hive.auto.convert.join = false;
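Disabling automatic conversion makes Hive fall back to a common (reduce-side) join, so the big table is never hashed into the local task's memory. If you still want a map join with the genuinely small table in memory, a hedged alternative (assuming your Hive version honors the hint, i.e. hive.ignore.mapjoin.hint is set to false) is to name the small table explicitly:

-- request a map join with table a (the small table) as the in-memory side
SELECT /*+ MAPJOIN(a) */ a.col1, a.col2, b.col3, b.col4
FROM a JOIN b ON (a.col1 = b.col1 AND a.col2 = b.col2);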

+25

It looks like your task is running out of memory. Check line 324 of the MapredLocalTask class.

} catch (Throwable e) {
  if (e instanceof OutOfMemoryError
      || (e instanceof HiveException && e.getMessage().equals("RunOutOfMeomoryUsage"))) {
    // Don't create a new object if we are already out of memory
    return 3;
  } else {
+1

The largest table should be the last table in the join; Hive streams the last table instead of buffering it. You can reorder the tables in your join accordingly.
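A hedged alternative to reordering: the STREAMTABLE hint tells Hive explicitly which table to stream through the join instead of buffering it, regardless of its position:

-- stream table b (the big table) rather than buffering it
SELECT /*+ STREAMTABLE(b) */ a.col1, a.col2, b.col3, b.col4
FROM a JOIN b ON (a.col1 = b.col1 AND a.col2 = b.col2);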

-1
