Hive query execution error, return code 3 from MapredLocalTask

I get this error when doing a simple join between two tables. I run the query from the Hive command line. Call the tables a and b: table a is an internal Hive table and b is an external table (stored in Cassandra). Table a has only 1610 rows and table b has ~8 million rows. In actual production scenarios, table a could grow to as many as 100 thousand rows. Below is my join, with table b as the last table in the join.

SELECT a.col1, a.col2, b.col3, b.col4 FROM a JOIN b ON (a.col1 = b.col1 AND a.col2 = b.col2);

Below is the error

Total MapReduce jobs = 1
Execution log at: /tmp/pricadmn/.log
2014-04-09 07:15:36 Starting to launch local task to process map join; maximum memory = 932184064
2014-04-09 07:16:41 Processing rows: 200000 Hashtable size: 199999 Memory usage: 197529208 percentage: 0.212
2014-04-09 07:17:12 Processing rows: 300000 Hashtable size: 299999 Memory usage: 163894528 percentage: 0.176
2014-04-09 07:17:43 Processing rows: 400000 Hashtable size: 399999 Memory usage: 347109936 percentage: 0.372
...
...
...

2014-04-09 07:24:29 Processing rows: 1600000 Hashtable size: 1599999 Memory usage: 714454400 percentage: 0.766
2014-04-09 07:25:03 Processing rows: 1700000 Hashtable size: 1699999 Memory usage: 901427928 percentage: 0.977
Execution failed with exit status: 3
Obtaining error information


Task failed!
Task ID:
Stage-5

Logs:

/u/applic/pricadmn/dse-4.0.1/logs/hive/hive.log
FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask

I am using DSE 4.0.1. Below are some of my settings that may be of interest:

mapred.map.child.java.opts = -Xmx512M
mapred.reduce.child.java.opts = -Xmx512m
mapred.reduce.parallel.copies = 20
hive.auto.convert.join = true

I increased mapred.map.child.java.opts to 1G and got a few more rows processed before it failed again, which does not seem like a good solution. I also changed the order of the tables in the join, but that didn't help. I saw the question "Hive Map join: out of memory exception", but it did not solve my problem.

It looks to me like Hive is trying to fit the larger table into memory during the local task phase, and that is where it fails. In my understanding, the last table in the join (in my case, table b) should be streamed rather than loaded into memory. Correct me if I am wrong. Any help in solving this problem is much appreciated.
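For reference, below is a sketch of the settings I believe govern the automatic map-join conversion and the local hash-table build (the property names are standard Hive settings; the values shown are the documented defaults, not necessarily what my cluster uses):

-- let Hive convert a common join into a map join automatically
set hive.auto.convert.join = true;
-- maximum size (bytes) of a table Hive considers small enough
-- to hash into the local task's memory
set hive.mapjoin.smalltable.filesize = 25000000;
-- fraction of the local task heap the hash table may use before
-- the local task aborts (the abort surfaces as return code 3)
set hive.mapjoin.localtask.max.memory.usage = 0.90;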

+8
hive hiveql datastax-enterprise
3 answers

set hive.auto.convert.join = false;
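Disabling automatic conversion makes Hive fall back to a common (reduce-side) join, so the big table is never hashed into the local task's memory. If you still want a map join with the genuinely small table in memory, a hedged alternative (assuming your Hive version honors the hint, i.e. hive.ignore.mapjoin.hint is set to false) is to name the small table explicitly:

-- request a map join with table a (the small table) as the in-memory side
SELECT /*+ MAPJOIN(a) */ a.col1, a.col2, b.col3, b.col4
FROM a JOIN b ON (a.col1 = b.col1 AND a.col2 = b.col2);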

+25

It looks like your task is running out of memory. Check line 324 of the MapredLocalTask class.

} catch (Throwable e) {
  if (e instanceof OutOfMemoryError
      || (e instanceof HiveException && e.getMessage().equals("RunOutOfMeomoryUsage"))) {
    // Don't create a new object if we are already out of memory
    return 3;
  } else {
+1

The largest table should be the last table in the join; Hive streams the last table instead of buffering it. You can reorder the tables in your join accordingly.
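A hedged alternative to reordering: the STREAMTABLE hint tells Hive explicitly which table to stream through the join instead of buffering it, regardless of its position:

-- stream table b (the big table) rather than buffering it
SELECT /*+ STREAMTABLE(b) */ a.col1, a.col2, b.col3, b.col4
FROM a JOIN b ON (a.col1 = b.col1 AND a.col2 = b.col2);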

-1
