Cannot start MapReduce job on hadoop 2.4.0

I am new to hadoop, and here is my problem. I configured hadoop 2.4.0 with jdk1.7.60 on a cluster of 3 machines, and I can execute all hadoop commands. I then modified the wordcount example and built a jar file from it. This jar ran on hadoop 1.2.1 and produced the expected result, but on hadoop 2.4.0 the job never returns a result.

The command I used to run the job:

$hadoop jar WordCount.jar WordCount /data/webdocs.dat /output 

I get the following output:

 14/06/29 19:35:18 INFO client.RMProxy: Connecting to ResourceManager at /192.168.2.140:8040
 14/06/29 19:35:18 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
 14/06/29 19:35:19 INFO input.FileInputFormat: Total input paths to process : 1
 14/06/29 19:35:19 INFO mapreduce.JobSubmitter: number of splits:12
 14/06/29 19:35:19 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1403905542893_0004
 14/06/29 19:35:19 INFO impl.YarnClientImpl: Submitted application application_1403905542893_0004
 14/06/29 19:35:19 INFO mapreduce.Job: The url to track the job: http://192.168.2.140:8088/proxy/application_1403905542893_0004/
 14/06/29 19:35:19 INFO mapreduce.Job: Running job: job_1403905542893_0004

After this point the output does not change. I waited 15 to 20 minutes, but nothing further was printed.

This is what I see on the resource manager web page regarding the job:

 State - ACCEPTED
 FinalStatus - UNDEFINED
 Progress - (progress bar in 0%)
 Tracking UI - UNASSIGNED
 Apps Submitted - 1
 Apps Pending - 1
 Apps Running - 0

I also tried running the job with the yarn command, but got the same result:

 $yarn jar WordCount.jar WordCount /data/webdocs.dat /output 

Here is the result of jps:

 21485 NameNode
 23142 DataNode
 28504 Jps
 21704 ResourceManager
 22082 JobHistoryServer

Any help or guidance would be greatly appreciated.

java mapreduce hadoop yarn
1 answer

I solved the problem. It was an error in the hadoop configuration file. There was a bind exception on port 8040 for the resourcemanager.

I changed the hadoop yarn-site.xml from (old yarn-site.xml):

 <configuration>
   <!-- Site specific YARN configuration properties -->
   <property>
     <name>yarn.nodemanager.aux-services</name>
     <value>mapreduce_shuffle</value>
   </property>
   <property>
     <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
     <value>org.apache.hadoop.mapred.ShuffleHandler</value>
   </property>
   <property>
     <name>yarn.resourcemanager.resource-tracker.address</name>
     <value>192.168.2.140:8025</value>
   </property>
   <property>
     <name>yarn.resourcemanager.scheduler.address</name>
     <value>192.168.2.140:8030</value>
   </property>
   <property>
     <name>yarn.resourcemanager.address</name>
     <value>192.168.2.140:8040</value>
   </property>
 </configuration>

To (new yarn-site.xml):

 <configuration>
   <property>
     <name>yarn.nodemanager.aux-services</name>
     <value>mapreduce_shuffle</value>
   </property>
   <property>
     <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
     <value>org.apache.hadoop.mapred.ShuffleHandler</value>
   </property>
 </configuration>

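In other words, I deleted the remaining properties from the configuration: the three explicit yarn.resourcemanager.*address entries, including the one that bound the resourcemanager to port 8040. Then I ran the following commands to start the nodemanager and the resourcemanager: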

 $yarn-daemon.sh start nodemanager
 $yarn-daemon.sh start resourcemanager

Then I ran my job again, and it completed successfully.
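
To confirm that the daemons really came up after the restart, the jps output can be checked for the expected process names. Note that the jps listing in the question shows no NodeManager on that machine, and an application stays in the ACCEPTED state as long as no NodeManager anywhere in the cluster has registered to provide containers. The snippet below is a minimal sketch and was not part of the original fix: missing_daemons is a hypothetical helper name (not a Hadoop command), the sample text is simply the jps output from the question, and which daemons belong on which machine depends on your own cluster layout.

```shell
# Hypothetical helper (not a Hadoop command): print each expected daemon
# name that does not appear as a whole word in the given jps output.
missing_daemons() {
  jps_output=$1
  shift
  for daemon in "$@"; do
    if ! printf '%s\n' "$jps_output" | grep -qw -- "$daemon"; then
      printf '%s\n' "$daemon"
    fi
  done
}

# Sample input: the jps output from the question.
sample='21485 NameNode
23142 DataNode
28504 Jps
21704 ResourceManager
22082 JobHistoryServer'

# Prints the daemons missing from the sample.
# On a live node, pass "$(jps)" instead of "$sample".
missing_daemons "$sample" NameNode DataNode ResourceManager NodeManager
```

Against the sample this prints NodeManager. On a running cluster, yarn node -list additionally shows which NodeManagers have registered with the ResourceManager.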
