Why does the wordcount example fail to run?

I am running the wordcount example in a single-node environment on Ubuntu 12.04 in VMware. I run it as follows:

 hadoop@master:~/hadoop$ hadoop jar hadoop-examples-1.0.4.jar wordcount /home/hadoop/gutenberg/ /home/hadoop/gutenberg-output 

I have an input file in the location below:

 /home/hadoop/gutenberg 

and output file location:

  /home/hadoop/gutenberg-output 

When I run the wordcount program, I get the following error:

  13/04/18 06:02:10 INFO mapred.JobClient: Cleaning up the staging area hdfs://localhost:54310/home/hadoop/tmp/mapred/staging/hadoop/.staging/job_201304180554_0001
  13/04/18 06:02:10 ERROR security.UserGroupInformation: PriviledgedActionException as:hadoop cause:org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory /home/hadoop/gutenberg-output already exists
  org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory /home/hadoop/gutenberg-output already exists
      at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:137)
      at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:887)
      at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Subject.java:416)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
      at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
      at org.apache.hadoop.mapreduce.Job.submit(Job.java:500)
      at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:530)
      at org.apache.hadoop.examples.WordCount.main(WordCount.java:67)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.lang.reflect.Method.invoke(Method.java:616)
      at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
      at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
      at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.lang.reflect.Method.invoke(Method.java:616)
      at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

  hadoop@master:~/hadoop$ bin/stop-all.sh
  Warning: $HADOOP_HOME is deprecated.
  stopping jobtracker
  localhost: stopping tasktracker
  stopping namenode
  localhost: stopping datanode
  localhost: stopping secondarynamenode
  hadoop@master:~/hadoop$ 
+4
4 answers

Delete the output file that already exists, or output to another file.

(I'm a little curious about what other interpretations of the error message you considered.)
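The check that produces this exception can be illustrated without a cluster. Below is a plain-Java sketch of the same guard logic (the class name `OutputGuard` and the demo directory are made up for illustration; the real check in `FileOutputFormat.checkOutputSpecs` runs against HDFS, not the local filesystem):

```java
import java.io.File;

public class OutputGuard {
    // Mirrors the spirit of FileOutputFormat.checkOutputSpecs: a job refuses
    // to start when its output directory already exists.
    static void checkOutputSpec(File outputDir) {
        if (outputDir.exists()) {
            throw new IllegalStateException(
                "Output directory " + outputDir.getName() + " already exists");
        }
    }

    public static void main(String[] args) {
        File out = new File(System.getProperty("java.io.tmpdir"), "gutenberg-output-demo");
        out.mkdirs(); // simulate a leftover output directory from a previous run
        try {
            checkOutputSpec(out);
        } catch (IllegalStateException e) {
            System.out.println("conflict: " + e.getMessage());
        }
        out.delete();         // the fix: remove the stale directory before rerunning
        checkOutputSpec(out); // now passes silently
        System.out.println("ok after delete");
    }
}
```

The design choice is deliberate on Hadoop's side: refusing to reuse an existing output directory prevents a rerun from silently overwriting or mixing with earlier results.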

+9

As Dave said (and as the exception itself states), your output directory already exists. You either need to output to a different directory, or delete the existing one first using:

 hadoop fs -rmr /home/hadoop/gutenberg-output 
+2

Check whether a 'tmp' folder exists:

hadoop fs -ls /

If you see the output folder or a "tmp" folder, delete both (with no active jobs running):

 hadoop fs -rmr /tmp 
 hadoop fs -rmr /home/hadoop/gutenberg-output 

+1

If you created your own .jar and are trying to run it, note:

To run your job, you would have written something like this:

 hadoop jar <jar-path> <package-path> <input-in-hdfs-path> <output-in-hdfs-path> 

But if you take a closer look at your driver code, you will see that you set args[0] as your input and args[1] as your output:

 FileInputFormat.addInputPath(conf, new Path(args[0])); 
 FileOutputFormat.setOutputPath(conf, new Path(args[1])); 

But Hadoop takes args[0] as <package-path> instead of <input-in-hdfs-path>, and args[1] as <input-in-hdfs-path> instead of <output-in-hdfs-path>.

So, to make it work, you should use:

 FileInputFormat.addInputPath(conf, new Path(args[1])); 
 FileOutputFormat.setOutputPath(conf, new Path(args[2])); 

With args[1] and args[2] it will pick up the right paths. Hope this helps.
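The argument shift described above can be shown with a tiny plain-Java sketch (no Hadoop dependency; the class name `ArgShiftDemo` and the hard-coded argument array are made up for illustration). It assumes a ProgramDriver-style launcher, where the first command-line token selects the program, so the real paths sit one slot to the right:

```java
public class ArgShiftDemo {
    public static void main(String[] args) {
        // What reaches the driver when the launcher dispatches by program name:
        String[] invocation = {"wordcount", "/home/hadoop/gutenberg", "/home/hadoop/gutenberg-output"};

        // Naive indexing treats the program name as the input path:
        System.out.println("wrong input  = " + invocation[0]);

        // Shifted indexing, as the answer suggests:
        System.out.println("right input  = " + invocation[1]);
        System.out.println("right output = " + invocation[2]);
    }
}
```

Note that this shift only applies when your main class itself consumes a leading selector argument; a jar whose main class is named on the `hadoop jar` command line receives only the paths in args[0] and args[1].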

+1
