MapReduce job giving ClassNotFound exception even though the mapper is present when running with YARN?

Question


I am running a Hadoop job that works fine in pseudo-distributed mode without YARN, but it throws a ClassNotFoundException when I run it with YARN:

16/03/24 01:43:40 INFO mapreduce.Job: Task Id : attempt_1458775953882_0002_m_000003_1, Status : FAILED
Error: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.hadoop.keyword.count.ItemMapper not found
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2195)
    at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapperClass(JobContextImpl.java:186)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:745)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.ClassNotFoundException: Class com.hadoop.keyword.count.ItemMapper not found
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2193)
    ... 8 more

Here is the source code for the job:

Configuration conf = new Configuration();
conf.set("keywords", args[2]);

Job job = Job.getInstance(conf, "item count");
job.setJarByClass(ItemImpl.class);
job.setMapperClass(ItemMapper.class);
job.setReducerClass(ItemReducer.class);

job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
System.exit(job.waitForCompletion(true) ? 0 : 1);

Here is the command I run:

hadoop jar ~/itemcount.jar /user/rohit/tweets /home/rohit/outputs/23mar-yarn13 vodka,wine,whisky

Edit: the code after the suggestions:

package com.hadoop.keyword.count;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.Mapper.Context;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.json.simple.JSONObject;
import org.json.simple.parser.JSONParser;
import org.json.simple.parser.ParseException;

public class ItemImpl {

    public static void main(String[] args) throws Exception {

        Configuration conf = new Configuration();
        conf.set("keywords", args[2]);

        Job job = Job.getInstance(conf, "item count");
        job.setJarByClass(ItemImpl.class);
        job.setMapperClass(ItemMapper.class);
        job.setReducerClass(ItemReducer.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }


    public static class ItemMapper extends Mapper<Object, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        JSONParser parser = new JSONParser();

        @Override
        public void map(Object key, Text value, Context output) throws IOException,
                InterruptedException {

            JSONObject tweetObject = null;

            String[] keywords = this.getKeyWords(output);

            try {
                tweetObject = (JSONObject) parser.parse(value.toString());
            } catch (ParseException e) {
                e.printStackTrace();
            }
            if (tweetObject != null) {
                String tweetText = (String) tweetObject.get("text");

                if(tweetText == null){
                    return;
                }

                tweetText = tweetText.toLowerCase();
    /*          StringTokenizer st = new StringTokenizer(tweetText);

                ArrayList<String> tokens = new ArrayList<String>();

                while (st.hasMoreTokens()) {
                    tokens.add(st.nextToken());
                }*/

                for (String keyword : keywords) {
                    keyword = keyword.toLowerCase();
                    if (tweetText.contains(keyword)) {
                        output.write(new Text(keyword), one);
                    }
                }
                output.write(new Text("count"), one);
            }

        }

        String[] getKeyWords(Mapper<Object, Text, Text, IntWritable>.Context context) {

            Configuration conf = (Configuration) context.getConfiguration();
            String param = conf.get("keywords");

            return param.split(",");

        }
    }

    public static class ItemReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context output)
                throws IOException, InterruptedException {

            int wordCount = 0;

            for (IntWritable value : values) {
                wordCount += value.get();
            }

            output.write(key, new IntWritable(wordCount));
        }
    }
}

+4

mapreduce hadoop

Dude Mar 24 '16 at 1:52


3 answers

Can you check the contents of itemcount.jar (jar -tvf itemcount.jar)? Make sure the mapper's .class file is actually in it.
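The same check can be scripted. The following is a minimal sketch, assuming only the JDK's java.util.jar API; the class name JarClassCheck and its methods are hypothetical helpers, not part of Hadoop.

```java
import java.io.IOException;
import java.util.jar.JarFile;

/** Hypothetical helper: checks whether a jar holds the .class entry for a class name. */
final class JarClassCheck {

    /** Converts a binary class name (a.b.Outer$Inner) to its entry path inside a jar. */
    static String entryNameFor(String binaryClassName) {
        return binaryClassName.replace('.', '/') + ".class";
    }

    /** Returns true if the jar at jarPath contains the entry for the given class. */
    static boolean containsClass(String jarPath, String binaryClassName) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getJarEntry(entryNameFor(binaryClassName)) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to the jar, args[1]: binary class name to look for
        System.out.println(containsClass(args[0], args[1]));
    }
}
```

Note that a nested class has a `$` in its binary name: the mapper in the posted code would be com.hadoop.keyword.count.ItemImpl$ItemMapper, whereas the stack trace asks for com.hadoop.keyword.count.ItemMapper, so it is worth checking which of the two the jar actually contains.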

0

SurjanSRawat 30 . '16 8:31

.

.
.
( i/o )
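Check your jar command. Instead of

hadoop jar ~/itemcount.jar /user/rohit/tweets /home/rohit/outputs/23mar-yarn13 vodka,wine,whisky

try

hadoop jar ~/itemcount.jar com.hadoop.keyword.count.ItemImpl /user/rohit/tweets /home/rohit/outputs/23mar-yarn13 vodka,wine,whisky

That is, add package_name.mainclass after the .jar file.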
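Whether the class name is needed on the command line depends on the jar's manifest: hadoop jar uses the Main-Class attribute when the manifest declares one, and otherwise treats the first argument after the jar as the class to run. A minimal sketch for inspecting that attribute, assuming only the JDK's java.util.jar API (ManifestMainClass is a hypothetical helper name):

```java
import java.io.IOException;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

/** Hypothetical helper: reports the Main-Class declared in a jar's manifest, if any. */
final class ManifestMainClass {

    /** Returns the Main-Class attribute, or null if the manifest is absent or declares none. */
    static String mainClassOf(Manifest manifest) {
        if (manifest == null) {
            return null;
        }
        return manifest.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to the jar to inspect
        try (JarFile jar = new JarFile(args[0])) {
            System.out.println(mainClassOf(jar.getManifest()));
        }
    }
}
```

If this prints null for itemcount.jar, the fully qualified driver class has to be given on the command line as in the second command above.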

Try-catch:

try {
    tweetObject = (JSONObject) parser.parse(value.toString());
} catch (Exception e) { // change ParseException to Exception if you expect more than parse errors
    e.printStackTrace();
    return; // return from the function in case of any error
}

import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class ItemImpl extends Configured implements Tool {

    public static void main(String[] args) throws Exception {
        int res = ToolRunner.run(new ItemImpl(), args);
        System.exit(res);
    }

    @Override
    public int run(String[] args) throws Exception {
        // pass the keywords to the mappers, as the driver in the question does
        getConf().set("keywords", args[2]);

        Job job = Job.getInstance(getConf(), "ItemImpl");
        job.setJarByClass(ItemImpl.class);
        job.setMapperClass(ItemMapper.class);
        job.setReducerClass(ItemReducer.class);
        job.setMapOutputKeyClass(Text.class); // probably not essential, but makes it explicit
        job.setMapOutputValueClass(IntWritable.class); // probably not essential, but makes it explicit
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        // return the status instead of calling System.exit(); main() exits with it
        return job.waitForCompletion(true) ? 0 : 1;
    }

    // add the public static mapper class (ItemMapper) here
    // add the public static reducer class (ItemReducer) here
}
I'm not an expert on this topic, but this implementation is from one of my working projects. Try it; if it doesn't work for you, I would suggest checking the libraries you added to your project.

Probably the first step will resolve it, but if these steps do not work, share the code with us.

0

Burak karasoy Mar 30 '16 at 10:41


dev.glitch · Accepted Answer · 2016-04-02T15:14:32+0000

When running in fully distributed mode, your TaskTracker/NodeManager (the thing running your mapper) runs in a separate JVM, and it sounds like your class is not making it onto that JVM's classpath.

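Try passing -libjars <csv,list,of,jars> on the command line when you launch the job. Hadoop will then distribute the jars to the TaskTracker JVM and load your classes from them. (Note that this copies the jars out to the nodes of the cluster and makes them available for that job.)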

You may also want to try launching the job with yarn jar ... instead of hadoop jar ....
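To see in isolation what "not on that JVM's classpath" means, here is a minimal sketch that asks a class loader for a class by name, which is roughly the kind of lookup that fails in the stack trace. ClasspathProbe is a hypothetical helper, not a Hadoop class; it uses only the JDK.

```java
/** Hypothetical helper: reports whether a class loader can find a class by name. */
final class ClasspathProbe {

    /** Returns true if the loader can resolve className; the class is not initialized. */
    static boolean isLoadable(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            // the class is not visible to this loader, i.e. not on its classpath
            return false;
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = ClasspathProbe.class.getClassLoader();
        // a JDK class is always visible
        System.out.println(isLoadable("java.lang.String", loader));
        // the mapper is only visible if the job jar is on this JVM's classpath
        System.out.println(isLoadable("com.hadoop.keyword.count.ItemMapper", loader));
    }
}
```

The client JVM that submits the job and the task JVM on the cluster each have their own classpath, so a class that resolves in one can still be missing in the other.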
