I'm currently trying to find out what happens when a MapReduce job starts by putting some System.out.println() calls at certain places in the code, but none of these print statements show up in my terminal while the job is running. Can someone help me figure out exactly what I'm doing wrong here?
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountJob {

    public static int iterations;

    public static class TokenizerMapper
            extends Mapper<Object, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            // Debug output I expect to see in the terminal.
            System.out.println("blalblbfbbfbbbgghghghghghgh");
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                // Take one token per loop pass; calling nextToken() twice here
                // would skip tokens and can throw NoSuchElementException.
                String myWord = itr.nextToken();
                int n = 0;
                while (n < 5) {
                    myWord = myWord + "Test my appending words";
                    n++;
                }
                System.out.println("Print my word: " + myWord);
                word.set(myWord);
                context.write(word, one);
            }
        }
    }

    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {

        private IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        if (args.length != 3) {
            System.err.println("Usage: WordCountJob <in> <out> <iterations>");
            System.exit(2);
        }
        iterations = Integer.parseInt(args[2]);
        Path inPath = new Path(args[0]);
        Path outPath = null;
        for (int i = 0; i < iterations; ++i) {
            System.out.println("Iteration number: " + i);
            outPath = new Path(args[1] + i);
            Job job = new Job(conf, "WordCountJob");
            job.setJarByClass(WordCountJob.class);
            job.setMapperClass(TokenizerMapper.class);
            job.setCombinerClass(IntSumReducer.class);
            job.setReducerClass(IntSumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, inPath);
            FileOutputFormat.setOutputPath(job, outPath);
            job.waitForCompletion(true);
            // The output of this iteration becomes the input of the next one.
            inPath = outPath;
        }
    }
}
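For reference, I submit the job roughly like this (the jar name and HDFS paths below are placeholders, not my real setup; the third argument is the iteration count the driver expects):

hadoop jar wordcountjob.jar WordCountJob /user/me/input /user/me/output 3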
mapreduce hadoop