Why do we need to explicitly set the output key / value classes in a Hadoop program?

The book "Hadoop: The Definitive Guide" has an example program with the code below.

JobConf conf = new JobConf(MaxTemperature.class);  
conf.setJobName("Max temperature");  
FileInputFormat.addInputPath(conf, new Path(args[0]));  
FileOutputFormat.setOutputPath(conf, new Path(args[1]));  
conf.setMapperClass(MaxTemperatureMapper.class);  
conf.setReducerClass(MaxTemperatureReducer.class);  
conf.setOutputKeyClass(Text.class);  
conf.setOutputValueClass(IntWritable.class);  

The MapReduce framework should be able to determine the output key and value classes from the Mapper and Reducer classes that are set on the JobConf. Why do we need to set them explicitly on the JobConf? In addition, there is no similar API for the input key / value pair.
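
For context, the Mapper in that example already declares its key/value types as generic parameters, which is what makes the question natural. A rough sketch along those lines (old org.apache.hadoop.mapred API; details paraphrased rather than quoted from the book):

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

// The output key/value types (Text, IntWritable) are declared right here
// as generic parameters, so it looks as if the framework could read them back.
public class MaxTemperatureMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, IntWritable> {

    public void map(LongWritable key, Text value,
                    OutputCollector<Text, IntWritable> output, Reporter reporter)
            throws IOException {
        // Parsing of the weather record is omitted; emit (year, temperature).
        output.collect(new Text("1950"), new IntWritable(22));
    }
}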

1 answer

Because of type erasure [1]. The K/V types on the Mapper and Reducer are generics, so they are erased at compile time and are not available to the framework at runtime.
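
To see what erasure means in practice, here is a minimal, self-contained Java demonstration (not Hadoop-specific): the generic type arguments are simply gone at runtime, so there is nothing for the framework to inspect.

import java.util.ArrayList;
import java.util.List;

public class ErasureDemo {
    public static void main(String[] args) {
        List<String> strings = new ArrayList<String>();
        List<Integer> numbers = new ArrayList<Integer>();

        // The type arguments <String> and <Integer> are erased by the compiler,
        // so both lists share exactly the same runtime class.
        System.out.println(strings.getClass() == numbers.getClass()); // prints true
    }
}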

The k/v classes are also needed by output formats such as SequenceFiles, which store the key and value types in their header. Without knowing the key and value classes, the SequenceFile cannot be written.
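
As a rough illustration of why the types must be known up front, here is a sketch that writes a SequenceFile directly using the older SequenceFile.createWriter overload; the path and the record are made up for the example.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class SequenceFileSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path path = new Path("/tmp/example.seq"); // hypothetical output path

        // The writer must be told the key and value classes up front,
        // because SequenceFile records them in the file header.
        SequenceFile.Writer writer =
            SequenceFile.createWriter(fs, conf, path, Text.class, IntWritable.class);
        try {
            writer.append(new Text("1950"), new IntWritable(22));
        } finally {
            writer.close();
        }
    }
}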

[1] http://download.oracle.com/javase/tutorial/java/generics/erasure.html
