How to set the Hadoop input format to NLineInputFormat?

I am trying to limit the number of lines received by each of the Mappers. My code is as follows:

package com.iathao.mapreduce;

import java.io.IOException;
import java.net.MalformedURLException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.lib.NLineInputFormat;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.regexp.RESyntaxException;

import com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException;

public class Main {
    public static void main(String[] args) throws FailingHttpStatusCodeException, MalformedURLException, IOException, RESyntaxException {
        try {
            if (args.length != 2) {
                System.err.println("Usage: NewMaxTemperature <input path> <output path>");
                System.exit(-1);
            }
            Job job = new Job();
            job.setJarByClass(Main.class);
            job.getConfiguration().set("mapred.max.map.failures.percent", "100");
            // job.getConfiguration().set("mapred.map.max.attempts", "10");
            //NLineInputFormat. .setNumLinesPerSplit(job, 1);
            job.setInputFormatClass(NLineInputFormat.class);

On the last line of the example (job.setInputFormatClass(NLineInputFormat.class);) I get the following error:

 The method setInputFormatClass(Class<? extends InputFormat>) in the type Job is not applicable for the arguments (Class<NLineInputFormat>) 

Did I somehow get the wrong NLineInputFormat class?

1 answer

You are mixing the old and new APIs: NLineInputFormat is imported from the old one, Job from the new one.

import org.apache.hadoop.mapred.lib.NLineInputFormat;
import org.apache.hadoop.mapreduce.Job;

According to "Hadoop: The Definitive Guide":

The new API is in the package org.apache.hadoop.mapreduce (and subpackages). The old API can still be found in org.apache.hadoop.mapred.

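If you plan to use the new API, import the new-API NLineInputFormat instead, org.apache.hadoop.mapreduce.lib.input.NLineInputFormat, so that it extends the InputFormat type that Job.setInputFormatClass expects.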
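To see why the compiler rejects the old class, here is a minimal sketch that needs no Hadoop on the classpath. Every type in it (OldInputFormat, NewInputFormat, the two NLineInputFormat stand-ins, and the small Job class) is a hypothetical stand-in that only mimics the shape of the two APIs; none of them is a real Hadoop class.

```java
// Hypothetical stand-ins only: these mimic the shape of Hadoop's two APIs
// and are NOT the real Hadoop classes.
final class ApiMixDemo {

    // Stand-in for the old API's org.apache.hadoop.mapred.InputFormat (an interface).
    interface OldInputFormat {}

    // Stand-in for the new API's org.apache.hadoop.mapreduce.InputFormat (an abstract class).
    abstract static class NewInputFormat {}

    // Both APIs ship a class with the same simple name, but the two
    // classes live in unrelated type hierarchies.
    static class OldNLineInputFormat implements OldInputFormat {}
    static class NewNLineInputFormat extends NewInputFormat {}

    // Stand-in for Job: the bounded wildcard admits only subclasses of NewInputFormat.
    static class Job {
        private Class<? extends NewInputFormat> inputFormatClass;

        void setInputFormatClass(Class<? extends NewInputFormat> cls) {
            this.inputFormatClass = cls;
        }

        Class<? extends NewInputFormat> getInputFormatClass() {
            return inputFormatClass;
        }
    }

    // Runtime counterpart of the check the compiler performs on the wildcard bound.
    static boolean isNewApiFormat(Class<?> cls) {
        return NewInputFormat.class.isAssignableFrom(cls);
    }

    public static void main(String[] args) {
        Job job = new Job();
        job.setInputFormatClass(NewNLineInputFormat.class); // compiles: within the bound

        // The next line would not compile, for the same reason as in the question:
        // Class<OldNLineInputFormat> is not a Class<? extends NewInputFormat>.
        // job.setInputFormatClass(OldNLineInputFormat.class);

        System.out.println(isNewApiFormat(NewNLineInputFormat.class)); // true
        System.out.println(isNewApiFormat(OldNLineInputFormat.class)); // false
    }
}
```

In the real job, the corresponding fix should be to replace the org.apache.hadoop.mapred.lib.NLineInputFormat import with org.apache.hadoop.mapreduce.lib.input.NLineInputFormat, assuming your Hadoop version ships the new-API class. The commented-out call can then be written as NLineInputFormat.setNumLinesPerSplit(job, 1).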
