Is it possible to handle the "no file" error in Pig?

I am trying to load a simple file:

log = load 'file_1.gz' using TextLoader AS (line:chararray);
dump log;

And I get the error message:

 2014-04-08 11:46:19,471 [main] ERROR org.apache.pig.tools.pigstats.SimplePigStats - ERROR 2997: Unable to recreate exception from backend error: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Input Pattern hdfs://hadoop1:8020/pko/file*gz matches 0 files
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:288)
     at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1054)

Is it possible to handle this situation before the error occurs?

+6
3 answers

Input Pattern hdfs://hadoop1:8020/pko/file*gz matches 0 files

The error means that no input file exists at the given HDFS path.

log = load 'file_1.gz' using TextLoader AS (line:chararray);

Because you did not specify an absolute path for file_1.gz, Pig resolves it relative to the HDFS home directory of the user running the Pig script.

0

Unfortunately, in the current version of Pig (0.15.0), it is not possible to handle these errors without using a UDF.

I suggest writing a Java or Python wrapper script that uses try/catch (try/except in Python) to take care of this.
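One way such a wrapper could look is sketched below. Instead of catching the failure after the fact, it asks HDFS whether the input pattern matches anything before starting Pig. This is only a sketch under assumptions: the function names and the `run` parameter are made up for illustration, it assumes the `hdfs` and `pig` commands are on the PATH, and it relies on `hdfs dfs -ls` returning a non-zero exit status when a pattern matches nothing.

```python
import subprocess


def hdfs_pattern_matches(pattern, run=subprocess.run):
    """Return True if `hdfs dfs -ls <pattern>` exits with status 0.

    The pattern is passed as a single argument (no local shell), so the
    glob is expanded by HDFS rather than by the local shell.
    `run` is injectable only so the function can be exercised without a cluster.
    """
    result = run(
        ["hdfs", "dfs", "-ls", pattern],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def run_pig_if_input_exists(pattern, script, run=subprocess.run):
    """Run the Pig script only when the input pattern matches something.

    Returns False (and skips Pig) when nothing matches, True after Pig ran.
    """
    if not hdfs_pattern_matches(pattern, run=run):
        print("No input matches %s; skipping %s" % (pattern, script))
        return False
    # check=True raises CalledProcessError if Pig itself fails.
    run(["pig", "-f", script], check=True)
    return True
```

A call such as `run_pig_if_input_exists("file*gz", "load_log.pig")` (the script name is hypothetical) would then skip the job instead of failing with ERROR 2118 when no file is present.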

Here is a good website you might find useful: https://wiki.apache.org/pig/PigErrorHandlingInScripts

Happy Pig learning!

0

I am also facing this problem. My load command:

 DATA = LOAD '${qurwf_folder_input}/data/*/' AS (...); 

I want to load all files from the subfolders of data, but the data folder was empty, and I got the same error as you. In my specific case, I worked around it by creating an empty folder inside the data directory. LOAD then returns an empty data set, and the script does not fail.

By the way, I use an Oozie workflow to run the scripts, and I create the empty folders during its preparation step.
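Outside Oozie, the same workaround could be sketched in Python as follows. The function name, the `placeholder` folder name, and the `run` parameter are assumptions made for illustration; the sketch only assumes that the `hdfs` command is on the PATH. The folder name deliberately does not start with `_` or `.`, since Hadoop input formats commonly skip such paths as hidden.

```python
import subprocess


def ensure_placeholder_dir(data_dir, name="placeholder", run=subprocess.run):
    """Create <data_dir>/<name> on HDFS so that '<data_dir>/*/' matches
    at least one (possibly empty) directory.

    `-mkdir -p` does not fail when the directory already exists.
    Returns the path that was created.
    """
    path = data_dir.rstrip("/") + "/" + name
    run(["hdfs", "dfs", "-mkdir", "-p", path], check=True)
    return path
```

Calling it on the data directory before launching the Pig script plays the same role as the empty folder created in the workflow above.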

0