How do I get Pig to output individual warnings?

Pig: 0.8.1-cdh3u2 Hadoop: 0.20.2-cdh3u0 

I'm debugging FIELD_DISCARDED_TYPE_CONVERSION_FAILED warnings, but I can't get the individual warnings to print anywhere. Disabling aggregation via the -w flag or aggregate.warning=false kills the aggregated summary message, but it also seems to remove the underlying warnings, so I still can't see which type conversion failed.

Nothing relevant is written to Pig's log file either, and I can't find any logs containing the individual warnings. Am I missing something obvious, or is this simply not working?

+8
hadoop apache-pig
2 answers

Hadoop job logs are written locally on every compute node, so you first need to configure the cluster to collect the log files onto a distributed file system where you can analyze them. If you are using Hadoop On Demand ( http://hadoop.apache.org/docs/r0.17.0/hod.html ), you can do this by specifying something like:

 log-destination-uri = hdfs://host123:45678/user/hod/logs 

See the HOD documentation at http://hadoop.apache.org/docs/r0.17.0/hod_user_guide.html#Collecting+and+Viewing+Hadoop+Logs

Once you have the logs on HDFS, you can run a simple Pig query to find the offending conversion. Something like the following should do the trick:

 a1 = LOAD '*.log' USING PigStorage(']');
 a2 = FILTER a1 BY ($1 MATCHES ' WARN.*Unable to interpret value.*');
 DUMP a2; 
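If you'd rather scan the collected log files outside of Pig, the same filter is easy to express as a short script. A minimal Python sketch, assuming the warning lines look like typical POCast output with the bad value in square brackets (the sample lines and the exact message format are assumptions; adjust the pattern to your cluster's logs):

```python
import re

# Hypothetical sample of Hadoop task-log lines; the exact layout is an
# assumption -- adapt the regex to whatever your cluster actually emits.
log_lines = [
    "2011-11-02 10:15:01,123 INFO org.apache.pig.Main: logging to /tmp/pig.log",
    "2011-11-02 10:15:02,456 WARN org.apache.pig.backend.hadoop."
    "executionengine.physicalLayer.expressionOperators.POCast: "
    "Unable to interpret value [abc] in field being converted to double",
]

# Same idea as the Pig FILTER ... MATCHES query: keep only the lines that
# warn about a failed type conversion, and capture the offending value.
pattern = re.compile(r"WARN.*Unable to interpret value \[(?P<value>[^\]]*)\]")

def offending_values(lines):
    """Return the raw values that Pig could not convert."""
    return [m.group("value") for line in lines
            if (m := pattern.search(line))]

print(offending_values(log_lines))  # -> ['abc']
```

Grepping this way is handy when the logs are small enough to pull down locally; for large clusters the Pig query above keeps the work on HDFS.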
0

It's hard to find exactly which row or value is causing the problem, but at least you can identify which column it is. Once you know the column, you can use a dynamic invoker, which can help you with the type conversion.

How to use a dynamic invoker:

 DEFINE ConvertToDouble InvokeForDouble('java.lang.Double.parseDouble', 'String'); 

 converted = FOREACH data GENERATE ConvertToDouble(column_name); 
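The point of the invoker is that conversion happens one value at a time, so failures become visible instead of the field being silently discarded. The same defensive-conversion idea can be sketched in Python to probe a sample of your data for unparseable values before fixing the Pig script (the helper name and sample values are hypothetical):

```python
def to_double_or_none(raw):
    """Mimic Double.parseDouble, but surface failures instead of discarding them."""
    try:
        return float(raw)
    except (TypeError, ValueError):
        return None

# Hypothetical sample of values from the suspect column.
sample = ["3.14", "42", "n/a", ""]

# Collect the values that would trigger a conversion warning.
bad = [v for v in sample if to_double_or_none(v) is None]
print(bad)  # -> ['n/a', '']
```

Running a probe like this over a small extract of the suspect column usually pinpoints the concrete values behind the FIELD_DISCARDED_TYPE_CONVERSION_FAILED count.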

0
