Solr dedup error: Failed with exit value 255

Question

I am crawling data from the internet with Apache Nutch 2.3. My Solr version is 4.10.3. The data is crawled into HBase and indexed in Solr successfully, but at the end (the dedup stage) the following error appears in the console:

    IndexingJob: done.
    SOLR dedup -> http://solr:8983/solr
    /home/crawler/nutch-2.3/bin/nutch solrdedup -D mapred.reduce.tasks=2 -D mapred.child.java.opts=-Xmx1000m -D mapred.reduce.tasks.speculative.execution=false -D mapred.map.tasks.speculative.execution=false -D mapred.compress.map.output=true http://solr:8983/solr
    Error running:
      /home/crawler/nutch-2.3/bin/nutch solrdedup -D mapred.reduce.tasks=2 -D mapred.child.java.opts=-Xmx1000m -D mapred.reduce.tasks.speculative.execution=false -D mapred.map.tasks.speculative.execution=false -D mapred.compress.map.output=true http://solr:8983/solr
    Failed with exit value 255.

Here solr stands for the IP address of the machine running Apache Solr. The corresponding error in the Apache Nutch log file, in full, is:

    2015-01-28 10:39:47,830 WARN mapred.FileOutputCommitter - Output path is null in cleanup
    2015-01-28 10:39:47,830 WARN mapred.LocalJobRunner - job_local345700287_0001
    java.lang.Exception: java.lang.NullPointerException
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:354)
    Caused by: java.lang.NullPointerException
        at org.apache.hadoop.io.Text.encode(Text.java:388)
        at org.apache.hadoop.io.Text.set(Text.java:178)
        at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrRecordReader.nextKeyValue(SolrDeleteDuplicates.java:233)
        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:531)
        at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
        at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:223)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

Is the problem with Nutch or with Solr? How can I fix it?
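
The trace above shows the NullPointerException being raised inside Text.set, called from SolrRecordReader.nextKeyValue, so a null string was passed to Text.set while the dedup job was reading documents back from Solr. One possible cause (an assumption, not something the log confirms) is that some indexed documents have no value for a field the dedup job reads, for example digest. A minimal sketch for checking this builds a Solr select URL that matches documents lacking a given field. The field name digest, the helper name missing_field_query_url, and the /select path directly under the base URL are all assumptions to adjust for the actual schema and core layout:

```python
from urllib.parse import urlencode


def missing_field_query_url(solr_url, field="digest", rows=10):
    """Build a Solr /select URL matching documents with no value for `field`.

    `field="digest"` is an assumption about which field the dedup job reads;
    change it to match the schema in use.
    """
    params = {
        # "*:* -field:[* TO *]" selects every document, then excludes those
        # that have any value in `field`.
        "q": "*:* -{0}:[* TO *]".format(field),
        "fl": "id",      # only return the document id
        "rows": rows,    # cap the number of returned documents
        "wt": "json",
    }
    return solr_url.rstrip("/") + "/select?" + urlencode(params)


if __name__ == "__main__":
    # Same base URL as in the console output; "solr" stands for the Solr host.
    print(missing_field_query_url("http://solr:8983/solr"))
```

Opening the printed URL in a browser (or fetching it with curl) and seeing a non-zero numFound would suggest that documents without that field exist in the index, which would be worth ruling out before looking further at Nutch or Solr themselves.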

java apache web-crawler solr nutch

Shafiq Jan 28 '15 at 5:53

No one has answered this question yet.
