Could not find or load main class org.apache.nutch.crawl.InjectorJob

I use Linux with Hadoop, Cloudera and HBase.

Could you tell me how to fix this error?

Error: Could not find or load main class org.apache.nutch.crawl.InjectorJob

The following command gave me an error:

 src/bin/nutch inject crawl/crawldb dmoz/ 

If you need any other information, please ask.

hadoop solr nutch
1 answer

I think you probably missed a step or two. Please confirm:

  • Did you install Apache Ant, then go to the nutch folder and run "ant"?
  • Did you set the environment variables?
    • NUTCH_JAVA_HOME: the Java implementation to use. Overrides JAVA_HOME.
    • NUTCH_HEAPSIZE: the maximum amount of heap to use, in MB. The default value is 1000.
    • NUTCH_OPTS: extra Java runtime options. Multiple options must be separated by spaces.
    • NUTCH_LOG_DIR: log directory (default: $NUTCH_HOME/logs).
    • NUTCH_LOGFILE: log file (default: hadoop.log).
    • NUTCH_CONF_DIR: path to configuration files (default: $NUTCH_HOME/conf). Multiple paths must be separated by a colon ':'.
    • JAVA_HOME
    • NUTCH_HOME
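
As a rough sketch, the variables listed above could be exported in your shell profile like this. The paths are hypothetical placeholders, not values from the question, so adjust them to wherever Java and Nutch actually live on your machine:

```shell
# Hypothetical paths: replace them with your own installation directories.
export JAVA_HOME="${JAVA_HOME:-/usr/lib/jvm/default-java}"
export NUTCH_HOME="${NUTCH_HOME:-$HOME/nutch}"

# NUTCH_JAVA_HOME overrides JAVA_HOME for Nutch only; here it simply reuses it.
export NUTCH_JAVA_HOME="$JAVA_HOME"

# Heap size in MB (1000 is the default mentioned above).
export NUTCH_HEAPSIZE=1000

# These two match the documented defaults, so setting them is optional.
export NUTCH_CONF_DIR="$NUTCH_HOME/conf"
export NUTCH_LOG_DIR="$NUTCH_HOME/logs"
```

Only the variables that differ from their defaults need to be set explicitly.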

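If you build with "ant", you will get a new folder inside the nutch folder called runtime/local, and that is where you should actually run nutch from. Your command runs src/bin/nutch straight from the source tree, where the compiled classes are most likely not on the classpath, which would explain the "Could not find or load main class" error.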
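
A quick way to check whether the build step has been done is a small helper like the one below. This is only a sketch: `check_nutch_runtime` is a made-up function name, and it assumes the layout described above (a runtime/local folder containing bin/nutch under the nutch folder).

```shell
# Hypothetical helper: check_nutch_runtime DIR
# Reports whether DIR (the nutch source folder) contains a built
# runtime/local/bin/nutch script.
check_nutch_runtime() {
  nutch_dir="$1"
  if [ -x "$nutch_dir/runtime/local/bin/nutch" ]; then
    echo "ok: run $nutch_dir/runtime/local/bin/nutch"
  else
    echo "missing: run 'ant' in $nutch_dir first"
    return 1
  fi
}
```

For example, `check_nutch_runtime "$HOME/nutch"` (a placeholder path) prints the script to use if the build exists. Once it does, change into runtime/local and run bin/nutch from there instead of src/bin/nutch.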

Tip: try reading this page.
