Jar works in standalone Hadoop, but not on the cluster itself (java.lang.ClassNotFoundException: org.jfree.data.xy.XYDataset)

I am building my project with Eclipse on Windows and running it on a Linux cluster. The project depends on some external jars, which I bundled using Eclipse's "Export -> Runnable JAR -> Package required libraries into jar" option. I checked that the resulting jar contains my classes in the proper folder structure and that the external jars sit in its root folder.

In Hadoop's standalone mode, both under Cygwin and on Linux, this works fine, but on a real Linux Hadoop cluster it fails with a ClassNotFoundException as soon as it tries to access a class from the first external jar.

Is there a way to force Hadoop to look for these libraries inside the jar? I thought this would just work.

    10/07/16 11:44:59 INFO mapred.JobClient: Task Id : attempt_201007161003_0005_m_000001_0, Status : FAILED
    Error: java.lang.ClassNotFoundException: org.jfree.data.xy.XYDataset
        at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
        at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
        at org.akintayo.analysis.ecg.preprocess.ReadPlotECG.plotECG(ReadPlotECG.java:27)
        at org.akintayo.analysis.ecg.preprocess.BuildECGImages.writeECGImages(BuildECGImages.java:216)
        at org.akintayo.analysis.ecg.preprocess.BuildECGImages.converSingleECGToImage(BuildECGImages.java:305)
        at org.akintayo.analysis.ecg.preprocess.BuildECGImages.main(BuildECGImages.java:457)
        at org.akintayo.hadoop.HadoopECGPreprocessByFile$MapTest.map(HadoopECGPreprocessByFile.java:208)
        at org.akintayo.hadoop.HadoopECGPreprocessByFile$MapTest.map(HadoopECGPreprocessByFile.java:1)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)
2 answers

Java cannot use jars that are nested inside another jar :/ (the standard classloaders cannot handle this).

So you either need to install these packages separately on each machine in the cluster or, if that is not possible, add the jars at job submission time. For that, pass the -libjars option when launching the job: hadoop jar myjar.jar -libjars mylib.jar, and this should work.
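One caveat: -libjars is a "generic option" that is only honored if the driver parses its arguments through GenericOptionsParser, which is usually done by going through ToolRunner. Below is a minimal sketch of such a driver using the old mapred API that the stack trace shows; the class and package names are taken from the trace, and the job setup details are placeholders:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.conf.Configured;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.util.Tool;
    import org.apache.hadoop.util.ToolRunner;

    public class HadoopECGPreprocessByFile extends Configured implements Tool {

        @Override
        public int run(String[] args) throws Exception {
            // getConf() already reflects whatever -libjars, -D, etc. set up.
            JobConf conf = new JobConf(getConf(), HadoopECGPreprocessByFile.class);
            // ... configure mapper, input/output formats and paths here ...
            JobClient.runJob(conf);
            return 0;
        }

        public static void main(String[] args) throws Exception {
            // ToolRunner runs GenericOptionsParser first, which consumes
            // -libjars and ships the listed jars to the cluster; without
            // it the option is silently ignored.
            int exitCode = ToolRunner.run(new Configuration(),
                    new HadoopECGPreprocessByFile(), args);
            System.exit(exitCode);
        }
    }

The job would then be launched with something like hadoop jar myjar.jar org.akintayo.hadoop.HadoopECGPreprocessByFile -libjars jfreechart.jar,jcommon.jar <input> <output> (the jar file names here are assumptions; multiple jars are separated by commas).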


Wojtek's answer is correct. Using -libjars puts your external jars in the distributed cache and makes them available to all of your Hadoop nodes.
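For completeness, the same thing can be done programmatically from the driver via the distributed cache API. A minimal sketch, assuming the external jar has already been uploaded to HDFS (the path below is a placeholder):

    import java.io.IOException;

    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.JobConf;

    public class CacheJarSetup {
        // Adds a jar that already sits in HDFS to the classpath of every
        // map and reduce task; this is what -libjars does under the hood.
        public static void addExternalJar(JobConf conf) throws IOException {
            DistributedCache.addFileToClassPath(
                    new Path("/libs/jfreechart.jar"), conf); // hypothetical HDFS path
        }
    }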

However, if your external jars do not change often, it may be more convenient to manually copy the jar files into hadoop/lib on every node. After restarting Hadoop, your external jars will be on the classpath of your jobs.


Source: https://habr.com/ru/post/1315981/

