I am trying to integrate mongo-hadoop with Spark, but I can't figure out how to make the jars accessible to an IPython notebook.
Here is what I am trying to do:
# set up parameters for reading from MongoDB via Hadoop input format
config = {"mongo.input.uri": "mongodb://localhost:27017/db.collection"}
inputFormatClassName = "com.mongodb.hadoop.MongoInputFormat"
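(For context, a minimal sketch of how these parameters would typically be consumed; the Text/MapWritable key/value classes are the ones commonly used in mongo-hadoop examples, and sc is the shell's SparkContext:)

# read the collection as an RDD through the Hadoop input format;
# Text/MapWritable are the key/value classes commonly paired with MongoInputFormat
rdd = sc.newAPIHadoopRDD(
    inputFormatClass=inputFormatClassName,
    keyClass="org.apache.hadoop.io.Text",
    valueClass="org.apache.hadoop.io.MapWritable",
    conf=config)
print(rdd.first())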
This code works fine when I run it in pyspark using the following command:
spark-1.4.1/bin/pyspark --jars 'mongo-hadoop-core-1.4.0.jar,mongo-java-driver-3.0.2.jar'
where mongo-hadoop-core-1.4.0.jar and mongo-java-driver-2.10.1.jar allow using MongoDB from Java. However, when I do this:
IPYTHON_OPTS="notebook" spark-1.4.1/bin/pyspark --jars 'mongo-hadoop-core-1.4.0.jar,mongo-java-driver-3.0.2.jar'
the jars are no longer available, and I get the following error:
java.lang.ClassNotFoundException: com.mongodb.hadoop.MongoInputFormat
Does anyone know how to make jars available to Spark in an IPython notebook? I am fairly sure this is not specific to mongo, so perhaps someone has already managed to add jars to the classpath when using the notebook?