I am writing an application in Scala that uses Spark. I am packaging my application with Maven and am having trouble creating an "uber" or "fat" jar.
The problem is that the application runs fine inside the IDE, or if I provide the non-uber-jar'd versions of the dependencies as the Java class path, but it does not work if I give the uber jar as the class path, i.e.
java -Xmx2G -cp target/spark-example-0.1-SNAPSHOT-jar-with-dependencies.jar debug.spark_example.Example data.txt
does not work. The following error message appears:
ERROR SparkContext: Error initializing SparkContext.
com.typesafe.config.ConfigException$Missing: No configuration setting found for key 'akka.version'
    at com.typesafe.config.impl.SimpleConfig.findKey(SimpleConfig.java:124)
    at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:145)
    at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:151)
    at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:159)
    at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:164)
    at com.typesafe.config.impl.SimpleConfig.getString(SimpleConfig.java:206)
    at akka.actor.ActorSystem$Settings.<init>(ActorSystem.scala:168)
    at akka.actor.ActorSystemImpl.<init>(ActorSystem.scala:504)
    at akka.actor.ActorSystem$.apply(ActorSystem.scala:141)
    at akka.actor.ActorSystem$.apply(ActorSystem.scala:118)
    at org.apache.spark.util.AkkaUtils$.org$apache$spark$util$AkkaUtils$$doCreateActorSystem(AkkaUtils.scala:122)
    at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:54)
    at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:53)
    at org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$mcVI$sp(Utils.scala:1991)
    at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
    at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:1982)
    at org.apache.spark.util.AkkaUtils$.createActorSystem(AkkaUtils.scala:56)
    at org.apache.spark.rpc.akka.AkkaRpcEnvFactory.create(AkkaRpcEnv.scala:245)
    at org.apache.spark.rpc.RpcEnv$.create(RpcEnv.scala:52)
    at org.apache.spark.SparkEnv$.create(SparkEnv.scala:247)
    at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:188)
    at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:267)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:424)
    at debug.spark_example.Example$.main(Example.scala:9)
    at debug.spark_example.Example.main(Example.scala)
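Since the exception is about a missing 'akka.version' key, one thing worth checking is whether the reference.conf that ends up at the root of the uber jar still contains Akka's defaults. The following is only a minimal diagnostic sketch and not part of the project: the object name ReferenceConfCheck and both helper methods are hypothetical, and the check is a crude text match (not a real HOCON parse) that assumes Akka declares the key as a `version = ...` line inside an `akka { ... }` block.

```scala
import java.util.jar.JarFile
import scala.io.Source

// Hypothetical helper, not part of the project in the question.
object ReferenceConfCheck {

  // Crude textual check (assumption: not a HOCON parser). Returns true if
  // the text has a line opening an `akka {` block and a `version = ...` line.
  def mentionsAkkaVersion(conf: String): Boolean = {
    val lines = conf.split("\n").map(_.trim)
    lines.exists(_.startsWith("akka {")) &&
      lines.exists(_.matches("""version\s*=.*"""))
  }

  // Reads reference.conf from the root of the given jar, if the entry exists.
  def readReferenceConf(jarPath: String): Option[String] = {
    val jar = new JarFile(jarPath)
    try {
      Option(jar.getEntry("reference.conf")).map { entry =>
        val src = Source.fromInputStream(jar.getInputStream(entry), "UTF-8")
        try src.mkString finally src.close()
      }
    } finally jar.close()
  }

  def main(args: Array[String]): Unit =
    readReferenceConf(args(0)) match {
      case None       => println("no reference.conf in jar")
      case Some(conf) => println(s"akka.version present: ${mentionsAkkaVersion(conf)}")
    }
}
```

Running it with the path of the uber jar as the only argument would show whether the merged reference.conf kept the Akka section; if it did not, that would suggest the file was overwritten rather than appended during packaging.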
I would be very grateful for help understanding what I need to add to the pom.xml file to make this work, and why it is needed.
I searched the Internet and found the following resources, which I tried (see the pom below) but could not get to work:
1) Spark User mailing list: http://apache-spark-user-list.1001560.n3.nabble.com/Packaging-a-spark-job-using-maven-td5615.html
2) how to pack a Scala spark application
I have a simple example demonstrating this problem, a project with a single class (src/main/scala/debug/spark_example/Example.scala):
package debug.spark_example

import org.apache.spark.{SparkConf, SparkContext}

object Example {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("Test").setMaster("local[2]"))
    val lines = sc.textFile(args(0))
    val lineLengths = lines.map(s => s.length)
    val totalLength = lineLengths.reduce((a, b) => a + b)
    lineLengths.foreach(println)
    println(totalLength)
  }
}
Here is the pom.xml file:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>debug.spark-example</groupId>
  <artifactId>spark-example</artifactId>
  <version>0.1-SNAPSHOT</version>
  <inceptionYear>2015</inceptionYear>

  <properties>
    <scala.majorVersion>2.11</scala.majorVersion>
    <scala.minorVersion>.2</scala.minorVersion>
    <spark.version>1.4.1</spark.version>
  </properties>

  <repositories>
    <repository>
      <id>scala-tools.org</id>
      <name>Scala-Tools Maven2 Repository</name>
      <url>http://scala-tools.org/repo-releases</url>
    </repository>
  </repositories>

  <pluginRepositories>
    <pluginRepository>
      <id>scala-tools.org</id>
      <name>Scala-Tools Maven2 Repository</name>
      <url>http://scala-tools.org/repo-releases</url>
    </pluginRepository>
  </pluginRepositories>

  <dependencies>
    <dependency>
      <groupId>org.scala-lang</groupId>
      <artifactId>scala-library</artifactId>
      <version>${scala.majorVersion}${scala.minorVersion}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_${scala.majorVersion}</artifactId>
      <version>${spark.version}</version>
    </dependency>
  </dependencies>

  <build>
    <sourceDirectory>src/main/scala</sourceDirectory>
    <plugins>
      <plugin>
        <groupId>org.scala-tools</groupId>
        <artifactId>maven-scala-plugin</artifactId>
        <executions>
          <execution>
            <goals>
              <goal>compile</goal>
              <goal>testCompile</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-eclipse-plugin</artifactId>
        <configuration>
          <downloadSources>true</downloadSources>
          <buildcommands>
            <buildcommand>ch.epfl.lamp.sdt.core.scalabuilder</buildcommand>
          </buildcommands>
          <additionalProjectnatures>
            <projectnature>ch.epfl.lamp.sdt.core.scalanature</projectnature>
          </additionalProjectnatures>
          <classpathContainers>
            <classpathContainer>org.eclipse.jdt.launching.JRE_CONTAINER</classpathContainer>
            <classpathContainer>ch.epfl.lamp.sdt.launching.SCALA_CONTAINER</classpathContainer>
          </classpathContainers>
        </configuration>
      </plugin>
      <plugin>
        <artifactId>maven-assembly-plugin</artifactId>
        <version>2.4</version>
        <executions>
          <execution>
            <id>make-assembly</id>
            <phase>package</phase>
            <goals>
              <goal>attached</goal>
            </goals>
          </execution>
        </executions>
        <configuration>
          <tarLongFileMode>gnu</tarLongFileMode>
          <descriptorRefs>
            <descriptorRef>jar-with-dependencies</descriptorRef>
          </descriptorRefs>
        </configuration>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-shade-plugin</artifactId>
        <version>2.2</version>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>shade</goal>
            </goals>
            <configuration>
              <minimizeJar>false</minimizeJar>
              <createDependencyReducedPom>false</createDependencyReducedPom>
              <artifactSet>
                <includes>
                  <include>*:*</include>
                </includes>
              </artifactSet>
              <filters>
                <filter>
                  <artifact>*:*</artifact>
                  <excludes>
                    <exclude>META-INF/*.SF</exclude>
                    <exclude>META-INF/*.DSA</exclude>
                    <exclude>META-INF/*.RSA</exclude>
                  </excludes>
                </filter>
              </filters>
              <transformers>
                <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
                  <resource>reference.conf</resource>
                </transformer>
              </transformers>
            </configuration>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-surefire-plugin</artifactId>
        <version>2.7</version>
        <configuration>
          <skipTests>true</skipTests>
        </configuration>
      </plugin>
    </plugins>
  </build>

  <reporting>
    <plugins>
      <plugin>
        <groupId>org.scala-tools</groupId>
        <artifactId>maven-scala-plugin</artifactId>
      </plugin>
    </plugins>
  </reporting>
</project>
Many thanks for your help.
scala maven akka apache-spark
br19