Unresponsive Spark Master

I am trying to run a simple Spark application in standalone mode on a Mac.

I managed to run ./sbin/start-master.sh to start the master server and the worker.

./bin/spark-shell --master spark://MacBook-Pro.local:7077 also works, and I can see it in the list of running applications in the Master web UI.

Now I'm trying to write a simple Spark application.

import org.apache.spark.{SparkContext, SparkConf}
import org.apache.spark.SparkContext._

object SimpleApp {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Simple Application")
                              .setMaster("spark://MacBook-Pro.local:7077")
    val sc = new SparkContext(conf)

    val logFile = "README.md"
    val logData = sc.textFile(logFile, 2).cache()
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()
    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
  }
}

Running this simple application gives me an error message saying the master is not responding:

15/02/15 09:47:47 INFO AppClient$ClientActor: Connecting to master spark://MacBook-Pro.local:7077...
15/02/15 09:47:48 WARN ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkMaster@MacBook-Pro.local:7077] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
15/02/15 09:48:07 INFO AppClient$ClientActor: Connecting to master spark://MacBook-Pro.local:7077...
15/02/15 09:48:07 WARN ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkMaster@MacBook-Pro.local:7077] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
15/02/15 09:48:27 INFO AppClient$ClientActor: Connecting to master spark://MacBook-Pro.local:7077...
15/02/15 09:48:27 WARN ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkMaster@MacBook-Pro.local:7077] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
15/02/15 09:48:47 ERROR SparkDeploySchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up.
15/02/15 09:48:47 WARN SparkDeploySchedulerBackend: Application ID is not initialized yet.
15/02/15 09:48:47 ERROR TaskSchedulerImpl: Exiting due to error from cluster scheduler: All masters are unresponsive! Giving up.

Any idea what the problem is? Thanks

3 answers

If you launch the application with spark-submit, you can specify the master there instead of in SparkConf. To run Spark locally, set the master like this:

val conf = new SparkConf().setMaster("local[2]")

(From the Spark docs on the local master setting): "Note that we run with local[2], meaning two threads - which represents "minimal" parallelism, which can help detect bugs that only exist when we run in a distributed context."
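
A minimal sketch of that approach (the spark-submit lines and jar name below are illustrative, not from the original answer): leave the master out of SparkConf and pass it with --master at submit time, either as the standalone URL or as local[2].

import org.apache.spark.{SparkConf, SparkContext}

object SimpleApp {
  def main(args: Array[String]): Unit = {
    // No setMaster here: the master is supplied by spark-submit, e.g.
    //   spark-submit --class SimpleApp --master "local[2]" simple-app.jar
    //   spark-submit --class SimpleApp --master spark://MacBook-Pro.local:7077 simple-app.jar
    val conf = new SparkConf().setAppName("Simple Application")
    val sc = new SparkContext(conf)

    val logData = sc.textFile("README.md", 2).cache()
    println("Lines with a: " + logData.filter(_.contains("a")).count())
    sc.stop()
  }
}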


I had the same problem. My application uses Scala 2.11, but I had built Spark with Maven like this:

build/mvn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package

This builds Spark against Scala 2.10 by default. The Scala version mismatch between the Spark client (your application) and the master breaks the connection, which is why the client keeps reporting that all masters are "unresponsive".
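
As a quick sanity check (a sketch, not part of the original answer), you can print the Scala version your application is compiled against and compare it with the Scala version of your Spark build (visible in the assembly directory name, scala-2.10 vs scala-2.11):

object ScalaVersionCheck {
  def main(args: Array[String]): Unit = {
    // Prints e.g. "version 2.11.7"; it must match the Scala line Spark was built for.
    println(scala.util.Properties.versionString)
  }
}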

The fix: rebuild Spark for Scala 2.11 (assuming your application uses Scala 2.11). From SPARK_HOME, run:

dev/change-version-to-2.11.sh
mvn -Pyarn -Phadoop-2.4 -Dscala-2.11 -DskipTests clean package

The assembly jar is then built under SPARK_HOME/assembly/target/scala-2.11. However, start-all.sh will still look for the default Scala version, so two more steps are needed:

  1. In the conf directory, create spark-env.sh (from spark-env.sh.template) and add:

    export SPARK_SCALA_VERSION="2.11"

  2. Run start-all.sh again and connect to the master URL it reports. It works!

Note: the client-side error messages do not say much, so check the logs. To enable logging, copy conf/log4j.properties.template to conf/log4j.properties; the master and worker logs are written under SPARK_HOME/logs.


I write my code in Java, but I had the same problem as you: my Scala version is 2.10, while my dependencies were built for 2.11. I changed spark-core_2.11 and spark-sql_2.11 to spark-core_2.10 and spark-sql_2.10 in pom.xml. Perhaps you can solve your problem in a similar way.

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>${spark.version}</version>
</dependency>

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.10</artifactId>
    <version>${spark.version}</version>
</dependency>
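
If you build with sbt instead of Maven, a minimal build.sbt sketch of the same fix (the Scala and Spark version numbers below are assumptions; match them to your own Spark build):

// build.sbt -- the versions here are assumptions; use the ones your cluster was built with.
scalaVersion := "2.10.4"

val sparkVersion = "1.2.1"

libraryDependencies ++= Seq(
  // %% appends the Scala binary suffix, producing spark-core_2.10 / spark-sql_2.10
  "org.apache.spark" %% "spark-core" % sparkVersion,
  "org.apache.spark" %% "spark-sql"  % sparkVersion
)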
