Spark's configuration system is a mess of environment variables, argument flags, and Java properties files. I just spent a couple of hours tracking down the same warning and unraveling the Spark startup procedure, and here is what I found:
1. `sbin/start-all.sh` calls `sbin/start-master.sh` (and then `sbin/start-slaves.sh`)
2. `sbin/start-master.sh` calls `sbin/spark-daemon.sh start org.apache.spark.deploy.master.Master ...`
3. `sbin/spark-daemon.sh start ...` backgrounds the call to `bin/spark-class org.apache.spark.deploy.master.Master ...`, captures the resulting process id (pid), sleeps for 2 seconds, and then checks whether that pid's command name is "java"
4. `bin/spark-class` is a bash script, so it starts out with the command name "bash", and proceeds to:
   1. (re)load the Spark environment by sourcing `bin/load-spark-env.sh`
   2. find the `java` executable
   3. find the right Spark jar
   4. call `java ... org.apache.spark.launcher.Main ...` to obtain the full classpath needed for the Spark deployment
   5. then, finally, hand control over via `exec` to `java ... org.apache.spark.deploy.master.Master`, at which point the command name becomes "java"
If steps 4.1 to 4.5 take longer than 2 seconds, which in my (and your) experience seems pretty much inevitable on a fresh OS where java has never been run before, you'll get a "failed to launch" message even though nothing actually went wrong.
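The check in question looks roughly like this; this is a paraphrase from memory of what `spark-daemon.sh` does, not the verbatim script:

```bash
# paraphrase of the spark-daemon.sh launch check, not the verbatim script
"$SPARK_HOME/bin/spark-class" org.apache.spark.deploy.master.Master "$@" &
newpid=$!

sleep 2
# after 2 seconds the pid is expected to belong to a "java" process already;
# if spark-class is still in its bash/launcher phase, the command name is
# still "bash" and the check reports a failure even though nothing is wrong
if [[ $(ps -p "$newpid" -o comm=) != *java* ]]; then
  echo "failed to launch org.apache.spark.deploy.master.Master"
fi
```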
The slaves will complain for the same reason, and thrash around until the master is actually available, but they should keep retrying until they successfully connect to the master.
I have a fairly standard Spark deployment running on EC2; I use:
- `conf/spark-defaults.conf` to set `spark.executor.memory` and add some custom jars via `spark.{driver,executor}.extraClassPath`
- `conf/spark-env.sh` to set `SPARK_WORKER_CORES=$(($(nproc) * 2))`
- `conf/slaves` to list my slaves
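For concreteness, the three files look roughly like this; the memory size, jar path and hostnames below are made-up placeholders, not my actual values:

```
# conf/spark-defaults.conf -- placeholder values
spark.executor.memory            4g
spark.driver.extraClassPath      /opt/myjars/custom.jar
spark.executor.extraClassPath    /opt/myjars/custom.jar

# conf/spark-env.sh -- sourced as bash by bin/load-spark-env.sh
SPARK_WORKER_CORES=$(($(nproc) * 2))

# conf/slaves -- one worker hostname per line
worker-01
worker-02
```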
Here's how I start a Spark deployment, bypassing the `{bin,sbin}/*.sh` minefield/maze:
```bash
# on master, with SPARK_HOME and conf/slaves set appropriately
mapfile -t ARGS < <(java -cp $SPARK_HOME/lib/spark-assembly-1.6.1-hadoop2.6.0.jar org.apache.spark.launcher.Main org.apache.spark.deploy.master.Master | tr '\0' '\n')
```
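`org.apache.spark.launcher.Main` prints the fully assembled `java` command as NUL-separated tokens (hence the `tr '\0' '\n'`), so `ARGS` ends up holding the exact command line that `spark-class` would have `exec`'d. All that's left is to run it detached; a sketch of that step, with an arbitrary log and pid location:

```bash
# launch the assembled command in the background, detached from the shell;
# log and pid paths here are arbitrary
mkdir -p "$SPARK_HOME/logs"
nohup "${ARGS[@]}" >"$SPARK_HOME/logs/master.out" 2>&1 &
echo $! >"$SPARK_HOME/master.pid"   # keep the pid around for a later kill
```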
I still use `sbin/spark-daemon.sh` to start the slaves, since that's easier than calling nohup inside the ssh command:
```bash
MASTER=spark://$(hostname -i):7077
while read -r; do
  ssh -o StrictHostKeyChecking=no $REPLY "$SPARK_HOME/sbin/spark-daemon.sh start org.apache.spark.deploy.worker.Worker 1 $MASTER" &
done <$SPARK_HOME/conf/slaves
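And since the whole point is not to trust spark-daemon.sh's 2-second heuristic, I check afterwards that the JVMs are actually up; the `[M]`/`[W]` brackets are just the usual trick to keep `pgrep -f` from matching its own invocation:

```bash
# on the master: is the Master JVM running?
pgrep -f 'org.apache.spark.deploy.master.[M]aster'

# on each slave: is a Worker JVM running?
while read -r; do
  echo "== $REPLY =="
  ssh -o StrictHostKeyChecking=no "$REPLY" \
      "pgrep -f 'org.apache.spark.deploy.worker.[W]orker' || echo 'no Worker running'"
done <"$SPARK_HOME/conf/slaves"
```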
There you go. It assumes I'm using all the default ports and whatnot, and that I don't do stupid crap like putting spaces in file names, but I think it's cleaner this way.