How to start Spark interactively in cluster mode

I have a Spark cluster running at

    spark://host1:7077
    spark://host2:7077
    spark://host3:7077

and I connect to it with /bin/spark-shell --master spark://host1:7077. When I try to read a file with

    val textFile = sc.textFile("README.md")
    textFile.count()

the shell keeps printing this warning:

WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

Checking the web UI at host1:8080 shows:

    Workers: 0
    Cores: 0 Total, 0 Used
    Memory: 0.0 B Total, 0.0 B Used
    Applications: 0 Running, 2 Completed
    Drivers: 0 Running, 0 Completed
    Status: ALIVE

My question is: how do I specify cores and memory when working in cluster mode with spark-shell? Or do I need to package my Scala code into a .jar file and submit the job to Spark instead?

Thanks.

1 answer

Package your code into a jar and register it in your SparkConf:

    // Register the packaged job jar so the executors can load it
    String[] jars = new String[] { sparkJobJar };
    sparkConf.setMaster("masterip");   // e.g. "spark://host1:7077"
    sparkConf.set("spark.executor.memory", sparkWorkerMemory);
    sparkConf.set("spark.default.parallelism", sparkParallelism);
    sparkConf.setJars(jars);
    JavaSparkContext ctx = new JavaSparkContext(sparkConf);

spark.executor.memory sets the memory available to each executor, and spark.default.parallelism controls how many tasks run in parallel across the cluster.
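
Since the asker writes Scala, here is a minimal sketch of the equivalent setup in Scala; the application name, jar path, and resource values are placeholders, and the master URL is the one from the question:

    import org.apache.spark.{SparkConf, SparkContext}

    object WordCountJob {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf()
          .setAppName("WordCountJob")                 // placeholder application name
          .setMaster("spark://host1:7077")            // master URL from the question
          .set("spark.executor.memory", "2g")         // memory per executor (example value)
          .set("spark.default.parallelism", "8")      // parallel tasks across the cluster (example value)
          .setJars(Seq("target/word-count-job.jar"))  // hypothetical path to the packaged jar

        val sc = new SparkContext(conf)
        val textFile = sc.textFile("README.md")       // same read as in the question
        println(textFile.count())
        sc.stop()
      }
    }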

There is a slaves file in ../spark/conf; list the IPs (or hostnames) of the worker nodes in it.
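
As a sketch, assuming host1 runs the master and host2/host3 run the workers, conf/slaves would simply list the worker hosts, one per line:

    host2
    host3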

Run the master on the master node: /spark/sbin/start-master.sh

Run the workers with /spark/sbin/start-slaves.sh (run from the master node; it launches a worker on every host listed in the slaves file).
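
Putting the steps together, the sequence could look like this (same paths and master URL as above; exact locations depend on where Spark is installed):

    # on host1, assumed to be the master node
    /spark/sbin/start-master.sh
    /spark/sbin/start-slaves.sh    # starts a worker on every host listed in conf/slaves

    # check that http://host1:8080 now shows the workers, then reconnect the shell
    /bin/spark-shell --master spark://host1:7077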
