Spark configuration priority

Is there a difference in priority between setting the Spark application's configuration in code:

SparkConf().setMaster("yarn")

and specifying it on the command line:

spark-submit --master yarn
3 answers

Yes, the highest priority goes to configuration set in user code via SparkConf's set() methods. After that come the flags passed to spark-submit.

Properties set directly on the SparkConf take the highest precedence, then flags passed to spark-submit or spark-shell, then options in the spark-defaults.conf file. A few configuration keys have been renamed since earlier versions of Spark; in such cases, the older key names are still accepted, but take lower precedence than any instance of the newer key.
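
For example, here is a minimal PySpark sketch (the app name and memory values are illustrative, not from the original post) showing that a property set in code wins over a conflicting spark-submit flag:

    from pyspark import SparkConf, SparkContext

    # Suppose the job is launched with a conflicting command-line value, e.g.
    #   spark-submit --conf spark.executor.memory=2g app.py
    conf = SparkConf().setAppName("precedence-demo").set("spark.executor.memory", "4g")
    sc = SparkContext(conf=conf)

    # The value set directly on SparkConf ("4g") takes precedence over the
    # "2g" passed to spark-submit.
    print(sc.getConf().get("spark.executor.memory"))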

Source


Spark picks up configuration from the following sources, in decreasing order of priority (1 being the highest):

  • properties set directly on SparkConf in the code,
  • flags passed to spark-submit or spark-shell,
  • options in the spark-defaults.conf file.
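
One hedged way to check which value actually won for a given key is to dump the resolved configuration at runtime (a sketch; the app name is illustrative):

    from pyspark import SparkConf, SparkContext

    sc = SparkContext(conf=SparkConf().setAppName("config-dump"))

    # getAll() returns the resolved (key, value) pairs after SparkConf values,
    # spark-submit flags and spark-defaults.conf have been merged.
    for key, value in sorted(sc.getConf().getAll()):
        print(key, "=", value)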

In addition to the priority question, specifying the master on the command line lets you run the same application under different cluster managers without changing the code: the same application can run on local[n], on YARN, on Mesos, or on a standalone Spark cluster.
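
A small sketch of that approach (names are illustrative), leaving the master out of the code so it is chosen entirely at submit time:

    from pyspark import SparkConf, SparkContext

    # No setMaster() here, so the same code can be submitted with, e.g.:
    #   spark-submit --master local[4] app.py
    #   spark-submit --master yarn app.py
    #   spark-submit --master spark://host:7077 app.py   (standalone cluster)
    conf = SparkConf().setAppName("portable-app")
    sc = SparkContext(conf=conf)

    # Reflects whatever --master was passed to spark-submit.
    print(sc.master)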

