I find the documentation a little misleading here, and when you work with Scala you actually see a warning like this:

    ... WARN SparkSession$Builder: Using an existing SparkSession; some configuration may not take effect.
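The reason for the warning is that getOrCreate hands back a session that already exists instead of building a new one. On Spark 2.2 or later you can inspect the active and default sessions directly; a minimal check from the spark-shell (the printed values are only indicative):

    import org.apache.spark.sql.SparkSession

    // Session bound to the current thread, if any
    SparkSession.getActiveSession
    Option[org.apache.spark.sql.SparkSession] = Some(org.apache.spark.sql.SparkSession@...)

    // JVM-wide default session used by the builder as a fallback
    SparkSession.getDefaultSession
    Option[org.apache.spark.sql.SparkSession] = Some(org.apache.spark.sql.SparkSession@...)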
This was more obvious before Spark 2.0, when there was a clear separation between the two contexts:

- SparkContext configuration cannot be changed at runtime. You have to stop the existing context first.
- SQLContext configuration can be changed at runtime.

spark.app.name, like many other options, is bound to SparkContext and cannot be changed without stopping the context.
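For reference, a minimal pre-2.0-style sketch of that separation (the app name, master and option values are placeholders, and SQLContext is created with its now-deprecated constructor):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    // SparkContext configuration is fixed once the context is created
    val conf = new SparkConf().setAppName("pre-2.0-example").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // SQLContext configuration can still be adjusted while the context is running
    val sqlContext = new SQLContext(sc)
    sqlContext.setConf("spark.sql.shuffle.partitions", "10")

    // Changing spark.app.name at this point would require sc.stop()
    // followed by building a brand new SparkContext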
Reusing an existing SparkContext / SparkSession
    import org.apache.spark.SparkConf
    import org.apache.spark.sql.SparkSession

    spark.conf.get("spark.sql.shuffle.partitions")
    String = 200

    val conf = new SparkConf()
      .setAppName("foo")
      .set("spark.sql.shuffle.partitions", "2001")

    val spark = SparkSession.builder.config(conf).getOrCreate()
    ... WARN SparkSession$Builder: Using an existing SparkSession ...
    spark: org.apache.spark.sql.SparkSession = ...

    spark.conf.get("spark.sql.shuffle.partitions")
    String = 2001
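On Spark 2.4 or later you can also ask the runtime config directly whether an option is settable on a live session; a quick check (the results shown are what I would expect, not copied from a run):

    // Runtime SQL options can be changed on an existing session ...
    spark.conf.isModifiable("spark.sql.shuffle.partitions")
    Boolean = true

    // ... while options bound to the SparkContext cannot
    spark.conf.isModifiable("spark.app.name")
    Boolean = false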
While the spark.app.name config entry is updated:

    spark.conf.get("spark.app.name")
    String = foo

it does not affect the SparkContext:

    spark.sparkContext.appName
    String = Spark shell
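You can confirm the same from the underlying SparkConf, which is what the running context actually uses (the output assumes the default spark-shell application name):

    spark.sparkContext.getConf.get("spark.app.name")
    String = Spark shell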
Stopping an existing SparkContext / SparkSession
Now stop the session and repeat the process:
    spark.stop
    val spark = SparkSession.builder.config(conf).getOrCreate()
    ... WARN SparkContext: Using an existing SparkContext ...
    spark: org.apache.spark.sql.SparkSession = ...

    spark.sparkContext.appName
    String = foo
Interestingly, even though we stopped the session, we still get a warning about an existing SparkContext, but you can verify that the previous context was in fact stopped.
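One way to verify that is to keep a reference to the original context before stopping it and then look at its isStopped flag; a small sketch (the variable name oldContext is mine):

    // Captured before calling spark.stop
    val oldContext = spark.sparkContext

    spark.stop

    // The previous context reports that it has been shut down
    oldContext.isStopped
    Boolean = true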