PySpark: Add an Executor Environment Variable

Is it possible to add a value to the PYTHONPATH of a worker in Spark?

I know you can log into each worker node, configure spark-env.sh, and do it there, but I want a more flexible approach.

I am trying to use the setExecutorEnv method, but without success:

    conf = SparkConf().setMaster("spark://192.168.10.11:7077") \
        .setAppName("myname") \
        .set("spark.cassandra.connection.host", "192.168.10.11") \
        .setExecutorEnv('PYTHONPATH', '$PYTHONPATH:/custom_dir_that_I_want_to_append/')

It creates a pythonpath environment variable on each executor, lower-cases its name, and does not expand $PYTHONPATH to append my value to the existing one.

I end up with two different environment variables:

    pythonpath : $PYTHONPATH:/custom_dir_that_I_want_to_append
    PYTHONPATH : /old/path/to_python

The first is dynamically created, and the second already existed before.

Does anyone know how to do this?

1 answer

I figured it out myself...

The problem is not in Spark, but in ConfigParser.

Based on this answer, I fixed ConfigParser so that it preserves the case of option names.
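
The answer does not show the config-reading code, but the standard fix is to override ConfigParser's optionxform, which lower-cases option names by default. A minimal sketch, assuming the environment variables are loaded through Python's configparser module (the executor_env section name here is made up for illustration):

    from configparser import ConfigParser

    parser = ConfigParser()
    # By default optionxform is str.lower; overriding it with str
    # keeps 'PYTHONPATH' as-is instead of turning it into 'pythonpath'.
    parser.optionxform = str

    parser.read_string("[executor_env]\nPYTHONPATH = /custom_dir_that_I_want_to_append/\n")
    print(list(parser["executor_env"]))  # ['PYTHONPATH']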

After that, I found that Spark's default behavior is to append the value to an existing worker environment variable when a variable with the same name already exists.

So there is no need to reference the existing PYTHONPATH with a dollar sign:

 .setExecutorEnv('PYTHONPATH', '/custom_dir_that_I_want_to_append/') 
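
As a quick sanity check (not from the original post), you can run a trivial job and read PYTHONPATH inside an executor process, assuming conf is the SparkConf built above:

    import os
    from pyspark import SparkContext

    sc = SparkContext(conf=conf)

    # Each task reads the environment of the executor it runs on.
    paths = sc.parallelize(range(2)).map(lambda _: os.environ.get("PYTHONPATH")).collect()
    print(paths)  # each entry should end with /custom_dir_that_I_want_to_append/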

Source: https://habr.com/ru/post/1214554/
