These options are really scattered everywhere.
In general, add your data files via --files or --archives and your code files via --py-files. The latter are added to the PYTHONPATH (cf. here) so you can import and use them.
As you can imagine, the CLI arguments are actually handled by the functions addFile and addPyFile (cf. here).
Behind the scenes, pyspark invokes the more general spark-submit script.
For Python, you can use the --py-files argument of spark-submit to add .py, .zip or .egg files to be distributed with your application.
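To make the three options concrete, here is a minimal sketch that assembles a spark-submit command line. The helper `build_spark_submit_command` and all file names (`job.py`, `helpers.zip`, `utils.py`, `lookup.csv`, `env.tar.gz`) are hypothetical and only for illustration; the one thing taken from spark-submit itself is that `--py-files`, `--files` and `--archives` each accept a comma-separated list.

```python
def build_spark_submit_command(script, py_files=(), files=(), archives=()):
    """Assemble a spark-submit argument list (hypothetical helper, not part of Spark)."""
    command = ["spark-submit"]
    # Each option takes one comma-separated list of paths.
    options = (("--py-files", py_files), ("--files", files), ("--archives", archives))
    for flag, paths in options:
        if paths:
            command += [flag, ",".join(paths)]
    # Options go before the application script; arguments after it belong to the script.
    command.append(script)
    return command


command = build_spark_submit_command(
    "job.py",                               # hypothetical application script
    py_files=["helpers.zip", "utils.py"],   # code to import inside the job
    files=["lookup.csv"],                   # plain data file
    archives=["env.tar.gz"],                # archive of data or environment files
)
print(" ".join(command))
```

Building the command as a list rather than a single string means it could be handed to `subprocess.run` unchanged, without shell quoting concerns.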
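The --files and --archives options support specifying file names with the #, similar to Hadoop. For example, you can specify: --files localtest.txt#appSees.txt, and this will upload the file you have locally named localtest.txt into HDFS, but it will be linked to by the name appSees.txt, and your application should use the name appSees.txt to reference it when running on YARN.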
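The `#` naming convention can be illustrated with a few lines of plain Python. This is only a sketch of the naming rule, not Spark's own parsing code; `split_alias` is a hypothetical helper, and the fallback to the file's base name when no `#` is given is an assumption made for this example.

```python
def split_alias(spec):
    """Split 'source#alias' into (source_path, name_seen_by_the_app).

    Hypothetical helper that mirrors the naming rule only.
    """
    source, separator, alias = spec.partition("#")
    # Assumption: without an alias, the application sees the file's base name.
    if not separator or not alias:
        alias = source.rsplit("/", 1)[-1]
    return source, alias


source, alias = split_alias("localtest.txt#appSees.txt")
print(source, "is referenced by the application as", alias)
```

Inside the job, the application would then open the file under the alias (here `appSees.txt`) rather than under its original local name.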
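addFile(path): Add a file to be downloaded with this Spark job on every node. The path passed can be either a local file, a file in HDFS (or other Hadoop-supported filesystems), or an HTTP, HTTPS or FTP URI.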
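addPyFile(path): Add a .py or .zip dependency for all tasks to be executed on this SparkContext in the future. The path passed can be either a local file, a file in HDFS (or other Hadoop-supported filesystems), or an HTTP, HTTPS or FTP URI.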
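The same thing can be done programmatically from a running application. Below is a minimal sketch: `ship_dependencies` is a hypothetical wrapper, and `sc` is assumed to be an existing SparkContext; the only Spark calls it relies on are `addFile` and `addPyFile` as described above.

```python
def ship_dependencies(sc, data_files=(), py_files=()):
    """Register files with an existing SparkContext (hypothetical helper).

    `sc` is assumed to expose addFile(path) and addPyFile(path).
    """
    for path in data_files:
        # Downloaded to every node together with the job.
        sc.addFile(path)
    for path in py_files:
        # .py or .zip dependency, importable by tasks run on this context afterwards.
        sc.addPyFile(path)
```

With a real context, this might be called as `ship_dependencies(sc, ["lookup.csv"], ["helpers.zip"])` (file names are placeholders). Tasks can then resolve the local copy of a data file with `SparkFiles.get("lookup.csv")` from the `pyspark` package.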