How to pack a Python script with its dependencies into a zip / tar?

I have a Hadoop cluster that I use to analyze data with NumPy, SciPy and Pandas. I want to ship my Hadoop jobs as a zip / tar file passed via the '-file' argument of the streaming command. That archive should contain EVERYTHING my Python program needs to execute, so that no matter which node in the cluster my script runs on, I will not hit an ImportError at runtime.
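For context, a minimal sketch of the kind of mapper I mean (the file name mapper.py and the tab-separated key/value layout are just illustrative assumptions):

    #!/usr/bin/env python
    # mapper.py -- illustrative Hadoop streaming mapper; pandas has to be
    # importable on whatever node the task gets scheduled on.
    import sys

    import pandas as pd


    def main():
        # Read tab-separated "key<TAB>value" records from stdin,
        # sum the values per key, and emit "key<TAB>total" lines.
        df = pd.read_csv(sys.stdin, sep="\t", header=None, names=["key", "value"])
        for key, total in df.groupby("key")["value"].sum().items():
            sys.stdout.write("%s\t%s\n" % (key, total))


    if __name__ == "__main__":
        main()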

Due to company policy, installing these libraries on every node is not really feasible, especially for research / agile development. I do have pip and virtualenv available to create a sandbox as needed.
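What I have in mind is roughly the following build step, run on my own machine inside the virtualenv (a sketch only; the names deps and deps.tar.gz are placeholders, and the built binaries would of course have to match the nodes' platform):

    #!/usr/bin/env python
    # build_bundle.py -- sketch: install the dependencies into a local
    # directory with pip --target, then archive it for shipping with -file.
    import shutil
    import subprocess
    import sys

    REQUIREMENTS = ["numpy", "scipy", "pandas"]


    def main():
        # Install into ./deps instead of the global site-packages.
        subprocess.check_call(
            [sys.executable, "-m", "pip", "install", "--target", "deps"]
            + REQUIREMENTS
        )
        # Produces deps.tar.gz containing the deps/ directory.
        shutil.make_archive("deps", "gztar", root_dir=".", base_dir="deps")


    if __name__ == "__main__":
        main()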

I looked through zipimport and Python packaging, but neither fits my needs / these tools are hard for me to use.
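To be concrete, the zipimport route I looked at works roughly like this, but only for pure-Python code, which is part of why it does not fit (deps.zip and the module name are placeholders):

    # Pure-Python packages can be imported straight from a zip on sys.path;
    # C-extension modules (numpy, scipy and pandas ship .so files) cannot.
    import os
    import sys

    sys.path.insert(0, os.path.join(os.getcwd(), "deps.zip"))

    import some_pure_python_module   # hypothetical module inside deps.zip
    # import numpy                   # would still fail from inside a zip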

Has anyone managed to do this? I can't seem to find any success stories on the Internet.

Thanks!

1 answer

I solved a similar problem, in the context of Apache Spark and Python, by building a Docker image with the Python libraries and the Spark worker scripts installed in it. The image is distributed to the other machines, and when a container starts it automatically joins the cluster; we run just one such image per host machine.

As for packaging everything into a python zip: I have not tried it myself, and it only goes so far, because libraries with compiled extensions (numpy, scipy, pandas) cannot be imported directly from a zip archive.

In that case you would have to unpack the archive on the node and add the unpacked directory to the python path before importing anything.
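A rough sketch of that unpack-and-add-to-path step at the top of the job script (the archive name deps.tar.gz and the directory deps are assumptions, matching whatever was shipped with -file):

    import os
    import sys
    import tarfile

    BUNDLE = "deps.tar.gz"   # shipped next to the script via -file
    TARGET = "deps"

    # Unpack the bundled libraries once, then put them on the python path
    # before importing numpy / scipy / pandas.
    if os.path.exists(BUNDLE) and not os.path.isdir(TARGET):
        with tarfile.open(BUNDLE) as tar:
            tar.extractall(".")

    sys.path.insert(0, os.path.abspath(TARGET))

    import pandas as pd   # now resolved from ./deps on any node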
