I'm spark-submitting a Python file that imports numpy, but I'm getting a "No module named numpy" error.
$ spark-submit --py-files projects/other_requirements.egg projects/jobs/my_numpy_als.py
Traceback (most recent call last):
  File "/usr/local/www/my_numpy_als.py", line 13, in <module>
    from pyspark.mllib.recommendation import ALS
  File "/usr/lib/spark/python/pyspark/mllib/__init__.py", line 24, in <module>
    import numpy
ImportError: No module named numpy
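For reference, here is a minimal diagnostic sketch of the kind of check one could run with the same spark-submit command to see whether numpy is importable on the driver versus inside an executor task (the app name "numpy-check" and the single-partition RDD are just placeholders, not anything from my actual job):

from __future__ import print_function
import sys

def check_numpy(_):
    # Try to import numpy in whatever interpreter this code runs in,
    # and report that interpreter's path plus the result.
    try:
        import numpy
        return [(sys.executable, "numpy %s" % numpy.__version__)]
    except ImportError as exc:
        return [(sys.executable, str(exc))]

# Check on the driver process first.
print("driver:", check_numpy(None))

# Then run the same check inside a Spark task on an executor.
from pyspark import SparkContext
sc = SparkContext(appName="numpy-check")
print("executor:", sc.parallelize([0], numSlices=1).flatMap(check_numpy).collect())
sc.stop()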
I thought I would build an egg for my requirements files, but it's hard for me to figure out how to build this egg. But then it occurred to me that pyspark itself uses numpy, and it would be foolish to pull in my own version of numpy.
Any idea on what needs to be done here?