Failed to install pandas on AWS Lambda

I am trying to install and run pandas on AWS Lambda. I packaged my code file model_a.py together with the associated Python libraries ( pip install pandas -t /path/to/dir/ ) and uploaded the zip to Lambda. When I run the test, this is the error message I get:

Unable to import module 'model_a': C extension: /var/task/pandas/hashtable.so: undefined symbol: PyFPE_jbuf not built. If you want to import pandas from the source directory, you may need to run 'python setup.py build_ext --inplace' to build the C extensions first.

It looks like an error in a variable defined in hashtable.so, which ships with the pandas installer. Googling for this did not turn up any relevant articles. There were some references to it in the context of numpy failing to install, but nothing concrete. I would appreciate any help in resolving this issue. Thanks.

+6
3 answers

Since the library you are using depends on native libraries, you must also package those native .so files in a layer. I encountered a similar problem when trying to run wkhtmltopdf on AWS Lambda.

The binaries for the library must be compiled in the same environment as the Lambda instance. Lambda runs on Amazon Linux.

You can launch an EC2 instance running Amazon Linux or use Docker. The easiest way is to start a Docker container.

 $ sudo docker run -it amazonlinux bash 

Now you need to download or extract all the .so files into a directory and then compress it. Also make sure all the .so files are stored in a folder named lib inside the zip. The archive contents should look something like this:

 .
 ├── lib
 │   ├── libcrypto.so.10
 │   ├── libcrypto.so.1.0.2k
 │   ├── libfontconfig.so.1
 │   ├── libfontconfig.so.1.7.0
 .......

Then you can simply compress it and upload it as a layer. It will be extracted to /opt/ in your Lambda container. AWS looks for library files in /opt/lib, among many other places.
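The packaging step above could be scripted roughly as follows. This is only a sketch: the function name build_layer_zip and the idea of matching any file name containing ".so" are my own assumptions, not part of any AWS tooling. It copies each shared library from a directory into a zip under lib/.

```python
import zipfile
from pathlib import Path


def build_layer_zip(so_dir, zip_path):
    """Pack every shared library found in so_dir under lib/ in a zip archive.

    Hypothetical helper for illustration; returns the archive member names.
    """
    so_dir = Path(so_dir)
    names = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(so_dir.iterdir()):
            # Match libfoo.so as well as versioned names such as libfoo.so.1.0.2k
            if path.is_file() and ".so" in path.name:
                arcname = f"lib/{path.name}"
                zf.write(path, arcname)
                names.append(arcname)
    return names
```

Calling build_layer_zip("/path/to/so_files", "layer.zip") would then give an archive that can be uploaded as a layer. Symlinked libraries are stored as full copies, which is usually fine for a layer but makes the zip larger.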

The challenge is for you to figure out how to get all the necessary .so files so that your dependency works correctly.

0

I have successfully run pandas code on Lambda. If your development environment is not binary-compatible with the Lambda environment, you cannot just run pip install pandas -t /some/dir and pack the result into the Lambda .zip file. Even if you develop on Linux, you may still run into compatibility issues.
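To see whether a package directory is affected at all, you can list the compiled extension modules it contains. A minimal sketch, where find_native_extensions is a hypothetical helper name and the check is nothing more than a file-suffix scan:

```python
from pathlib import Path


def find_native_extensions(package_dir):
    """Return relative paths of compiled extension files under package_dir."""
    root = Path(package_dir)
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        # .so is the Linux/macOS extension suffix, .pyd the Windows one
        if p.is_file() and p.suffix in {".so", ".pyd"}
    )
```

If this returns anything for the directory you passed to pip install -t (for pandas it will, for example the hashtable.so from the error above), those files were built for the machine that ran pip and need to match the Lambda environment.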

So how do you deal with this? The solution is actually quite simple: run pip install inside a Lambda container and use the pandas module that it downloads or builds there instead. When I did this, I had a build script that spun up a lambci/lambda container instance on my local system (a Docker clone of the AWS Lambda container), mounted my local build folder at /build, and ran pip install pandas -t /build/ . After that, kill the container, and a Lambda-compatible pandas module will be in your local build folder, ready for archiving and uploading to AWS along with the rest of your code.
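I no longer have the original build script, but the idea can be sketched like this. The image tag lambci/lambda:build-python3.8 and both function names are assumptions for illustration; pick the tag that matches your Lambda runtime.

```python
import subprocess
from pathlib import Path


def docker_pip_command(build_dir, packages, image="lambci/lambda:build-python3.8"):
    """Build the docker command that pip-installs packages into /build.

    The default image tag is an assumption; adjust it to your runtime.
    """
    build_dir = Path(build_dir).resolve()
    return [
        "docker", "run", "--rm",          # --rm removes the container afterwards
        "-v", f"{build_dir}:/build",      # mount the local build folder at /build
        image,
        "pip", "install", *packages, "-t", "/build/",
    ]


def install_into_build_dir(build_dir, packages):
    """Run the install in the container (requires Docker to be available)."""
    Path(build_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(docker_pip_command(build_dir, packages), check=True)
```

For example, install_into_build_dir("build", ["pandas"]) should leave the container-built pandas in ./build. Because of --rm there is no separate step to kill the container.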

You can do this for an arbitrary set of Python modules using a requirements.txt file, and you can even do it for arbitrary Python versions by first creating a virtual environment in the lambci container. I have not needed to do this for a couple of years, so maybe there are better tools now, but this approach should at least be functional.

0

In AWS Lambda, you can only use pure-python libraries.

-1
