Portable / fast way to get a pointer to Numpy / Numpypy data

I recently tried PyPy and was intrigued by its approach. I have many C extensions for Python that use PyArray_DATA() to get a pointer to the data section of NumPy arrays. Unfortunately, PyPy does not export an equivalent for its numpypy arrays in its cpyext module, so I followed the recommendation on their website to use ctypes. This pushes the task of getting a pointer up to the Python level.

There are two ways:

    import ctypes as C
    p_t = C.POINTER(C.c_double)

    def get_ptr_ctypes(x):
        return x.ctypes.data_as(p_t)

    def get_ptr_array(x):
        return C.cast(x.__array_interface__['data'][0], p_t)

Only the second one works on PyPy, so the choice is clear for compatibility. For CPython, both are slow as hell and are the complete bottleneck for my application! Is there a quick and portable way to get this pointer? Or is there the equivalent of PyArray_DATA() for PyPy (possibly undocumented)?

+8
python api numpy pypy ctypes
2 answers

I still have not found a completely satisfactory solution, but there is something you can do to get the pointer with much less overhead in CPython. First, the reason both of the ways mentioned above are so slow is that .ctypes and .__array_interface__ are both attributes computed on demand, by array_ctypes_get() and array_interface_get() in numpy/numpy/core/src/multiarray/getset.c. The first imports ctypes and creates an instance of numpy.core._internal._ctypes, and the second creates a new dictionary and populates it with a lot of unnecessary entries in addition to the data pointer.
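To make the on-demand behaviour concrete, here is a minimal sketch, assuming CPython with NumPy installed (it has not been checked against Numpypy). It only shows that both routes report the same address and that every access to .__array_interface__ builds a new dictionary. The variable names are illustrative.

```python
import numpy as np

x = np.arange(4, dtype=np.float64)

# Both routes report the same address of the first data byte.
addr_ctypes = x.ctypes.data
addr_iface = x.__array_interface__['data'][0]

# Each attribute access builds a fresh dictionary rather than
# returning a cached one, which is part of the per-call cost.
iface_a = x.__array_interface__
iface_b = x.__array_interface__
fresh_dict_each_time = iface_a is not iface_b

# The dictionary carries more than the pointer (shape, typestr, ...).
extra_keys = sorted(k for k in iface_a if k != 'data')
```

Because nothing is cached, a wrapper that reads the pointer on every call pays this construction cost every time.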

At the Python level, nothing can be done about this, but at the C level you can write a micro-module that bypasses most of the overhead:

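    #include <Python.h>
    #include <numpy/arrayobject.h>

    /* Return the address of the array's data buffer as a Python integer. */
    static PyObject *_get_ptr(PyObject *self, PyObject *obj)
    {
        /* No type check: the caller must pass a NumPy array. */
        return PyLong_FromVoidPtr(PyArray_DATA((PyArrayObject *)obj));
    }

    static PyMethodDef methods[] = {
        {"_get_ptr", _get_ptr, METH_O, "Wrapper to PyArray_DATA()"},
        {NULL, NULL, 0, NULL}
    };

    /* Python 2 module initialization. */
    PyMODINIT_FUNC initaccel(void)
    {
        Py_InitModule("accel", methods);
    }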

Compile the extension in setup.py as usual and import as

    try:
        from accel import _get_ptr

        def get_ptr(x):
            return C.cast(_get_ptr(x), p_t)
    except ImportError:
        get_ptr = get_ptr_array

On PyPy, from accel import _get_ptr will fail and get_ptr will fall back to get_ptr_array, which works with Numpypy.

In terms of performance, for lightweight C function calls, ctypes + accel._get_ptr() is still noticeably slower than a native CPython extension, which has essentially no overhead. It is, of course, much faster than get_ptr_ctypes() and get_ptr_array() above, so the overhead may become insignificant for medium-weight C function calls.

This way one gains compatibility with PyPy, although I have to say that, having spent quite a bit of time evaluating PyPy for my scientific computing applications, I do not see a future for it as long as they (rather stubbornly) refuse to support the full CPython API.

Update

I found that ctypes.cast() became the bottleneck after introducing accel._get_ptr(). You can get rid of the cast by declaring all pointers in the interface as ctypes.c_void_p. This is what I ended up with:

    def get_ptr_ctypes2(x):
        return x.ctypes._data

    def get_ptr_array(x):
        return x.__array_interface__['data'][0]

    try:
        from accel import _get_ptr as get_ptr
    except ImportError:
        get_ptr = get_ptr_array

Here get_ptr_ctypes2() avoids the cast by directly accessing the hidden attribute ndarray.ctypes._data. Below are some timing results for calling heavy and lightweight C functions from Python:

                                   heavy C (few calls)   light C (many calls)
    ctypes + get_ptr_ctypes():     0.71 s                15.40 s
    ctypes + get_ptr_ctypes2():    0.68 s                13.30 s
    ctypes + get_ptr_array():      0.65 s                11.50 s
    ctypes + accel._get_ptr():     0.63 s                 9.47 s
    native CPython:                0.62 s                 8.54 s
    Cython (no decorators):        0.64 s                 9.96 s

So, with accel._get_ptr() and no ctypes.cast() calls, the speed of ctypes is really competitive with a native CPython extension. So I just need to wait for someone to rewrite h5py, matplotlib and scipy with ctypes to try PyPy for something serious ...
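As a rough illustration of the "no cast" idea, here is a small standard-library sketch. It is not the code used for the timings above. It assumes CPython, and it uses array.array and ctypes.memmove as stand-ins for a NumPy array and for a user-written C function whose pointer parameters are declared as c_void_p. The point is only that a plain integer address can be passed directly.

```python
import array
import ctypes

src = array.array('d', [1.0, 2.0, 3.0])
dst = array.array('d', [0.0, 0.0, 0.0])

# buffer_info() returns (address, number of items); the address is a
# plain Python int, just like x.__array_interface__['data'][0].
src_addr, n_items = src.buffer_info()
dst_addr, _ = dst.buffer_info()

# ctypes.memmove accepts integer addresses for its pointer arguments,
# so no ctypes.cast() to a typed pointer is needed before the call.
ctypes.memmove(dst_addr, src_addr, n_items * src.itemsize)
```

The trade-off is that c_void_p parameters give up ctypes' argument type checking, so the C side has to trust that the address really points to doubles.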

+4

This may not be a full answer, but hopefully it is a good hint. I use scipy.weave.inline() in some parts of my code. I do not know much about the speed of the interface itself, because the function I call is quite heavy and relies on only a few pointers/arrays, but it seems fast to me. Perhaps you can get inspiration from the scipy.weave code, especially from attempt_function_call:

https://github.com/scipy/scipy/blob/master/scipy/weave/inline_tools.py#L390

If you want to see the C++ code generated by scipy.weave,

0