Fast Conversion of a C/C++ Vector to a Numpy Array

I use SWIG to glue some C++ code to Python (2.6), and part of this glue includes code that converts large fields of data (millions of values) from the C++ side to a Numpy array. The best method I could come up with is to implement an iterator on the class and then provide a Python method:

def __array__(self, dtype=float):
    # np.fromiter pulls every element through the SWIG iterator
    return np.fromiter(self, dtype, self.size())
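
For context, __array__ is the hook that np.array and np.asarray consult when converting an unfamiliar object. A minimal self-contained sketch, with a plain Python class standing in for the SWIG proxy (FakeVector is purely illustrative):

import numpy as np

class FakeVector(object):
    """Stand-in for a SWIG-wrapped std::vector (illustration only)."""
    def __init__(self, values):
        self._values = list(values)
    def size(self):
        return len(self._values)
    def __iter__(self):
        return iter(self._values)
    def __array__(self, dtype=float):
        # One Python-level call per element -- the slow path
        return np.fromiter(self, dtype, self.size())

a = np.asarray(FakeVector([1.0, 2.0, 3.0]))  # triggers __array__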

The problem is that every call to the iterator's next is very expensive, since it has to go through three or four SWIG wrappers, which makes the conversion far too slow. I can guarantee that the C++ data is stored contiguously (it lives in a std::vector), and it feels like Numpy should be able to take a pointer to the beginning of that data along with the number of values it contains and read it directly.

Is there a way to pass a pointer to internal_data_[0] and the value of internal_data_.size() to numpy so that it can read or copy the data directly, without all the Python overhead?
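
In spirit, what I am after is something like the sketch below, assuming the wrapper could somehow expose the raw address and element count to Python (the vector_to_array helper and the integer address are hypothetical):

import ctypes
import numpy as np

def vector_to_array(addr, n):
    # addr: integer address of internal_data_[0]; n: element count
    ptr = ctypes.cast(addr, ctypes.POINTER(ctypes.c_double))
    # Zero-copy view of the contiguous C++ storage; the vector must
    # outlive the array and must not reallocate while it is in use.
    return np.ctypeslib.as_array(ptr, shape=(n,))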

+5
4 answers

So it seems like the only real solution is to build something based on pybuffer.i that can copy from C++ into an existing buffer. If you add this to a SWIG include file:

%insert("python") %{
import numpy as np
%}

/*! Templated function to copy contents of a container to an allocated memory
 * buffer
 */
%inline %{
//==== ADDED BY numpy.i
#include <algorithm>

template < typename Container_T >
void copy_to_buffer(
        const Container_T& field,
        typename Container_T::value_type* buffer,
        typename Container_T::size_type length
        )
{
//    ValidateUserInput( length == field.size(),
//            "Destination buffer is the wrong size" );
    // put your own assertion here or BAD THINGS CAN HAPPEN

    if (length == field.size()) {
        std::copy( field.begin(), field.end(), buffer );
    }
}
//====

%}

%define TYPEMAP_COPY_TO_BUFFER(CLASS...)
%typemap(in) (CLASS::value_type* buffer, CLASS::size_type length)
(int res = 0, Py_ssize_t size_ = 0, void *buffer_ = 0) {

    res = PyObject_AsWriteBuffer($input, &buffer_, &size_);
    if ( res < 0 ) {
        PyErr_Clear();
        %argument_fail(res, "(CLASS::value_type*, CLASS::size_type length)",
                $symname, $argnum);
    }
    $1 = ($1_ltype) buffer_;
    $2 = ($2_ltype) (size_/sizeof($*1_type));
}
%enddef


%define ADD_NUMPY_ARRAY_INTERFACE(PYVALUE, PYCLASS, CLASS...)

TYPEMAP_COPY_TO_BUFFER(CLASS)

%template(_copy_to_buffer_ ## PYCLASS) copy_to_buffer< CLASS >;

%extend CLASS {
%insert("python") %{
def __array__(self):
    """Enable access to this data as a numpy array"""
    a = np.ndarray( shape=( len(self), ), dtype=PYVALUE )
    _copy_to_buffer_ ## PYCLASS(self, a)
    return a
%}
}

%enddef

then you can make the container "Numpy"-able with:

%template(DumbVectorFloat) DumbVector<double>;
ADD_NUMPY_ARRAY_INTERFACE(float, DumbVectorFloat, DumbVector<double>);

Then in Python just do:

# dvf is an instance of DumbVectorFloat
import numpy as np
my_numpy_array = np.asarray( dvf )

This has only the overhead of a single Python ↔ C++ transition, rather than the N transitions you would incur iterating over a typical length-N array.
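
If you want to verify the saving, one way is to time the two paths against each other; a rough sketch, assuming dvf is a wrapped container as above that also supports iteration:

import timeit
import numpy as np

# Buffer-copy path: one C++ transition via the generated __array__
t_fast = timeit.timeit(lambda: np.asarray(dvf), number=100)

# Per-element path: N transitions through the SWIG iterator
t_slow = timeit.timeit(lambda: np.fromiter(dvf, float, len(dvf)),
                       number=100)

print(t_fast)
print(t_slow)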

This code is available as part of PyTRT on github.

0

You could look at f2py instead of swig. Despite its name, it can interface Python with C as well as Fortran. See http://www.scipy.org/Cookbook/f2py_and_NumPy

The advantage is that it handles the conversion to numpy arrays automatically.

Two caveats: if you do not already know Fortran, f2py can take some getting used to; and I do not know how well it supports C++.

+1

If you wrap your vector in an object that implements Python's buffer interface, you can pass that to a numpy array for initialization (see the docs, third argument). I would argue that this initialization is much faster, since it can simply use memcpy to copy the data.
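
A minimal sketch of that idea, using a bytearray as a stand-in for any object that implements the buffer protocol:

import numpy as np

# bytearray stands in for a wrapper exposing the vector's storage
raw = bytearray(3 * 8)  # room for three doubles
a = np.ndarray(shape=(3,), dtype=np.float64, buffer=raw)
a[:] = [1.0, 2.0, 3.0]  # writes land directly in raw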

0
