Cython: create memory without a NumPy array?

Since I find memoryviews convenient and fast, I try to avoid creating NumPy arrays in Cython and instead work with views of existing arrays. Sometimes, however, this cannot be avoided, because I don't want to modify an existing array but create a new one. In the higher-level functions this is not noticeable, but in frequently called subroutines it is. Consider the following function:

    #@cython.profile(False)
    @cython.boundscheck(False)
    @cython.wraparound(False)
    @cython.nonecheck(False)
    cdef double [:] vec_eq(double [:] v1, int [:] v2, int cond):
        ''' Function output corresponds to v1[v2 == cond] '''
        cdef unsigned int n = v1.shape[0]
        cdef unsigned int n_ = 0  # Size of array to create
        cdef size_t i
        for i in range(n):
            if v2[i] == cond:
                n_ += 1
        # Create array for selection
        cdef double [:] s = np.empty(n_, dtype=np_float)  # Slow line
        # Copy selection to new array
        n_ = 0
        for i in range(n):
            if v2[i] == cond:
                s[n_] = v1[i]
                n_ += 1
        return s

Profiling tells me there is some speed to be gained here: the `np.empty` call is the slow line.
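For reference, here is a plain-NumPy sketch of what `vec_eq` computes (boolean-mask selection); the sample data is illustrative, only the names `v1`, `v2`, `cond` come from the question:

```python
import numpy as np

def vec_eq_py(v1, v2, cond):
    """Pure-NumPy equivalent of the Cython vec_eq: selects v1[v2 == cond]."""
    return v1[v2 == cond]

# Hypothetical sample data mirroring the question's v1 / v2
v1 = np.array([1.0, 2.0, 3.0, 4.0])
v2 = np.array([0, 1, 0, 1], dtype=np.int32)

print(vec_eq_py(v1, v2, 1))  # elements of v1 where v2 == 1
```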

I could adapt the function, because sometimes, for example, the mean of this vector is computed, sometimes the sum. So I could rewrite it to sum or average in place. But is there no way to create a memoryview directly, with minimal overhead, while determining its size dynamically? Something like this: first create a buffer with `malloc` etc., and at the end of the function convert the buffer to a view by handing over the pointer and the strides, or so.

Edit 1: Maybe for simple cases, adapting the function as follows is an acceptable approach. I just added an option argument and do the summing / averaging in place. This way I don't need to create an array at all, and the `malloc` is easy to handle. It won't get any faster than this, will it?

    # ...
    from cython.parallel import prange
    from libc.stdlib cimport malloc, free, abort

    cdef double vec_eq(double [:] v1, int [:] v2, int cond, int opt=0):
        # additional option argument
        ''' Function output corresponds to v1[v2 == cond].sum() / .mean() '''
        cdef unsigned int n = v1.shape[0]
        cdef int n_ = 0  # Size of array to create
        cdef Py_ssize_t i
        for i in prange(n, nogil=True):
            if v2[i] == cond:
                n_ += 1
        # Create buffer for selection
        cdef double s = 0
        cdef double *v3 = <double *> malloc(sizeof(double) * n_)
        if v3 == NULL:
            abort()
        # Copy selection into the buffer
        n_ = 0
        for i in range(n):
            if v2[i] == cond:
                v3[n_] = v1[i]
                n_ += 1
        # Do further computation here, according to option
        if opt == 0:
            # Option 0: the sum
            for i in prange(n_, nogil=True):
                s += v3[i]
            free(v3)
            return s
        else:
            # Option 1: the mean
            for i in prange(n_, nogil=True):
                s += v3[i]
            free(v3)
            return s / n_
    # Since in the end there is always only a single double value,
    # the memory can be freed right inside the function
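Note that for the sum/mean case the intermediate buffer is not needed at all: the reduction can be done in a single pass over the mask. A pure-Python sketch of that idea (the `opt` convention follows the code above; the function name is illustrative):

```python
def vec_eq_reduce(v1, v2, cond, opt=0):
    """Single-pass sum (opt=0) or mean (opt=1) of v1 where v2 == cond,
    with no intermediate buffer allocated at all."""
    s = 0.0
    n_ = 0
    for x, flag in zip(v1, v2):
        if flag == cond:
            s += x
            n_ += 1
    if opt == 0:
        return s
    # Guard against an empty selection when averaging
    return s / n_ if n_ else 0.0
```

In the Cython version the same fusion removes both the `np.empty` call and the `malloc`/`free` pair.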
3 answers

I didn't know how to deal with cpython arrays, so I finally solved this with a do-it-yourself "memoryview" as suggested by fabrizioM. I wouldn't have thought this would work. Creating a new np.array in a tight loop is quite expensive, so this gave me a significant speedup. Since I only need a 1-dimensional array, I didn't even have to worry about strides. But I think this could work well even for higher-dimensional arrays.

    cdef class Vector:
        cdef double *data
        cdef public int n_ax0
        def __cinit__(Vector self, int n_ax0):
            # __cinit__ (rather than __init__) guarantees the pointer is
            # set before __dealloc__ can ever run
            self.data = <double *> malloc(sizeof(double) * n_ax0)
            self.n_ax0 = n_ax0
        def __dealloc__(Vector self):
            free(self.data)

    ...

    #@cython.profile(False)
    @cython.boundscheck(False)
    cdef Vector my_vec_func(double [:, ::1] a, int [:] v, int cond, int opt):
        # function returning a Vector, whose buffer is freed by __dealloc__
        # once the Vector is no longer referenced (or explicitly via del)
        cdef int vecsize
        cdef size_t i
        # defs..
        # more stuff...
        vecsize = n
        cdef Vector vec = Vector(vecsize)   # named vec: 'v' would shadow the argument
        for i in range(vecsize):
            # computation
            vec.data[i] = ...
        return vec

    ...

    vec = my_vec_func(...)
    ptr_to_data = vec.data
    length_of_vec = vec.n_ax0
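The ownership pattern itself can be sketched in pure Python with `ctypes`: the wrapper object holds the raw buffer, and the buffer goes away together with the object, which is exactly what `__dealloc__` does with `free()` in the Cython class above. All names here are illustrative:

```python
import ctypes

class Vector:
    """Pure-Python sketch of the owning wrapper: the object owns the
    buffer; the buffer is released together with the object (ctypes
    manages the memory, so no explicit free() is needed here)."""
    def __init__(self, n_ax0):
        self.n_ax0 = n_ax0
        self.data = (ctypes.c_double * n_ax0)()  # zero-initialized C double array

    def __getitem__(self, i):
        return self.data[i]

    def __setitem__(self, i, value):
        self.data[i] = value

def fill_squares(n):
    # Illustrative producer: fills a Vector instead of creating a NumPy array
    v = Vector(n)
    for i in range(n):
        v[i] = float(i * i)
    return v

vec = fill_squares(4)
print([vec[i] for i in range(vec.n_ax0)])  # [0.0, 1.0, 4.0, 9.0]
```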

The following thread on the Cython mailing list is likely to be of interest to you:

https://groups.google.com/forum/#!topic/cython-users/CwtU_jYADgM

It seems there are some decent options if you are okay with the memory returned by your function being freed at some other level, where performance is not such a big concern.


From http://docs.cython.org/src/userguide/memoryviews.html it follows that memory for Cython memoryviews can be allocated via:

    cimport cython

    cdef type [:] cview = cython.view.array(shape=(size,),
                                            itemsize=sizeof(type),
                                            format="type",
                                            allocate_buffer=True)

or

    from libc.stdlib cimport malloc, free

    cdef type [:] cview = <type[:size]> malloc(sizeof(type) * size)

Both approaches work, but in the first case there are problems if you use your own type (`ctypedef`'d mytype), because there is no suitable format string for it. In the second case, the problem is freeing the memory afterwards.
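The `format` string uses the buffer-protocol format characters (the same ones as Python's `struct` module), so for a `ctypedef`'d alias one would pass the format of the underlying C type. A quick Python check of the common characters (`'d'` for double, `'i'` for int; sizes assume a typical modern platform):

```python
import struct

# Buffer-protocol / struct-module format characters for common C types:
#   'd' -> C double, 'i' -> C int, 'f' -> C float
print(struct.calcsize('d'))  # size of a C double in bytes
print(struct.calcsize('i'))  # size of a C int in bytes
```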

According to the manual, it should work as follows:

     cview.callback_free_data = free

which binds a function that frees the memory backing the view. However, `callback_free_data` is an attribute of `cython.view.array` objects, not of plain typed memoryviews, so assigning it on the memoryview does not compile.

