Cheapest way to get a numpy array in C-contiguous order?

The following creates a C-contiguous numpy array:

 import numpy
 a = numpy.ones((1024,1024,5))

Now, if I slice it, the result may no longer be contiguous. For example:

 bn = a[:, :, n] 

with n from 0 to 4. My problem is that I need bn to be C-contiguous, and I need to do this for many instances of a. I only need each bn once, and I want to avoid doing

 bn = bn.copy(order='C') 

I also don't want to rewrite my code so that the array is laid out as

 a = numpy.ones((5,1024,1024)) 

Is there a faster and cheaper way to get bn than to make a copy?

Background:

I want to hash each slice of each a using

 import hashlib
 hashlib.sha1(a[:, :, n]).hexdigest()

Unfortunately, this raises a ValueError complaining about the memory order. So if there is another fast way to get the hash I want, I would use that too.
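A minimal sketch of the situation described above. The small shape (4, 3, 5) is an assumption used only to keep the example quick; the question uses (1024, 1024, 5). Slicing the last axis of a C-ordered array gives a strided view that is neither C- nor F-contiguous:

```python
import numpy as np

# Small stand-in for the (1024, 1024, 5) array from the question.
a = np.ones((4, 3, 5))

# Slicing the last axis returns a view; no data is copied.
bn = a[:, :, 0]

print(a.flags['C_CONTIGUOUS'])   # True
print(bn.flags['C_CONTIGUOUS'])  # False
print(bn.flags['F_CONTIGUOUS'])  # False
print(bn.strides)                # (120, 40): neighbours are 5 float64 items apart
print(np.shares_memory(a, bn))   # True
```

The strides show why: consecutive elements of bn are 40 bytes apart in memory instead of 8, so there is no contiguous buffer to hand to hashlib.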

python arrays numpy
3 answers

As things stand, any attempt to coerce the slice bn to C-contiguous order will create a copy.

If you don't want to change the shapes you start with (and don't need a itself to be in C order), one possible solution is to create the array a in Fortran order:

 >>> a = numpy.ones((1024, 1024, 5), order='f') 

The slices are then F-contiguous as well:

 >>> bn = a[:, :, 0]
 >>> bn.flags
   C_CONTIGUOUS : False
   F_CONTIGUOUS : True
   OWNDATA : False
   ...

This means that the transpose of the slice bn is in C order, and transposing does not create a copy:

 >>> bn.T.flags
   C_CONTIGUOUS : True
   F_CONTIGUOUS : False
   OWNDATA : False
   ...

And you can hash the slice:

 >>> hashlib.sha1(bn.T).hexdigest()
 '01dfa447dafe16b9a2972ce05c79410e6a96840e'
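One caveat that may be worth checking, sketched here with small made-up data (the shape and the arange values are assumptions for illustration only): hashing bn.T reads the bytes in transposed order, so for non-symmetric data the digest differs from that of bn.copy(order='C'). It matches the digest of bn's bytes taken in Fortran order instead.

```python
import hashlib
import numpy as np

# Small F-ordered array with distinct values (illustrative shape only).
a = np.asfortranarray(np.arange(4 * 3 * 5, dtype=np.float64).reshape(4, 3, 5))
bn = a[:, :, 0]

# bn.T is a C-contiguous view, so hashlib can read its buffer without a copy.
digest_view = hashlib.sha1(bn.T).hexdigest()

# Digest of an explicit C-ordered copy of bn (what the question wants to avoid).
digest_copy = hashlib.sha1(bn.copy(order='C')).hexdigest()

# Digest of bn's bytes laid out in Fortran (column-major) order.
digest_f_bytes = hashlib.sha1(bn.tobytes(order='F')).hexdigest()

print(digest_view == digest_copy)     # False for this data
print(digest_view == digest_f_bytes)  # True
```

This should not matter as long as every hash is computed the same way, but the digests are not interchangeable with ones computed from C-ordered copies.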

To make a numpy array x C-contiguous without making an unnecessary copy when it already is, you should use

  x = numpy.asarray(x, order='C') 

Note that if the array is not already C-contiguous, this will probably perform about the same as x.copy(order='C'). I don't think there is a way around that: you cannot change the layout of an array in memory other than by copying the data to a new location.
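A short sketch of that behaviour, assuming a small array in place of the one from the question: numpy.asarray hands back the same object when the input is already C-contiguous, and allocates new memory only when it is not.

```python
import numpy as np

a = np.ones((4, 3, 5))  # C-contiguous by default

# Already C-contiguous: the very same array object is returned, no copy.
same = np.asarray(a, order='C')
print(same is a)  # True

# A slice along the last axis is not C-contiguous, so a copy is made.
bn = a[:, :, 0]
forced = np.asarray(bn, order='C')
print(forced.flags['C_CONTIGUOUS'])  # True
print(np.shares_memory(forced, bn))  # False
```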

Rewriting your code so that the sliced index comes first, as in numpy.ones((5,1024,1024)), seems to be the only reasonable way to optimize this.
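For comparison, a sketch of the index-first layout, again with a small shape chosen only for illustration: indexing the first axis of a C-ordered array yields a view that is already C-contiguous, so it can be hashed without any copy.

```python
import hashlib
import numpy as np

# Sliced index first: small stand-in for numpy.ones((5, 1024, 1024)).
a = np.ones((5, 4, 3))

bn = a[0]  # a view onto the first block of a

print(bn.flags['C_CONTIGUOUS'])  # True
print(np.shares_memory(a, bn))   # True: still no copy

# hashlib can read the contiguous buffer directly.
digest = hashlib.sha1(bn).hexdigest()
```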


This is a standard operation when interfacing numpy with C. Have a look at numpy.ascontiguousarray.

x=numpy.ascontiguousarray(x)

Calling it as above is the proper way to deal with this.

Use numpy.asfortranarray if you need Fortran order.

As already mentioned, the function will copy if necessary, so there is no way around that. Before doing your work, you could try rollaxis so that the short axis becomes the first axis. This gives you a view of the array.

 In [2]: A=np.random.rand(1024,1024,5)
 In [3]: B=np.rollaxis(A,2)
 In [4]: B.shape
 Out[4]: (5, 1024, 1024)
 In [5]: B.flags
 Out[5]:
   C_CONTIGUOUS : False
   F_CONTIGUOUS : False
   OWNDATA : False
   WRITEABLE : True
   ALIGNED : True
   UPDATEIFCOPY : False
 In [6]: A.flags
 Out[6]:
   C_CONTIGUOUS : True
   F_CONTIGUOUS : False
   OWNDATA : True
   WRITEABLE : True
   ALIGNED : True
   UPDATEIFCOPY : False

So rollaxis does not solve this either.
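A small sketch of why, using an assumed shape of (4, 3, 5) rather than the one above: rollaxis only relabels the axes of the same memory, so B[n] has exactly the same strides as A[:, :, n] and is just as non-contiguous.

```python
import numpy as np

A = np.random.rand(4, 3, 5)
B = np.rollaxis(A, 2)  # shape (5, 4, 3); a view, the data is not moved

print(np.shares_memory(A, B))               # True
print(np.array_equal(B[0], A[:, :, 0]))     # True: same elements
print(B[0].strides == A[:, :, 0].strides)   # True: same memory layout
print(B[0].flags['C_CONTIGUOUS'])           # False
```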
