In my program code, I have arrays of numpy values and numpy arrays of indices. Both kinds of arrays are pre-allocated and predefined during program initialization.
Each part of the program has one values array, on which the calculations are performed, and three index arrays: idx_from_exch, idx_values and idx_to_exch. There is also a global array of values, exch_arr, used to exchange values between the parts.
The index arrays usually hold 2 to 5 indices; more are rarely (most likely never) required. Their dtype=np.int32, shape and values are constant for the entire program run, so after initialization I set ndarray.flags.writeable=False, although this is optional. The indices in idx_values and idx_to_exch are sorted in ascending order; idx_from_exch may be sorted, but there is no way to guarantee this. All index arrays belonging to the same values/part array have the same shape.
The values arrays, as well as exch_arr, usually have 50 to 1000 elements. Their shape and dtype=np.float64 are constant for the entire program run; the array values change at each iteration.
Here are examples of arrays:
import numpy as np
import numba as nb

values = np.random.rand(100) * 100
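The example above only defines the values array. For completeness, here are illustrative definitions of the index arrays and exch_arr consistent with the description (the exact sizes and index values are my assumptions, not from the original code):

```python
import numpy as np

# Hypothetical example arrays matching the description above:
# small sorted int32 index arrays of equal shape, and a float64 exchange array.
values = np.random.rand(100) * 100
exch_arr = np.random.rand(60) * 100                     # global exchange array, float64
idx_from_exch = np.array([2, 7, 11], dtype=np.int32)    # may or may not be sorted
idx_values = np.array([0, 25, 99], dtype=np.int32)      # sorted ascending
idx_to_exch = np.array([1, 8, 55], dtype=np.int32)      # sorted ascending

# optional, as noted above: freeze the constant index arrays
for a in (idx_from_exch, idx_values, idx_to_exch):
    a.flags.writeable = False
```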
Examples of indexing operations are as follows:
values[idx_values] = exch_arr[idx_from_exch]  # get values from exchange array
values *= 1.1  # some in-place array operations; this is just a dummy for more complex things
exch_arr[idx_to_exch] = values[idx_values]  # pass some values back to exchange array
Since these operations are applied once per iteration over several million iterations, speed is critical. I considered various ways to speed up indexing in my previous question, but was not specific enough about my application (especially reading values via constant index arrays and writing them into another indexed array).
For now, fancy indexing seems to be the best approach. I'm currently experimenting with numba's guvectorize, but it doesn't seem worth the effort since my arrays are quite small. memoryviews would be nice, but since the index arrays don't necessarily have constant steps between indices, I don't know how to use memoryviews here.
So, is there a faster way to re-index? Is there a way to predefine an array of memory addresses for each indexing operation, since dtype and shape are always constant? ndarray.__array_interface__ gave me the memory address, but I could not use it for indexing. I thought of something like:
stride_exch = exch_arr.strides[0]
mem_address = exch_arr.__array_interface__['data'][0]
idx_to_exch = idx_to_exch * stride_exch + mem_address
Is it possible?
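As a sanity check of the address arithmetic itself: the precomputed byte addresses do point at the right elements, which can be verified by dereferencing them with ctypes. This is only an illustration that the addresses are valid (using small stand-in arrays); it is not a faster indexing path, and plain NumPy indexing cannot consume raw addresses directly:

```python
import ctypes
import numpy as np

exch_arr = np.arange(5, dtype=np.float64) * 10.0   # [0., 10., 20., 30., 40.]
idx_to_exch = np.array([1, 3], dtype=np.int64)

stride_exch = exch_arr.strides[0]
mem_address = exch_arr.__array_interface__['data'][0]
addresses = mem_address + idx_to_exch * stride_exch

# dereference each precomputed address as a C double
vals = [ctypes.cast(int(a), ctypes.POINTER(ctypes.c_double)).contents.value
        for a in addresses]
# vals -> [10.0, 30.0], matching exch_arr[idx_to_exch]
```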
I also looked at using strides directly with as_strided, but as far as I know, as_strided only allows constant strides, while my problem would require non-constant ones.
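To make the limitation concrete, here is a minimal illustration (example values are mine): as_strided can build a copy-free view over any *constant* step, such as every second element, but an arbitrary index set like [0, 25, 99] has no single stride and therefore cannot be expressed this way:

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

a = np.arange(10, dtype=np.float64)
s = a.strides[0]  # 8 bytes per float64

# Constant stride: a no-copy view of every second element.
every_other = as_strided(a, shape=(5,), strides=(2 * s,))
# every_other -> [0., 2., 4., 6., 8.]

# An index set with unequal gaps (e.g. steps of 2, then 5) has no
# single stride value, so as_strided cannot produce such a view.
```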
Any help is appreciated! Thanks in advance!
Edit:
I just fixed a major error in my calculation example!
The operation values = values * 1.1 changes the memory address of the array. None of the operations in my program code change the memory addresses of the arrays, because many other operations depend on the memory layout. I have therefore replaced the dummy operation with the correct in-place operation: values *= 1.1