Your task is impossible using strides alone, but NumPy supports one kind of array that does the job. With strides and a masked_array you can create the desired view of your data. However, not all NumPy functions support masked_array operations, so scikit-learn may not handle them well either.
Let me first take a look at what we are trying to do here. Consider the input data of your example. Fundamentally, the data is just a 1-dimensional array in memory, and it is easier to think about strides in those terms. The array only appears to be 2D because we have defined its shape. Using strides, the shape could be defined like this:

    import numpy as np
    from numpy.lib.stride_tricks import as_strided

    base = np.arange(9)
    isize = base.itemsize  # size of one element in bytes
    A = as_strided(base, shape=(3, 3), strides=(3 * isize, isize))

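As a quick sanity check, here is a minimal sketch showing that the strided array has the same layout as a plain reshape and still shares memory with base. The three check variables (same_as_reshape, same_strides, is_view) are my own additions for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

base = np.arange(9)
isize = base.itemsize

# Row stride: move 3 elements ahead; column stride: move 1 element ahead.
A = as_strided(base, shape=(3, 3), strides=(3 * isize, isize))

# For a contiguous array, reshape produces exactly this layout.
same_as_reshape = np.array_equal(A, base.reshape(3, 3))
same_strides = A.strides == base.reshape(3, 3).strides

# A is a view: no data was copied.
is_view = np.shares_memory(A, base)

print(same_as_reshape, same_strides, is_view)  # True True True
```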
Now the goal is to choose strides on base so that it orders the numbers as in the final array, B. In other words, we are asking for integers a and b such that

    >>> as_strided(base, shape=(4, 4), strides=(a, b))
    array([[0, 1, 3, 4],
           [1, 2, 4, 5],
           [3, 4, 6, 7],
           [4, 5, 7, 8]])

But this is clearly impossible. The closest view we can achieve this way is a rolling window over base:

    >>> C = as_strided(base, shape=(5, 5), strides=(isize, isize))
    >>> C
    array([[0, 1, 2, 3, 4],
           [1, 2, 3, 4, 5],
           [2, 3, 4, 5, 6],
           [3, 4, 5, 6, 7],
           [4, 5, 6, 7, 8]])

The difference is that C has extra columns and rows that we would like to get rid of. So, effectively, we are asking for a rolling window that is not contiguous and also makes jumps at regular intervals. In this example, we want every third element excluded from the window, and a jump over one element after every two rows.
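These irregular jumps are what rule out plain strides. With strides (a, b), element [i, j] sits at offset i*a + j*b, so the step between neighbouring rows (and between neighbouring columns) has to be constant. A small check of the target array, as a sketch: offsets are counted in elements rather than bytes here, and the variable names are my own.

```python
import numpy as np

# Target array from the question. Because base = np.arange(9),
# each value is also its element offset into base.
target = np.array([[0, 1, 3, 4],
                   [1, 2, 4, 5],
                   [3, 4, 6, 7],
                   [4, 5, 7, 8]])

row_steps = np.diff(target[:, 0])  # offset steps between row starts
col_steps = np.diff(target[0, :])  # offset steps between columns

print(row_steps)  # [1 2 1]
print(col_steps)  # [1 2 1]

# A single stride per axis would need all steps on that axis to be equal.
strides_exist = np.unique(row_steps).size == 1 and np.unique(col_steps).size == 1
print(strides_exist)  # False
```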
We can describe this non-contiguous window as a masked_array:

    >>> mask = np.zeros((5, 5), dtype=bool)
    >>> mask[2, :] = True
    >>> mask[:, 2] = True
    >>> D = np.ma.masked_array(C, mask=mask)

This array contains exactly the data that we want, and it is only a view of the original data. We can confirm that the data is equal:

    >>> D.data[~D.mask].reshape(4, 4)
    array([[0, 1, 3, 4],
           [1, 2, 4, 5],
           [3, 4, 6, 7],
           [4, 5, 7, 8]])

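To back up the claim that D is only a view, the following sketch checks that the masked array still shares memory with base and that a write to base shows up through D. It relies on masked_array not copying its input by default; the write to base[0] is just an illustrative probe.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

base = np.arange(9)
isize = base.itemsize
C = as_strided(base, shape=(5, 5), strides=(isize, isize))

mask = np.zeros((5, 5), dtype=bool)
mask[2, :] = True
mask[:, 2] = True
D = np.ma.masked_array(C, mask=mask)

# The masked array wraps C without copying, so it still points at base.
print(np.shares_memory(D.data, base))  # True

# A change to base is visible through D.
base[0] = 100
print(D[0, 0])  # 100
```

Note that the boolean indexing used for the comparison above (D.data[~D.mask]) does produce a copy; only D itself is a view.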
But, as I said at the beginning, it is quite likely that scikit-learn does not understand masked arrays. If it simply converts this to a plain array, the mask is dropped and the data will be wrong:

    >>> np.array(D)
    array([[0, 1, 2, 3, 4],
           [1, 2, 3, 4, 5],
           [2, 3, 4, 5, 6],
           [3, 4, 5, 6, 7],
           [4, 5, 6, 7, 8]])

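If the consumer ignores masks, one fallback is to give up on the view and materialize a dense copy before handing the data over. This is only a sketch of that option, not something the question asked for; dense, dense_ix and keep are names I made up for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

base = np.arange(9)
isize = base.itemsize
C = as_strided(base, shape=(5, 5), strides=(isize, isize))

mask = np.zeros((5, 5), dtype=bool)
mask[2, :] = True
mask[:, 2] = True
D = np.ma.masked_array(C, mask=mask)

# compressed() returns the unmasked values as a new 1-D ndarray (a copy).
dense = D.compressed().reshape(4, 4)

# Equivalent copy without the mask: select the kept rows and columns directly.
keep = [0, 1, 3, 4]
dense_ix = C[np.ix_(keep, keep)]

print(np.array_equal(dense, dense_ix))  # True
print(np.shares_memory(dense, base))    # False
```

The price is memory: the result is an ordinary array that no longer tracks changes to base.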