Here's a possible solution using stride_tricks. It is based in part on the wealth of information available in the answers to this question, but the problem is, I think, different enough not to count as a duplicate. Here's the basic idea applied to a square matrix; see below for a function that implements a more general solution.
    >>> import numpy
    >>> cols = 8
    >>> a = numpy.arange(cols * cols).reshape((cols, cols))
    >>> fill = numpy.zeros((cols - 1) * cols, dtype='i8').reshape((cols - 1, cols))
    >>> stacked = numpy.vstack((a, fill, a))
    >>> major_stride, minor_stride = stacked.strides
    >>> strides = major_stride, minor_stride * (cols + 1)
    >>> shape = (cols * 2 - 1, cols)
    >>> numpy.lib.stride_tricks.as_strided(stacked, shape, strides)
    array([[ 0,  9, 18, 27, 36, 45, 54, 63],
           [ 8, 17, 26, 35, 44, 53, 62,  0],
           [16, 25, 34, 43, 52, 61,  0,  0],
           [24, 33, 42, 51, 60,  0,  0,  0],
           [32, 41, 50, 59,  0,  0,  0,  0],
           [40, 49, 58,  0,  0,  0,  0,  0],
           [48, 57,  0,  0,  0,  0,  0,  0],
           [56,  0,  0,  0,  0,  0,  0,  0],
           [ 0,  0,  0,  0,  0,  0,  0,  7],
           [ 0,  0,  0,  0,  0,  0,  6, 15],
           [ 0,  0,  0,  0,  0,  5, 14, 23],
           [ 0,  0,  0,  0,  4, 13, 22, 31],
           [ 0,  0,  0,  3, 12, 21, 30, 39],
           [ 0,  0,  2, 11, 20, 29, 38, 47],
           [ 0,  1, 10, 19, 28, 37, 46, 55]])
    >>> diags = numpy.lib.stride_tricks.as_strided(stacked, shape, strides)
    >>> diags.sum(axis=1)
    array([252, 245, 231, 210, 182, 147, 105,  56,   7,  21,  42,  70, 105,
           147, 196])
Of course, I have no idea how fast this will actually be. But I bet it will be faster than a Python list comprehension.
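One way to test that guess is to time the strided version against a plain per-diagonal baseline. The sketch below is only an illustration: the helper names `strided_diagonal_sums` and `comprehension_diagonal_sums` are made up for this example, and the baseline (one `numpy.trace` call per diagonal inside a list comprehension) is my assumption about what the slower alternative would look like. No timings are quoted because they depend on the machine and the array size.

```python
import timeit

import numpy
from numpy.lib.stride_tricks import as_strided


def strided_diagonal_sums(a):
    # Square-matrix version of the stride trick from the session above.
    cols = a.shape[1]
    fill = numpy.zeros((cols - 1, cols), dtype=a.dtype)
    stacked = numpy.vstack((a, fill, a))
    major_stride, minor_stride = stacked.strides
    diags = as_strided(stacked,
                       shape=(cols * 2 - 1, cols),
                       strides=(major_stride, minor_stride * (cols + 1)))
    return diags.sum(axis=1)


def comprehension_diagonal_sums(a):
    # Hypothetical baseline: one numpy.trace call per diagonal, in the same
    # order as the strided rows (main diagonal, then sub-diagonals, then
    # super-diagonals starting from the top-right corner).
    cols = a.shape[1]
    offsets = list(range(0, -cols, -1)) + list(range(cols - 1, 0, -1))
    return numpy.array([numpy.trace(a, offset=k) for k in offsets])


a = numpy.arange(64).reshape((8, 8))

# Both approaches should agree before timing means anything.
assert (strided_diagonal_sums(a) == comprehension_diagonal_sums(a)).all()

for func in (strided_diagonal_sums, comprehension_diagonal_sums):
    seconds = timeit.timeit(lambda: func(a), number=1000)
    print(func.__name__, seconds)
```

For a fair comparison, try several sizes; the `vstack` copy has a fixed cost that may dominate for small arrays.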
For convenience, here's a fully general diagonals function. It assumes that you want to move the diagonal along the longest axis:
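    import numpy

    def diagonals(a):
        rows, cols = a.shape
        # Work along the longest axis: transpose if the array is wider than tall.
        if cols > rows:
            a = a.T
            rows, cols = a.shape
        # Zero padding between two copies of a, so each diagonal wraps into zeros.
        fill = numpy.zeros(((cols - 1), cols), dtype=a.dtype)
        stacked = numpy.vstack((a, fill, a))
        major_stride, minor_stride = stacked.strides
        # Stepping cols + 1 elements moves one row down and one column right.
        strides = major_stride, minor_stride * (cols + 1)
        shape = (rows + cols - 1, cols)
        return numpy.lib.stride_tricks.as_strided(stacked, shape, strides)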
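As a quick sanity check on a non-square input, here is a small usage sketch. The function is repeated so that the block runs on its own, and the cross-check against `numpy.trace` is my addition rather than part of the original approach.

```python
import numpy
from numpy.lib.stride_tricks import as_strided


def diagonals(a):
    # Same approach as the function above, repeated so this runs standalone.
    rows, cols = a.shape
    if cols > rows:
        a = a.T
        rows, cols = a.shape
    fill = numpy.zeros((cols - 1, cols), dtype=a.dtype)
    stacked = numpy.vstack((a, fill, a))
    major_stride, minor_stride = stacked.strides
    strides = major_stride, minor_stride * (cols + 1)
    shape = (rows + cols - 1, cols)
    return as_strided(stacked, shape, strides)


a = numpy.arange(12).reshape((4, 3))  # 4 rows, 3 columns
diags = diagonals(a)
print(diags)
# [[ 0  4  8]
#  [ 3  7 11]
#  [ 6 10  0]
#  [ 9  0  0]
#  [ 0  0  2]
#  [ 0  1  5]]

# Cross-check the row sums against numpy.trace, one offset per diagonal:
# main diagonal, sub-diagonals, then super-diagonals from the corner inward.
rows, cols = a.shape
offsets = list(range(0, -rows, -1)) + list(range(cols - 1, 0, -1))
trace_sums = [int(numpy.trace(a, offset=k)) for k in offsets]
print(diags.sum(axis=1).tolist() == trace_sums)  # True
```

Note that a wide input is transposed first, so `diagonals(a.T)` returns the same rows as `diagonals(a)` here.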