If you want to ignore the matrix's zero-valued elements, the code below should work. It is also much faster than implementations that use the getrow method, which is rather slow.
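    def sort_coo(m):
        # Pair each stored (nonzero) entry with its row and column index.
        # zip is lazy on Python 3; on Python 2, itertools.izip is the lazy equivalent.
        tuples = zip(m.row, m.col, m.data)
        # Sort by row index, then by value within each row.
        return sorted(tuples, key=lambda x: (x[0], x[2]))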
For example:
    >>> from numpy.random import rand
    >>> from scipy.sparse import coo_matrix
    >>>
    >>> d = rand(10, 20)
    >>> d[d > .05] = 0
    >>> s = coo_matrix(d)
    >>> sort_coo(s)
    [(0, 2, 0.004775589084940246),
     (3, 12, 0.029941507166614145),
     (5, 19, 0.015030386789436245),
     (7, 0, 0.0075044957259399192),
     (8, 3, 0.047994403933129481),
     (8, 5, 0.049401058471327031),
     (9, 15, 0.040011608000125043),
     (9, 8, 0.048541825332137023)]
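Since the random example cannot be reproduced exactly, here is a small deterministic sketch of the same idea. The matrix `dense` is made up for illustration, and `sort_coo` is restated with the built-in `zip` so the snippet runs on its own under Python 3. The NumPy scalars are converted to plain Python numbers only to keep the printed output readable.

```python
import numpy as np
from scipy.sparse import coo_matrix


def sort_coo(m):
    # Pair each stored entry with its coordinates, then sort by
    # row index and, within a row, by value.
    return sorted(zip(m.row, m.col, m.data), key=lambda x: (x[0], x[2]))


# Illustrative matrix (an assumption, not from the original answer).
# Row 1 is all zeros, so it contributes nothing to the result.
dense = np.array([[0.0, 3.0, 1.0],
                  [0.0, 0.0, 0.0],
                  [2.0, 0.0, 0.5]])
s = coo_matrix(dense)

# Convert NumPy scalars to plain Python numbers for readable output.
result = [(int(r), int(c), float(v)) for r, c, v in sort_coo(s)]
print(result)
# [(0, 2, 1.0), (0, 1, 3.0), (2, 2, 0.5), (2, 0, 2.0)]
```

Only the four nonzero entries appear, because building a `coo_matrix` from a dense array stores just the nonzero values.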
Depending on your needs, you can tweak the sort keys in the lambda or process the output further. If you want everything in a row-indexed dictionary, you could do:
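    from collections import defaultdict

    # m is the COO matrix (s in the example above).
    sorted_rows = defaultdict(list)
    for i in sort_coo(m):
        # Key: row index; value: (column, value) pairs in sorted order.
        sorted_rows[i[0]].append((i[1], i[2]))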
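As one possible way to tweak the key, the sketch below sorts each row by descending value and then groups the result by row. The name `sort_coo_desc` and the sample matrix are hypothetical, chosen only for illustration. Negating the value in the key works here because the data is numeric.

```python
from collections import defaultdict

import numpy as np
from scipy.sparse import coo_matrix


def sort_coo_desc(m):
    # Hypothetical variant: sort by row index, then by value in
    # descending order (the negated value flips the order within a row).
    return sorted(zip(m.row, m.col, m.data), key=lambda x: (x[0], -x[2]))


# Illustrative matrix (an assumption, not from the original answer).
dense = np.array([[0.0, 3.0, 1.0],
                  [0.0, 0.0, 0.0],
                  [2.0, 0.0, 0.5]])
s = coo_matrix(dense)

# Group the sorted entries by row; plain Python numbers keep the output readable.
rows_desc = defaultdict(list)
for r, c, v in sort_coo_desc(s):
    rows_desc[int(r)].append((int(c), float(v)))

print(dict(rows_desc))
# {0: [(1, 3.0), (2, 1.0)], 2: [(0, 2.0), (2, 0.5)]}
```

Rows with no stored entries (row 1 here) never become keys. Looking up a missing row on the `defaultdict` returns an empty list, but also adds that row as a key.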
Alexander Measure