I have a dictionary of lists as follows (it can have more than 1M elements, and the dictionary is assumed to be sorted by key):
import scipy.sparse as sp
d = {0: [0,1], 1: [1,2,3],
2: [3,4,5], 3: [4,5,6],
4: [5,6,7], 5: [7],
6: [7,8,9]}
I want to know the most efficient way (fast for a large dictionary) to convert it into lists of row and column indices, for example:
r_index = [0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 6, 6, 6]
c_index = [0, 1, 1, 2, 3, 3, 4, 5, 4, 5, 6, 5, 6, 7, 7, 7, 8, 9]
Here are the solutions I have so far:
Using iteration
row_ind = [k for k, v in d.items() for _ in range(len(v))]  # use d.iteritems() in Python 2
col_ind = [i for ids in d.values() for i in ids]
Using pandas library
import pandas as pd
df = pd.DataFrame.from_dict(d, orient='index')
df = df.stack().reset_index()
row_ind = list(df['level_0'])
col_ind = list(df[0].astype(int))  # values come back as floats because shorter rows are NaN-padded
Using itertools
import itertools
import numpy as np
indices = list(itertools.chain.from_iterable(itertools.product((k,), v) for k, v in d.items()))
indices = np.array(indices)
row_ind = indices[:, 0]
col_ind = indices[:, 1]
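Using NumPy directly (a sketch, not benchmarked)
One more candidate is to repeat each key by the length of its list with np.repeat and to flatten the values with np.fromiter. This is only a minimal sketch: it assumes every key and value is an integer that fits in int64, and the names lengths and keys are introduced here for illustration. Whether it actually beats the versions above on a 1M-element dictionary would need to be timed.

```python
import itertools
import numpy as np

d = {0: [0, 1], 1: [1, 2, 3],
     2: [3, 4, 5], 3: [4, 5, 6],
     4: [5, 6, 7], 5: [7],
     6: [7, 8, 9]}

# Number of column entries per key, in dictionary order.
lengths = np.fromiter((len(v) for v in d.values()), dtype=np.int64, count=len(d))
keys = np.fromiter(d.keys(), dtype=np.int64, count=len(d))

# Repeat each key once per element of its list.
row_ind = np.repeat(keys, lengths)

# Flatten all lists into one array without building an intermediate list.
col_ind = np.fromiter(itertools.chain.from_iterable(d.values()),
                      dtype=np.int64, count=int(lengths.sum()))
```

Here row_ind and col_ind are NumPy arrays rather than lists; calling .tolist() on them gives plain Python lists if those are needed.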
I'm not sure which of these is the fastest way to handle this problem when the dictionary has many elements. Thanks!