If A is a csr_matrix, you can use .toarray() (there is also .todense(), which returns a numpy matrix and also works with the DataFrame constructor):
df = pd.DataFrame(A.toarray())
You can then pass this DataFrame to pd.concat().
A = csr_matrix([[1, 0, 2], [0, 3, 0]])

  (0, 0)    1
  (0, 2)    2
  (1, 1)    3

<class 'scipy.sparse.csr.csr_matrix'>

pd.DataFrame(A.todense())

   0  1  2
0  1  0  2
1  0  3  0

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2 entries, 0 to 1
Data columns (total 3 columns):
0    2 non-null int64
1    2 non-null int64
2    2 non-null int64
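The pd.concat() step might look roughly like this. It is only a sketch: the `other` frame and its column names are made up for illustration, and it assumes the sparse matrix is small enough to densify in memory.

```python
import pandas as pd
from scipy.sparse import csr_matrix

# Hypothetical existing frame; the column names are invented for this example.
other = pd.DataFrame({"id": [10, 11], "label": ["a", "b"]})

A = csr_matrix([[1, 0, 2], [0, 3, 0]])

# Densify the sparse matrix and reuse the other frame's index so rows line up.
dense = pd.DataFrame(A.toarray(), index=other.index)

# axis=1 places the two frames side by side (column-wise).
combined = pd.concat([other, dense], axis=1)
print(combined.shape)
```

Reusing the index matters because pd.concat() aligns on index labels. If the two frames had different indexes, the result would contain NaN-padded rows.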
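pandas 0.20 added support for constructing a SparseDataFrame directly from a scipy.sparse matrix. (SparseDataFrame was later removed in pandas 1.0 in favour of sparse columns in a regular DataFrame.)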
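On pandas 0.25 or later, the same idea can be sketched with the DataFrame.sparse accessor, which keeps the data sparse instead of densifying it. The version requirement is an assumption about your environment.

```python
import pandas as pd
from scipy.sparse import csr_matrix

A = csr_matrix([[1, 0, 2], [0, 3, 0]])

# Assumes pandas >= 0.25: builds a DataFrame whose columns use a sparse dtype.
sdf = pd.DataFrame.sparse.from_spmatrix(A)

# Fraction of entries that are actually stored (not the fill value).
print(sdf.sparse.density)

# Convert back to a scipy COO matrix when a scipy object is needed again.
back = sdf.sparse.to_coo()
```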
Alternatively, you can pass the sparse matrices directly to sklearn, which avoids running out of memory when converting back to pandas. Just convert your other data to a sparse format by passing the numpy array to the scipy.sparse.csr_matrix constructor, and use scipy.sparse.hstack to combine them (see the docs).
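A minimal sketch of the hstack approach, with a made-up dense array standing in for "your other data":

```python
import numpy as np
from scipy.sparse import csr_matrix, hstack

A = csr_matrix([[1, 0, 2], [0, 3, 0]])

# Hypothetical dense features; in practice this is whatever else you have.
other = np.array([[0.5, 1.5], [2.5, 3.5]])

# Convert the dense block to CSR, then stack the blocks column-wise.
# hstack returns COO by default, so ask for CSR explicitly.
X = hstack([A, csr_matrix(other)], format="csr")
print(X.shape)
```

The resulting X stays sparse throughout, so it can be handed to any estimator that accepts scipy sparse input without ever building the full dense array.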