Transport sparse matrix from Python to R

Question

I am doing some text analysis work in Python. Unfortunately, I need to switch to R in order to use a specific package that cannot easily be replicated in Python.

Currently, the text is parsed into bigram counts, reduced to a vocabulary of about 11,000 bigrams, and then saved as a dictionary:

{id1: {'bigrams': [(bigram1, count), (bigram2, count), ...]}, id2: {'bigrams': ...}, ...}

I need to get this into a dgCMatrix in R, where the rows are id1, id2, ... and the columns are the different bigrams, so that each cell holds the count for that id-bigram pair.

Any suggestions? I was thinking of just expanding it into a massive CSV, but that seems super inefficient and probably infeasible due to memory constraints.

python r sparse-matrix text-analysis

Craig Jun 05 '15 at 21:15

1 answer

earino · Accepted Answer · 2015-06-05T21:36:01+0000

Could you write the matrix in MatrixMarket format using scipy's mmwrite and then read it into R using readMM from the Matrix package?
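
A minimal sketch of how the Python side of this might look, assuming the dictionary has the shape described in the question and that each bigram is stored as a plain string. The toy data, the helper name build_count_matrix, and the file names are hypothetical and only there for illustration. It builds a scipy COO matrix from the dictionary and writes it with scipy.io.mmwrite:

```python
import os
import tempfile

from scipy.io import mmread, mmwrite
from scipy.sparse import coo_matrix

# Hypothetical toy data in the shape described in the question
# (bigrams are assumed to be plain strings here).
docs = {
    "id1": {"bigrams": [("new york", 3), ("york city", 1)]},
    "id2": {"bigrams": [("new york", 2), ("text analysis", 5)]},
}


def build_count_matrix(docs):
    """Return (matrix, row_ids, vocab) for a dict of per-document bigram counts."""
    row_ids = sorted(docs)
    vocab = sorted({bigram for d in docs.values() for bigram, _ in d["bigrams"]})
    col_index = {bigram: j for j, bigram in enumerate(vocab)}

    # Collect (row, column, value) triplets; only non-zero cells are stored.
    rows, cols, vals = [], [], []
    for i, doc_id in enumerate(row_ids):
        for bigram, count in docs[doc_id]["bigrams"]:
            rows.append(i)
            cols.append(col_index[bigram])
            vals.append(count)

    matrix = coo_matrix((vals, (rows, cols)), shape=(len(row_ids), len(vocab)))
    return matrix, row_ids, vocab


matrix, row_ids, vocab = build_count_matrix(docs)

# Write the matrix in MatrixMarket format (hypothetical output location).
out_dir = tempfile.mkdtemp()
mtx_path = os.path.join(out_dir, "bigram_counts.mtx")
mmwrite(mtx_path, matrix)

# The .mtx file holds no row or column labels, so save them separately.
with open(os.path.join(out_dir, "row_ids.txt"), "w") as f:
    f.write("\n".join(row_ids))
with open(os.path.join(out_dir, "bigrams.txt"), "w") as f:
    f.write("\n".join(vocab))

# Sanity check: reading the file back gives the same counts.
round_trip = mmread(mtx_path)
```

On the R side, readMM("bigram_counts.mtx") from the Matrix package should give a sparse matrix in triplet form, which can then be coerced with something like as(m, "CsparseMatrix") to obtain a dgCMatrix. The two label files could be read with readLines and assigned as dimnames. Treat the exact coercion call as an assumption to check against the installed Matrix version.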
