Is there sparse support for the dist function in R?

Has anyone heard of a package or function that works like the dist {stats} function in R, which creates a distance matrix computed by using the specified distance measure to compute the distances between the rows of a data matrix, but that takes a sparse matrix as input?

My data.frame (named dataCluster) has dimensions 7000 x 10000 and is almost 99% sparse. In its regular, non-sparse form, this call seems to stop working ...

 h1 <- hclust( dist( dataCluster ) , method = "complete" ) 

A similar question without an answer: Sparse matrix as input for hierarchical clustering in R

1 answer

You want wordspace::dist.matrix.

It accepts sparse matrices from the Matrix package (which is not clear from the documentation), and it can also compute cross-distances, return either Matrix or dist objects, etc.

The default distance measure is 'cosine', so be sure to specify method = 'euclidean' if that is what you want (it is the default for stats::dist).
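A minimal sketch of how this could look, assuming the Matrix and wordspace packages are installed. The matrix m is a small random stand-in for dataCluster (hypothetical data, not from the question), and as.dist = TRUE is used so that the result can be passed straight to hclust:

```r
library(Matrix)     # sparse matrix classes
library(wordspace)  # provides dist.matrix()

set.seed(1)

# Hypothetical stand-in for dataCluster: 200 rows x 500 columns,
# about 1% non-zero entries, stored as a sparse dgCMatrix.
m <- rsparsematrix(200, 500, density = 0.01, rand.x = runif)

# Euclidean distances between rows; as.dist = TRUE asks for a
# compact "dist" object instead of a full square matrix.
d <- dist.matrix(m, method = "euclidean", as.dist = TRUE)

# Same clustering call as in the question, now on the sparse input.
h1 <- hclust(d, method = "complete")
```

For a real data.frame, something like Matrix(as.matrix(dataCluster), sparse = TRUE) should give a suitable sparse input. Note that only the input is sparse: the resulting dist object for 7000 rows is still dense, with 7000 * 6999 / 2 entries (roughly 200 MB of doubles).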
