Computational Efficiency: Sparse vs. Dense Matrices

I have found that if a matrix is (almost) fully populated, storing it in sparse form leads to (much) longer computation times.

Although it is trivial to store a dense matrix in sparse form, I would like to understand the reason for this.

My assumption is that the index lookups in the sparse format are the major factor in the computation time. Any other insights?
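A quick way to reproduce the effect, as a minimal sketch (assuming NumPy/SciPy; the matrix size and repetition count below are arbitrary):

    import time

    import numpy as np
    import scipy.sparse as sp

    n = 2000
    dense = np.random.rand(n, n)      # an (almost) fully populated matrix
    sparse = sp.csr_matrix(dense)     # the same data in Compressed Sparse Row form
    x = np.random.rand(n)

    t0 = time.perf_counter()
    for _ in range(50):
        _ = dense @ x                 # dense matrix-vector product
    print(f"dense:  {time.perf_counter() - t0:.3f} s")

    t0 = time.perf_counter()
    for _ in range(50):
        _ = sparse @ x                # sparse matvec pays for index lookups
    print(f"sparse: {time.perf_counter() - t0:.3f} s")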

2 answers

There are several reasons why an almost-full sparse matrix is more computationally expensive than a plain dense matrix. The most obvious, as you noted, is that the nonzero elements of a sparse matrix must be explicitly indexed (for a general sparse matrix, I believe Matlab uses a Compressed Row Storage scheme).

Another, less obvious slowdown is related to vectorization and keeping the processor's pipeline fed. With a fully stored matrix, the data sits in a neat linear layout, so operations can be vectorized easily. For storage schemes such as CRS this is not the case, especially for matrix-vector products, which are usually the most heavily used operation (for example, in iterative solvers for systems of equations). With the CRS scheme, walking along a matrix row can feed the processor in a linear fashion, but the elements pulled from the vector being multiplied will jump around in memory.
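To make that indirect access concrete, here is a minimal CRS matrix-vector product sketched in plain Python (the array names val, col, and row_ptr are illustrative, not Matlab's internals):

    # CRS matrix-vector product, y = A @ x.
    # val[k]     : k-th stored nonzero value, row by row
    # col[k]     : column index of val[k]
    # row_ptr[i] : start of row i within val/col (length nrows + 1)
    def crs_matvec(val, col, row_ptr, x):
        nrows = len(row_ptr) - 1
        y = [0.0] * nrows
        for i in range(nrows):
            for k in range(row_ptr[i], row_ptr[i + 1]):
                # val and col are read linearly, which streams nicely,
                # but x is reached through col[k]: these reads jump
                # around in memory instead of streaming.
                y[i] += val[k] * x[col[k]]
        return y

That one indirect read, x[col[k]], is what defeats straightforward vectorization and cache prefetching.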


Consider the following dense matrix:

1 2 3
4 5 6
7 8 9

If I store it in a continuous block:

 1 2 3 4 5 6 7 8 9 

I can directly access any element of the matrix from its row and column indices with some basic arithmetic.
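As a minimal sketch of that arithmetic (row-major layout assumed):

    data = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # the 3 x 3 matrix above, stored flat
    n = 3                               # number of columns

    def get(r, c):
        # row-major: element (r, c) sits at offset r * n + c
        return data[r * n + c]

    print(get(1, 2))  # row 1, column 2 -> 6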

Now consider this sparse matrix:

1 0 0
0 0 2
0 3 0

To store this matrix efficiently, I discard the zero elements, so it now becomes

 1 2 3 

But this, obviously, is not enough information to perform operations such as matrix-vector multiplication! So we need to store additional information that tells us where each element belongs in the matrix.
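For instance, the CRS layout mentioned in the other answer stores two extra index arrays; a minimal sketch for the matrix above (array names are illustrative):

    val     = [1, 2, 3]     # the nonzero values, row by row
    col     = [0, 2, 1]     # column index of each stored value
    row_ptr = [0, 1, 2, 3]  # where each row begins in val/col

    def get(r, c):
        # scan the slice belonging to row r for column c
        for k in range(row_ptr[r], row_ptr[r + 1]):
            if col[k] == c:
                return val[k]
        return 0  # an element we never stored is an implicit zero

    print(get(1, 2))  # -> 2, recovered via the extra index arrays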

You can see that regardless of the storage scheme used, we need to:

  • do extra work to access elements, and
  • store additional information to preserve the matrix structure.

So, as you can see, sparse storage pays off only if there are enough zeros in the matrix to compensate for the extra information we keep in order to preserve its structure. For example, in the Yale format, we save memory only when the number of nonzero values (NNZ) is less than (m(n − 1) − 1) / 2, where m is the number of rows and n is the number of columns.
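A small sketch of that break-even arithmetic (it assumes, as the formula does, that a stored value and a stored index occupy the same space):

    # Yale/CRS keeps 2 * nnz + m + 1 numbers (values, column indices,
    # row pointers) versus m * n for dense storage; solving
    # 2 * nnz + m + 1 < m * n gives nnz < (m * (n - 1) - 1) / 2.
    def yale_saves_memory(m, n, nnz):
        return nnz < (m * (n - 1) - 1) / 2

    print(yale_saves_memory(1000, 1000, 300_000))  # True: ~30% fill still saves
    print(yale_saves_memory(1000, 1000, 600_000))  # False: too dense to pay off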

