Solving a small symmetric positive definite system Ax = b entirely on the GPU

I am trying to optimize a real-time 3D modeling application. The computational part of the application runs almost entirely on the GPU in CUDA. The application needs to solve a small (6x6) double-precision symmetric positive definite linear system Ax = b 500+ times per second. Currently this is done with an efficient CPU-based linear algebra library using Cholesky factorization, but that requires copying data from the GPU to the CPU and back to the GPU hundreds of times per second, incurring kernel launch overhead each time, etc.

How can I compute the solution of the linear system on the GPU without having to transfer data to the CPU? I have looked a little at the MAGMA library, but it seems to use hybrid CPU/GPU algorithms rather than GPU-only algorithms.

I am prepared for a single linear system solve on the GPU to be much slower than with the existing CPU-based library, but I want to see whether this can be offset by eliminating the host-device data transfers, the kernel launch overhead, etc., that are currently incurred hundreds of times per second. If there is no GPU-only LAPACK-like alternative, how could I implement something that solves this specific 6x6 case on the GPU only? Can this be done without a huge time investment by using the GPU BLAS libraries?

algorithm gpu cuda linear-algebra solver
1 answer

Last fall NVIDIA posted code for a batched Ax = b solver on its registered developer website. The code works for general matrices and should work well enough for your needs, provided you can expand the symmetric matrices into full matrices (that shouldn't be a problem for 6x6?). Since the code performs pivoting, which is not needed for positive definite matrices, it is not optimal for your case, but you may be able to modify it for your purposes, since the code is under a BSD license.
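To illustrate why pivoting can be dropped in the positive definite case, here is a minimal sketch of a hand-written, pivot-free Cholesky solve for a single 6x6 system. This is not the NVIDIA batched solver code. The names `solve_spd6`, `HD` and `kN` are made up for this example, and row-major storage with only the lower triangle being read is an assumption. The `HD` macro lets the same function compile as plain C++ and, under nvcc, as a `__host__ __device__` function.

```cpp
#include <math.h>

// Hypothetical helper macro: device-callable under nvcc, plain C++ otherwise.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

constexpr int kN = 6;  // fixed system size for this sketch

// Solve A x = b for a 6x6 symmetric positive definite A.
// Assumption: A is stored row-major and only its lower triangle is read.
// Returns false if a non-positive pivot shows up (A is not numerically SPD).
HD inline bool solve_spd6(const double* A, const double* b, double* x)
{
    double L[kN][kN];  // only the lower triangle is written and read

    // Cholesky factorization A = L * L^T, column by column, no pivoting.
    for (int j = 0; j < kN; ++j) {
        double d = A[j * kN + j];
        for (int k = 0; k < j; ++k) d -= L[j][k] * L[j][k];
        if (!(d > 0.0)) return false;
        L[j][j] = sqrt(d);
        for (int i = j + 1; i < kN; ++i) {
            double s = A[i * kN + j];
            for (int k = 0; k < j; ++k) s -= L[i][k] * L[j][k];
            L[i][j] = s / L[j][j];
        }
    }

    // Forward substitution: L y = b.
    double y[kN];
    for (int i = 0; i < kN; ++i) {
        double s = b[i];
        for (int k = 0; k < i; ++k) s -= L[i][k] * y[k];
        y[i] = s / L[i][i];
    }

    // Back substitution: L^T x = y.
    for (int i = kN - 1; i >= 0; --i) {
        double s = y[i];
        for (int k = i + 1; k < kN; ++k) s -= L[k][i] * x[k];
        x[i] = s / L[i][i];
    }
    return true;
}
```

Because the function is device-callable, it could in principle be invoked by one thread from inside a kernel that already has A and b in device memory, which would avoid both the host round trip and a separate kernel launch. Whether a single-thread solve like this is fast enough for the application would have to be measured.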

There are currently some issues with the NVIDIA developer website. Here is how you can download the batched solver code at this time:

(1) Go to http://www.nvidia.com/content/cuda/cuda-toolkit.html

(2) If you have an existing NVdeveloper account (for example, through partner.nvidia.com), click the green "Login to nvdeveloper" link in the right half of the screen. Otherwise, click "Join nvdeveloper" to apply for a new account; requests for new accounts are usually approved within one business day.

(3) Log in at the prompt with your email address and password

(4) There is a section on the right side called "New Downloads". The fifth item from the top is "Batched Solver". Click on it and it will take you to the download page for the code.

(5) Click the download link, then click Accept to accept the license terms. Your download should begin.
