GPGPU: Still bleeding edge?

Is GPGPU mature enough for prototyping and production use, or do you still consider it primarily a research/bleeding-edge technology? I work in computational biology, and GPGPU is starting to attract attention from the more computationally oriented people in the field, but most of the work so far seems to consist of porting well-known algorithms. Porting an algorithm is itself a research project, and the vast majority of people in the field know little about the technology.

I do some fairly compute-intensive work on ordinary multi-core CPUs, and I'm wondering how close GPGPU is to being usable both for prototyping new algorithms and for everyday production use. From reading Wikipedia, I get the impression that the programming model is strange (heavily SIMD) and somewhat limited (no recursion or virtual functions, though these restrictions are slowly being lifted, and the languages are no higher-level than C or a restricted subset of C++), and that there are several competing, mutually incompatible standards. I also get the impression that, unlike ordinary multi-core, fine-grained parallelism is the only game in town: basic library functions have to be rewritten, and you can't get big speedups simply by parallelizing the outer loop of your program and calling old-school library functions inside it.
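To make that last point concrete, here is a minimal CUDA sketch of my understanding of the fine-grained model (my own toy example; "score" is just a placeholder for whatever per-element work an inner loop would do). Each thread handles one element, and anything the kernel calls has to be device code rather than an existing host-side library routine:

    // Toy sketch: one thread per array element. "score" stands in for the
    // per-element work; it must be __device__ code, since the kernel cannot
    // call an ordinary host-side library function.
    #include <cuda_runtime.h>
    #include <vector>

    __device__ float score(float x) { return x * x + 1.0f; }

    __global__ void scoreAll(const float* in, float* out, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;  // one element per thread
        if (i < n) out[i] = score(in[i]);
    }

    int main() {
        const int n = 1 << 20;
        std::vector<float> h(n, 1.0f);
        float *dIn, *dOut;
        cudaMalloc(&dIn, n * sizeof(float));
        cudaMalloc(&dOut, n * sizeof(float));
        cudaMemcpy(dIn, h.data(), n * sizeof(float), cudaMemcpyHostToDevice);

        int threads = 256;
        scoreAll<<<(n + threads - 1) / threads, threads>>>(dIn, dOut, n);
        cudaMemcpy(h.data(), dOut, n * sizeof(float), cudaMemcpyDeviceToHost);

        cudaFree(dIn);
        cudaFree(dOut);
        return 0;
    }

If that is roughly right, then nothing the kernel touches can come from an old library unless someone has already ported it.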

How serious are these limitations in practice? Is GPGPU ready for serious use now, and if not, how long do you think it will take?

Edit: One of the main things I'm trying to wrap my head around is how different the programming model really is from that of a regular multi-core CPU with lots of really slow cores.

Edit #2: I guess I'd summarize the answers I've received as follows: GPGPU is practical enough for early adopters in the niches it fits very well, but still bleeding-edge enough that it can't be considered a “standard” tool like multi-core or distributed parallelism, even in those niches where performance matters.

+4
5 answers

I'm a CS graduate student who has worked a bit with GPGPU, and I know of at least one organization that is currently porting parts of its software to CUDA. Whether it's worth it really depends on how important performance is to you.

I think using CUDA will add a lot of cost to your project. First, the GPU landscape is very fragmented: even among NVIDIA cards there is a fairly wide range of feature sets, and code that runs on one GPU may not run on another. Second, the CUDA feature set, like the cards themselves, changes very quickly; whatever you write this year may well have to be rewritten in 2-3 years to take full advantage of new cards. Finally, as you point out, writing GPGPU programs is simply very hard, so parallelizing an existing algorithm for the GPU is often a publishable research project in its own right.
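On the fragmentation point, one common (if partial) mitigation is to query the device's compute capability at runtime and fall back to a CPU path when the card lacks features your kernels need. A minimal sketch, with the required version 1.3 (double precision support) chosen purely as an illustrative assumption:

    // Sketch: check compute capability before launching any kernels.
    #include <cuda_runtime.h>
    #include <cstdio>

    bool deviceIsUsable(int requiredMajor, int requiredMinor) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) return false;
        printf("GPU: %s, compute capability %d.%d\n", prop.name, prop.major, prop.minor);
        return prop.major > requiredMajor ||
               (prop.major == requiredMajor && prop.minor >= requiredMinor);
    }

    int main() {
        if (!deviceIsUsable(1, 3)) {   // 1.3 = needs double precision (illustrative)
            printf("Falling back to the CPU implementation.\n");
            return 0;
        }
        // ... launch CUDA kernels here ...
        return 0;
    }

It does not solve the rewrite-in-2-3-years problem, but it at least lets you fail gracefully on cards that do not support what you compiled for.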

You might want to look into existing CUDA libraries, such as CUBLAS, that you could use in your project; they can help insulate you from these problems.
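For instance, a matrix multiply through CUBLAS looks roughly like the sketch below (using the handle-based cuBLAS v2 API, with error checking omitted for brevity), and all of the GPU-specific kernel code stays inside the library:

    // Sketch: C = A * B on the GPU via CUBLAS, with no hand-written kernels.
    // Matrices are n x n, stored column-major as cuBLAS expects.
    #include <cublas_v2.h>
    #include <cuda_runtime.h>
    #include <vector>

    int main() {
        const int n = 512;
        std::vector<float> hA(n * n, 1.0f), hB(n * n, 2.0f), hC(n * n, 0.0f);

        float *dA, *dB, *dC;
        cudaMalloc(&dA, n * n * sizeof(float));
        cudaMalloc(&dB, n * n * sizeof(float));
        cudaMalloc(&dC, n * n * sizeof(float));
        cudaMemcpy(dA, hA.data(), n * n * sizeof(float), cudaMemcpyHostToDevice);
        cudaMemcpy(dB, hB.data(), n * n * sizeof(float), cudaMemcpyHostToDevice);

        cublasHandle_t handle;
        cublasCreate(&handle);
        const float alpha = 1.0f, beta = 0.0f;
        cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N,
                    n, n, n, &alpha, dA, n, dB, n, &beta, dC, n);

        cudaMemcpy(hC.data(), dC, n * n * sizeof(float), cudaMemcpyDeviceToHost);
        cublasDestroy(handle);
        cudaFree(dA); cudaFree(dB); cudaFree(dC);
        return 0;
    }

The same pattern applies to the other vendor libraries such as CUFFT: you manage device memory and call into routines someone else has already tuned.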

+4

There is no question that people are doing useful production computation with GPUs.

Basically, the computations that do well here are those that are close to embarrassingly parallel. Both CUDA and OpenCL let you express such computations in an only moderately painful way, so if you can cast your computation in that form you can do well. I don't think this restriction will ever be fundamentally removed; if it could be, ordinary CPUs could do the same thing. At least I wouldn't hold my breath.
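To give a concrete picture of that shape (a generic sketch of my own, not tied to any particular application): each thread evaluates one independent task and never communicates with its neighbours, so the kernel can be as simple as this:

    // Sketch of an embarrassingly parallel workload: one independent task per
    // thread, no communication between threads. "model" is a hypothetical
    // stand-in for whatever is being evaluated; the host-side setup looks like
    // any other CUDA program.
    #include <cuda_runtime.h>

    __device__ float model(float param, int steps) {
        float acc = 0.0f;
        for (int i = 0; i < steps; ++i)
            acc += param / (1.0f + i * param);   // real per-thread work, no sharing
        return acc;
    }

    __global__ void evaluateAll(const float* params, float* results, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) results[i] = model(params[i], 1000);
    }

If your computation cannot be decomposed into a large number of independent tasks of roughly this kind, because the pieces need to coordinate frequently or unpredictably, that is usually the sign of a poor fit.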

You should be able to tell whether your current application is mostly suitable just by looking at the existing code. As with most parallel programming systems, though, you won't know your actual performance until you've coded up the complete application. Unfortunately, there is no substitute for experience.

+5

CUDA is being used in production code in financial services now, and its use there is growing all the time.

It is not just “ready for serious use”; you have practically missed the boat.

+1

A somewhat indirect answer: I work in nonlinear mixed-effects modeling in pharmacometrics, and I've heard second-hand that CUDA has been tried. There is such a variety of algorithms in use, with new ones appearing all the time, that some look more SIMD-friendly than others, particularly those based on Markov chain Monte Carlo, which is also where I suspect the financial applications are.

The established simulation algorithms, though, are such big chunks of Fortran code, and their innermost loops are such complicated objective functions, that it is hard to see how a translation could be done even if opportunities for SIMD speedup could be found. The outer loops can be parallelized, which is what we do.
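To illustrate why the Markov chain Monte Carlo methods look SIMD-friendly (a generic sketch, not anything from our codebase; logTarget is a hypothetical placeholder density and would be far more complicated in a real model): you can run one independent Metropolis chain per thread, so every thread executes the same instruction stream on different data.

    // Sketch: many independent Metropolis chains, one per GPU thread, using cuRAND.
    #include <curand_kernel.h>

    __device__ float logTarget(float x) { return -0.5f * x * x; }  // placeholder density

    __global__ void runChains(float* samples, int nChains, int nSteps,
                              unsigned long long seed) {
        int c = blockIdx.x * blockDim.x + threadIdx.x;
        if (c >= nChains) return;

        curandState rng;
        curand_init(seed, c, 0, &rng);           // independent random stream per chain

        float x = 0.0f;
        for (int s = 0; s < nSteps; ++s) {
            float proposal = x + 0.5f * (2.0f * curand_uniform(&rng) - 1.0f);
            if (logf(curand_uniform(&rng)) < logTarget(proposal) - logTarget(x))
                x = proposal;                    // accept; otherwise keep current state
        }
        samples[c] = x;                          // keep the final state of each chain
    }

The hard part in codes like ours is not this outer structure but the fact that each evaluation of the objective function is itself a large, irregular piece of Fortran.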

+1

Computational biology algorithms tend to be less regular in structure than many of the financial algorithms that have been successfully ported to GPUs. That means they need some redesign at the algorithm level to exploit the massive amount of parallelism available on GPUs. You want dense and square data structures, and you want to architect your code around big "for" loops with few "if" statements.

It takes some rethinking, but it is possible, and we are starting to get interesting performance with a protein folding code parallelized with Ateji PX.
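As a small, generic illustration of the "few if statements" point (not the Ateji PX code itself): threads in a GPU warp execute in lock-step, so a data-dependent branch makes both paths run one after the other, and it is often better to express per-element choices arithmetically.

    // Sketch: the same clamp written with and without data-dependent branches.
    __global__ void clampBranchy(float* x, int n, float lo, float hi) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= n) return;
        if (x[i] < lo)      x[i] = lo;          // diverges when neighbouring
        else if (x[i] > hi) x[i] = hi;          // threads take different paths
    }

    __global__ void clampBranchless(float* x, int n, float lo, float hi) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= n) return;
        x[i] = fminf(fmaxf(x[i], lo), hi);      // same result, no divergent branch
    }

On something as small as a clamp the difference is negligible; the point matters when the branches contain substantial work.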

0
