Faster GPU XML Parsing

I need to improve the performance of a piece of software that parses XML files and loads their contents into a large SQL database. I tried to find out whether this can be implemented on the GPU. My research into both CUDA and OpenCL left me with no clear answers, except that the software can be developed in C/C++, Fortran, and many other languages using compiler directives to enable GPU processing. This leads me to ask: do I really need an API or library written specifically for GPU acceleration, or will a program written in C/C++ using a standard XML parsing library and compiled with CUDA/OpenCL compiler directives automatically run the XML library functions on the GPU?

+7
xml xml-parsing gpgpu
3 answers

In fact, I see no reason to parse XML on the GPU. GPU architectures are optimized for massively parallel floating-point computation, not for operations like text processing. It is much better to use the CPU and split the XML parsing across threads in order to use multiple cores. Using a GPU for such an application is, in my opinion, overkill.

+2

In general, GPUs are not well suited for speeding up XML processing... GPUs shine only when the task at hand has massive parallelism that can feed a large number of GPU processing units. XML processing, in contrast, is largely a single-threaded, state-machine-transition kind of job.

+2

First, consider the structure of your XML. The following link gives criteria for an XML structure that is suitable for parallel processing: Concurrent XML parsing in Java

If your XML structure can be processed in parallel, here are a few ideas:

As far as I know, parsing XML requires a stack structure in order to remember the current position in the tree and to verify that nodes open and close correctly.

The stack can be represented as a one-dimensional array with a stack pointer. The stack pointer holds the index of the top element in the array.

It is said that you can store arrays in 1D textures (at most 4,096 elements) or in 2D textures (at most 16,777,216 = 4,096 x 4,096 elements)... See the following link for more information: https://developer.nvidia.com/gpugems/GPUGems2/gpugems2_chapter33.html

If you assign a distinct floating-point number to each unique element name, then you can store the elements as numbers.

If you treat the input text as an array of ASCII/UTF-8 codes, then why not store them as an array of floating-point numbers as well?

The last thing that matters for using the GPU is the output structure.

If you need, for example, a table of fixed-length columns, then it is only a question of how to represent such a structure in a 1D or 2D array of floating-point numbers.

Once you are confident about the previous points and that the GPU is right for you, just write functions to convert your data into textures and the textures back into your data.

And then, of course, write the whole XML parser...

I have never tried GPU programming myself, but none of this seems impossible to me...

Someone has to be the first to build the whole algorithm and find out whether the GPU can be used efficiently or not.

0
