Is it possible to share a CUDA context between applications?

I would like to pass a CUDA context between two independent Linux processes (using the POSIX message queues I have already set up).

Using cuCtxPopCurrent() and cuCtxPushCurrent(), I can get a context pointer, but that pointer refers to memory inside the process that calls the function, so passing it between processes is pointless.
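For concreteness, here is a minimal sketch of what I am doing now (driver API, error checking omitted):

    /* The CUcontext returned by cuCtxPopCurrent() is an opaque handle
       that only has meaning inside this process's address space. */
    #include <cuda.h>
    #include <stdio.h>

    int main(void)
    {
        CUdevice  dev;
        CUcontext ctx;

        cuInit(0);
        cuDeviceGet(&dev, 0);
        cuCtxCreate(&ctx, 0, dev);   /* context becomes current */

        cuCtxPopCurrent(&ctx);       /* detach it, get the handle */
        printf("ctx handle: %p\n", (void *)ctx);  /* process-local */

        cuCtxPushCurrent(ctx);       /* fine here, meaningless in
                                        another process */
        cuCtxDestroy(ctx);
        return 0;
    }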

I am looking for other solutions. My ideas so far:

  • Try to make a deep copy of the CUcontext structure and pass the copy.
  • See whether a shared-memory solution is possible, where all my CUDA pointers live in memory that both processes can access.
  • Merge the two processes into a single program.
  • CUDA 4.0 may offer better context sharing that I could switch to.

I am not sure whether option (1) is even possible, or whether (2) is available or feasible. (3) is not really an option if I want to keep things generic (this is inside a capture framework). (4) I will look at CUDA 4.0, but I am not sure it will work there either.

Thanks!

2 answers

In a word, no. Contexts are implicitly tied to the thread and the application that created them. There is no portability between separate applications. This is much the same with OpenGL and the various versions of Direct3D as well: sharing memory between applications is not supported.

CUDA 4 makes the API thread safe, so that a single host thread can hold more than one context (and thus drive more than one GPU) at the same time, using the canonical device selection API to choose which GPU it is working with. That will not help here, if I understand your question/application correctly.
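To illustrate (a rough sketch, error checking omitted): under CUDA 4, one host thread can create a context per device and switch between them with cuCtxSetCurrent():

    #include <cuda.h>

    int main(void)
    {
        CUdevice  dev0, dev1;
        CUcontext ctx0, ctx1;

        cuInit(0);
        cuDeviceGet(&dev0, 0);
        cuDeviceGet(&dev1, 1);

        cuCtxCreate(&ctx0, 0, dev0);  /* current after creation */
        cuCtxCreate(&ctx1, 0, dev1);  /* now ctx1 is current */

        cuCtxSetCurrent(ctx0);        /* work on GPU 0 ... */
        cuCtxSetCurrent(ctx1);        /* ... then on GPU 1, same thread */

        cuCtxDestroy(ctx0);
        cuCtxDestroy(ctx1);
        return 0;
    }

But the contexts themselves still live and die with the process that created them.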


I disagree with @talonmies' answer. cuCtxCreate() can be called from any process attached to a particular device, and contexts are only loosely coupled to threads. In general, OS threads can change their current CUDA context through the CUDA driver API, so contexts do not appear to be permanently bound to a thread or process ID. A thread's CUDA API calls operate on whichever CUDA context is "current" for it, and sometimes a context is even silently initialized for the CUDA client, according to the CUDA docs.
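As a sketch of what I mean (hypothetical, error checking omitted), two threads can take turns using a context that neither of them created:

    #include <cuda.h>
    #include <pthread.h>

    static CUcontext g_ctx;   /* created once, used by any thread */

    static void *worker(void *arg)
    {
        CUdeviceptr p;
        (void)arg;
        cuCtxPushCurrent(g_ctx);   /* borrow the shared context */
        cuMemAlloc(&p, 1 << 20);   /* any driver API work */
        cuMemFree(p);
        cuCtxPopCurrent(NULL);     /* give it back */
        return NULL;
    }

    int main(void)
    {
        CUdevice dev;
        pthread_t t1, t2;

        cuInit(0);
        cuDeviceGet(&dev, 0);
        cuCtxCreate(&g_ctx, 0, dev);
        cuCtxPopCurrent(NULL);     /* detach from the creating thread */

        pthread_create(&t1, NULL, worker, NULL);
        pthread_join(t1, NULL);
        pthread_create(&t2, NULL, worker, NULL);
        pthread_join(t2, NULL);

        cuCtxDestroy(g_ctx);
        return 0;
    }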

There must be a way to share a CUDA context between multiple processes, because that is what CUDA MPS does: a single server holds a CUDA context on behalf of multiple CUDA clients. You could write your own CUDA MPS by using LD_PRELOAD to intercept CUDA driver API calls (see the cuHook CUDA sample) and keeping the CUDA context object somewhere in memory shared by all of your CUDA client processes. CUDA contexts are documented to be thread-safe, so locking mechanisms may not even be necessary unless you want to enforce an external ordering. The actual CUDA driver API calls can be wrapped in cuCtxPushCurrent() and cuCtxPopCurrent() so that your client code always safely uses the shared global CUDA context.
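A minimal, hypothetical shim in the spirit of the cuHook sample (an interception sketch only, not a working MPS; the build line is an assumption). A real interceptor would cover the context-management entry points the same way:

    /* Build: gcc -shared -fPIC -o shim.so shim.c -ldl
       Run:   LD_PRELOAD=./shim.so ./your_cuda_client
       Note: cuda.h renames many entry points via macros
       (e.g. cuCtxCreate -> cuCtxCreate_v2), so intercept the
       exported symbol name. */
    #define _GNU_SOURCE
    #include <cuda.h>
    #include <dlfcn.h>
    #include <stdio.h>

    CUresult cuInit(unsigned int flags)
    {
        /* find the real implementation in libcuda.so */
        CUresult (*real_cuInit)(unsigned int) =
            (CUresult (*)(unsigned int))dlsym(RTLD_NEXT, "cuInit");

        fprintf(stderr, "shim: intercepted cuInit(%u)\n", flags);
        /* a home-grown MPS would contact its context-owning
           server process here */
        return real_cuInit(flags);
    }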

