How to implement handles for a CUDA driver API library?

Note: The question has been updated to address the points raised in the comments, and to emphasize that the core of the question is about the interdependencies between the Runtime and Driver APIs.

CUDA runtime libraries (such as CUBLAS or CUFFT) typically use the concept of a "handle" that summarizes the state and context of such a library. The usage pattern is quite simple:

// Create a handle
cublasHandle_t handle;
cublasCreate(&handle);

// Call some functions, always passing in the handle as the first argument
cublasSscal(handle, ...);

// When done, destroy the handle
cublasDestroy(handle);

However, there are many subtle details about how these handles interact with Driver and Runtime contexts, as well as with multiple threads and devices. The documentation contains several scattered details about context handling.

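Some of this information does not seem to be entirely up to date (for example, I think one should use cuCtxSetCurrent instead of cuCtxPushCurrent and cuCtxPopCurrent?), some of it seems to date from before the "primary context" handling was exposed via the Driver API, and some parts are oversimplified: they show only the simplest usage patterns, make only vague or incomplete statements about multithreading, or cannot be applied to the concept of "handles" that is used in the runtime libraries.

My goal is to implement a runtime library that offers its own "handle" type, and that allows usage patterns equivalent to those of the other runtime libraries in terms of context handling and thread safety.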

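If the library can be implemented internally using only the Runtime API, things may be clear: context management is solely the responsibility of the user. If the user creates their own driver context, the rules stated in the documentation about Runtime and Driver context management apply. Otherwise, the Runtime API functions take care of handling the primary contexts.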

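However, there may be cases where a library has to use the Driver API internally. For example, in order to load PTX files as CUmodule objects and obtain CUfunction objects from them. When the library should behave like a runtime library from the user's point of view, but has to use the Driver API internally, questions arise about how the context handling has to be implemented "under the hood".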

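What I have figured out so far is sketched here.

(It is "pseudocode" in that it omits error checks and other details, and ... all of this is supposed to be implemented in Java, but that should not be relevant here)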

1. The "Handle" is basically a class/struct that contains the following information:

class Handle 
{
    CUcontext context;
    boolean usingPrimaryContext;
    CUdevice device;
}

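2. When it is created, two cases have to be covered: it may be created while a driver context is current for the calling thread. In this case, it should use that context. Otherwise, it should use the primary context of the current (runtime) device: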

Handle createHandle()
{
    cuInit(0);

    // Obtain the current context
    CUcontext context;
    cuCtxGetCurrent(&context);

    CUdevice device;

    // If there is no context, use the primary context
    boolean usingPrimaryContext = false;
    if (context == nullptr)
    {
        usingPrimaryContext = true;

        // Obtain the device that is currently selected via the runtime API
        int deviceIndex;
        cudaGetDevice(&deviceIndex);

        // Obtain the device and its primary context
        cuDeviceGet(&device, deviceIndex);
        cuDevicePrimaryCtxRetain(&context, device);
        cuCtxSetCurrent(context);
    }
    else
    {
        cuCtxGetDevice(&device);
    }

    // Create the actual handle. This might internally allocate
    // memory or do other things that are specific for the context
    // for which the handle is created
    Handle handle = new Handle(device, context, usingPrimaryContext);
    return handle;
}

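3. When a kernel of the library is invoked, the context of the associated handle is made current for the calling thread: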

void someLibraryFunction(Handle handle)
{
    cuCtxSetCurrent(handle.context);
    callMyKernel(...);
}

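Here, one could argue that the caller is responsible for making sure that the required context is current. But if the handle was created for a primary context, then this context is made current automatically.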
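One possible refinement of `someLibraryFunction` is to restore whatever context the caller had current, instead of leaving the handle's context current after the call. The sketch below shows the idea as an RAII guard. To keep it runnable without a GPU, it uses a hypothetical `FakeDriver` that merely imitates a per-thread context stack; `FakeDriver`, `Context` and `ScopedContext` are names made up for this illustration. With the real Driver API, the push and pop operations would presumably map to `cuCtxPushCurrent` and `cuCtxPopCurrent`, and `Context` would be `CUcontext`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the driver's per-thread context stack.
// Assumption: with the real Driver API, push/pop would be
// cuCtxPushCurrent / cuCtxPopCurrent and Context would be CUcontext.
using Context = int;  // 0 plays the role of "no context"

struct FakeDriver {
    static std::vector<Context>& stack() {
        thread_local std::vector<Context> s;  // one stack per thread
        return s;
    }
    static void push(Context c) { stack().push_back(c); }
    static void pop() { stack().pop_back(); }
    static Context current() { return stack().empty() ? 0 : stack().back(); }
};

// RAII guard: makes `ctx` current for the enclosing scope and restores
// the previously current context when the scope is left.
template <typename Driver>
class ScopedContext {
public:
    explicit ScopedContext(Context ctx) { Driver::push(ctx); }
    ~ScopedContext() { Driver::pop(); }
    ScopedContext(const ScopedContext&) = delete;
    ScopedContext& operator=(const ScopedContext&) = delete;
};

// Shape of a library function that uses the guard. It returns the
// context that was current at the point where a kernel would be launched.
template <typename Driver>
Context someLibraryFunction(Context handleContext) {
    ScopedContext<Driver> guard(handleContext);
    return Driver::current();  // the kernel launch would happen here
}
```

With this pattern, a caller that had its own context current before the call finds the same context current afterwards. Whether this matches what CUBLAS does internally is not something the sketch can answer.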

4. When the handle is destroyed, cuDevicePrimaryCtxRelease has to be called, but only if the context is a primary context:

void destroyHandle(Handle handle)
{
    if (handle.usingPrimaryContext)
    {
        cuDevicePrimaryCtxRelease(handle.device);
    }
}
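Since each `cuDevicePrimaryCtxRetain` has to be balanced by exactly one `cuDevicePrimaryCtxRelease`, one way to make `destroyHandle` harder to misuse is to tie the release to the lifetime of a move-only object. The following is only a sketch of that ownership logic: `FakePrimaryCtx` is a hypothetical stand-in that counts retain and release calls per device so that the example runs without a GPU, and the `Handle` template here is an illustration, not the class from the question. With the real driver, `retain` and `release` would wrap `cuDevicePrimaryCtxRetain` and `cuDevicePrimaryCtxRelease`.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Hypothetical stand-in that only counts retain/release calls per device.
// Assumption: the real calls would be cuDevicePrimaryCtxRetain(&ctx, dev)
// and cuDevicePrimaryCtxRelease(dev).
struct FakePrimaryCtx {
    static std::map<int, int>& refCounts() {
        static std::map<int, int> counts;
        return counts;
    }
    static void retain(int device) { ++refCounts()[device]; }
    static void release(int device) { --refCounts()[device]; }
};

// Move-only handle: retains the primary context on construction (if it
// is supposed to use it) and releases it exactly once on destruction.
template <typename Driver>
class Handle {
public:
    Handle(int device, bool usingPrimaryContext)
        : device_(device), ownsPrimary_(usingPrimaryContext) {
        if (ownsPrimary_) Driver::retain(device_);
    }
    // Moving transfers the obligation to release; the source gives it up.
    Handle(Handle&& other) noexcept
        : device_(other.device_),
          ownsPrimary_(std::exchange(other.ownsPrimary_, false)) {}
    Handle(const Handle&) = delete;
    Handle& operator=(const Handle&) = delete;
    Handle& operator=(Handle&&) = delete;
    ~Handle() {
        if (ownsPrimary_) Driver::release(device_);
    }

private:
    int device_;
    bool ownsPrimary_;
};
```

A handle that was created on top of a user-provided driver context (`usingPrimaryContext == false`) never touches the reference count, which mirrors the condition in `destroyHandle`.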

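From my experiments so far, this seems to expose the same behavior as a CUBLAS handle, for example. But my options for testing this thoroughly are limited, because I only have a single device and thus cannot test the crucial cases, e.g. having two contexts, one for each of two devices.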

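So, my questions are:

  • Are there any established patterns for implementing such a "Handle"?
  • Are there any usage patterns (e.g. with multiple devices and one context per device) that could not be covered by the approach sketched above, but would be covered by the "handle" implementation of CUBLAS?
  • More generally: are there any recommendations for how to improve the current "Handle" implementation?
  • Rhetorical: is the source code of the CUBLAS handle implementation available somewhere?

(I also had a look at the context handling in tensorflow, but I am not sure whether one can derive recommendations about how to implement handles for a runtime library from that...)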

(An "Update" has been removed here, because it was added in response to the comments, and should no longer be relevant)
