I am trying to profile an OpenCL application, a.out, on a system with an NVIDIA TITAN X and CUDA 8.0.
If it were a CUDA application, nvprof ./a.out would be enough. However, this does not work with an OpenCL application: nvprof only reports "No kernels have been profiled."
Up to CUDA 7.5, I successfully used COMPUTE_PROFILE=1, following this. Unfortunately, the documentation now says: "Support for the command line profiler using the environment variable COMPUTE_PROFILE was removed in the CUDA 8.0 release."
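A minimal sketch of the older approach, assuming the binary is ./a.out in the current directory. The helper name profile_legacy is hypothetical and only wraps the environment variable; it is not an NVIDIA tool, and it has no effect once support for COMPUTE_PROFILE is gone.

```shell
#!/bin/sh
# Run a command with the legacy command-line profiler enabled.
# COMPUTE_PROFILE=1 is the variable mentioned above; according to the
# quoted documentation, support for it was removed in CUDA 8.0.
# profile_legacy is a hypothetical helper name used for illustration.
profile_legacy() {
    # The assignment applies only to the environment of the launched command.
    COMPUTE_PROFILE=1 "$@"
}

# Usage (assumes ./a.out exists in the current directory):
#   profile_legacy ./a.out   # legacy profiler, CUDA 7.5 and earlier
#   nvprof ./a.out           # enough for a CUDA app; finds no kernels for OpenCL
```

The wrapper sets the variable for a single run only, so it does not affect other commands in the same shell session.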
So my question is: is there any way, other than downgrading CUDA, to profile an OpenCL application with nvprof?
profiling opencl cuda nvprof
csehydrogen