I am wondering if there is a difference between:
    // cumalloc.c - Create a vector on the device
    HOST float * cudamath_vector(const float * h_vector, const int m)
    {
        float *d_vector = NULL;
        cudaError_t cudaStatus;
        cublasStatus_t cublasStatus;

        cudaStatus = cudaMalloc(&d_vector, sizeof(float) * m );
        if(cudaStatus == cudaErrorMemoryAllocation) {
            printf("ERROR: cumalloc.cu, cudamath_vector() : cudaErrorMemoryAllocation");
            return NULL;
        }

        /* THIS: */
        cublasSetVector(m, sizeof(*d_vector), h_vector, 1, d_vector, 1);

        /* OR THAT: */
        cudaMemcpy(d_vector, h_vector, sizeof(float) * m, cudaMemcpyHostToDevice);

        return d_vector;
    }
cublasSetVector() has two arguments, incx and incy, and the documentation says:
The storage spacing between consecutive elements is given by incx for the source vector x and by incy for the destination vector y.
On the NVIDIA forum, someone said:
iona_me: "incx and incy are steps measured in floats."
Does this mean that for incx = incy = 1 all elements of a float[] will be sizeof(float)-aligned, and for incx = incy = 2 there will be sizeof(float) bytes of padding between each element?
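To make the question concrete, here is how I read "steps measured in floats", written as a minimal host-only C sketch. Note that strided_copy_f32 is a hypothetical helper, not part of cuBLAS, and it only mimics the indexing I assume cublasSetVector() applies to float data; it does not touch the GPU at all.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host-side stand-in for a strided vector copy.
 * This is NOT the cuBLAS implementation, only a sketch of the indexing.
 *   n    : number of elements to copy
 *   x    : source array, read at x[0], x[incx], x[2*incx], ...
 *   incx : source stride, counted in elements (floats), not bytes
 *   y    : destination array, written at y[0], y[incy], y[2*incy], ...
 *   incy : destination stride, counted in elements (floats), not bytes
 */
static void strided_copy_f32(int n, const float *x, int incx,
                             float *y, int incy)
{
    assert(incx > 0 && incy > 0);
    for (int i = 0; i < n; ++i) {
        y[(size_t)i * incy] = x[(size_t)i * incx];
    }
}
```

Under this reading, incx = incy = 1 is a plain contiguous copy of n floats, which is what the cudaMemcpy() call above transfers. With incx = 2 only every second source element is read, and with incy = 2 every second destination slot is left untouched, so there is a gap of one float between consecutive elements.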
- Apart from these two parameters and the cublasHandle, does cublasSetVector() do anything else that cudaMemcpy() does not?
- Is it safe to pass a vector/matrix that was not created with the corresponding cublas*() function to other CUBLAS functions that operate on it?
cuda cublas
Stefan falk