cuPrintf problem

I am trying to copy an array of structs to the device. I am working with a single GPU at the moment, and I have a problem with the cuPrintf function that I use to debug my code.

My struct is defined as follows:

    struct Node
    {
        char Key[25];
        char ConsAlterKey[25];
        char MasterKey[3];
        int VowelDeletion;
        char Data[6];
        char MasterData[6];
        int Children[35];
        int ChildCount;
    };

and for testing purposes, I populate an array of structures as follows:

    void FillArray(Node *NodeArray)
    {
        for(int i=0;i<TotalNodeCount;i++)
        {
            strcpy(NodeArray[i].Key,"Key");
            strcpy(NodeArray[i].ConsAlterKey,"ConsAlterKey");
            strcpy(NodeArray[i].MasterKey,"Mk");
            NodeArray[i].VowelDeletion=0;
            strcpy(NodeArray[i].Data,"Data");
            strcpy(NodeArray[i].MasterData,"Mdata");
            NodeArray[i].ChildCount=5;
            for(int j =0;j<NodeArray[i].ChildCount;j++)
            {
                NodeArray[i].Children[j]=i+j;
            }
        }
    }

My main function looks like this:

    int main()
    {
        Node *NodeArray;
        Node *GpuTree;
        int tokenCount=0;
        int *tokenCountGPU;

        NodeArray =(Node *)malloc(sizeof(Node)*(TotalNodeCount));
        FillArray(NodeArray);
        printf("Filling test : %s\n", NodeArray[13].Key);

        gpuAssert(cudaMalloc( (void**)&GpuTree, sizeof(Node)*(TotalNodeCount)));
        gpuAssert(cudaMemcpy(GpuTree, NodeArray,sizeof(Node)*(TotalNodeCount), cudaMemcpyHostToDevice));

        //test value
        tokenCount=35;

        gpuAssert( cudaMalloc((void **)&tokenCountGPU, sizeof(int)) );
        gpuAssert( cudaMemcpy(tokenCountGPU, &tokenCount, sizeof(int), cudaMemcpyHostToDevice) );

        cudaPrintfInit();
        Test <<< 1, tokenCount >>> (GpuTree,tokenCountGPU);
        cudaPrintfDisplay(stdout, true);
        cudaPrintfEnd();
        gpuAssert( cudaGetLastError() );

        //TODO:free pointers
        return(0);
    }

and if I write a test function as shown below:

    __global__ void Test(Node *Trie,int *tokenCount)
    {
        if (threadIdx.x < *tokenCount)
        {
            cuPrintf("%s\n",Trie[threadIdx.x].Key);
        }
        return;
    }

I get the output as follows:

    Filling test : Key
    [0, 0]: <
    [0, 1]: ¢☺!
    [0, 2]: ì☺!
    [0, 3]: Γ„β˜»!
    [0, 4]: oβ™₯!
    [0, 5]: t♦!
    [0, 6]: L♣!
    [0, 7]: $β™ !
    [0, 8]: ΓΌβ™ !
    [0, 9]: Γ”!
    [0, 10]: !
    [0, 11]: "
    [0, 12]: \ !
    [0, 13]: 4β™‚!
    [0, 14]: ♀♀!
    [0, 15]: À♀!
    !0, 16]: ΒΌ
    [0, 17]: "β™«!
    [0, 18]: l☼!
    [0, 19]: Dβ–Ί!
    [0, 20]: βˆŸβ—„!
    [0, 21]: Γ΄β—„!
    [0, 22]: ΓŒβ†•!
    [0, 23]: Β€β€Ό!
    [0, 24]: |ΒΆ!
    [0, 25]: TΒ§!
    [0, 26]: ,β–¬!
    [0, 27]: ♦↨!
    [0, 28]: Γœβ†¨!
    [0, 29]: ´↑!
    [0, 30]: O↓!
    [0, 31]: dβ†’!
    [0, 32]: <←!
    [0, 33]: ¢∟!
    [0, 34]: ì∟!

but if I change my testing method to this:

    __global__ void Test(Node *Trie,int *tokenCount)
    {
        if (threadIdx.x < *tokenCount)
        {
            cuPrintf("%c%c%c\n",
                     Trie[threadIdx.x].Key[0],
                     Trie[threadIdx.x].Key[1],
                     Trie[threadIdx.x].Key[2]);
        }
        return;
    }

then I get the correct output:

    Filling test : Key
    [0, 0]: Key
    [0, 1]: Key
    [0, 2]: Key
    [0, 3]: Key
    [0, 4]: Key
    [0, 5]: Key
    [0, 6]: Key
    [0, 7]: Key
    [0, 8]: Key
    [0, 9]: Key
    [0, 10]: Key
    [0, 11]: Key
    [0, 12]: Key
    [0, 13]: Key
    [0, 14]: Key
    [0, 15]: Key
    [0, 16]: Key
    [0, 17]: Key
    [0, 18]: Key
    [0, 19]: Key
    [0, 20]: Key
    [0, 21]: Key
    [0, 22]: Key
    [0, 23]: Key
    [0, 24]: Key
    [0, 25]: Key
    [0, 26]: Key
    [0, 27]: Key
    [0, 28]: Key
    [0, 29]: Key
    [0, 30]: Key
    [0, 31]: Key
    [0, 32]: Key
    [0, 33]: Key
    [0, 34]: Key

So the question is: why do I get corrupted output when I try to print strings using "%s"?


The problem is resolved. It turns out to be caused by cuPrintf limitations that I was not aware of. Thanks.

Here is a little test:

    __global__ void Test(Node *Trie,int *tokenCount)
    {
        const char *Key="Key";
        char *KeyPointer="Key";
        char KeyArray[4]="Key";

        cuPrintf("Constant : %s - Array :%s - Pointer : %s - Casting Pointer : %s - Casting Array : %s\n",
                 Key, KeyArray,KeyPointer,(const char *)KeyPointer,(const char *)KeyArray);

        //cuPrintf("%s\n",Trie[threadIdx.x].Key);
        //cuPrintf("%d\n",*tokenCount);
    }

This gives the output:

    [0, 0]: Constant : Key - Array : - Pointer : ♀ - Casting Pointer : Key - Casting Array : Key
    [0, 1]: Constant : Key - Array : - Pointer : ♀ - Casting Pointer : Key - Casting Array : Key
    [0, 2]: Constant : Key - Array : - Pointer : ♀ - Casting Pointer : Key - Casting Array : Key
    [0, 3]: Constant : Key - Array : - Pointer : ♀ - Casting Pointer : Key - Casting Array : Key
    [0, 4]: Constant : Key - Array : - Pointer : ♀ - Casting Pointer : Key - Casting Array : Key
    [0, 5]: Constant : Key - Array : - Pointer : ♀ - Casting Pointer : Key - Casting Array : Key
    [0, 6]: Constant : Key - Array : - Pointer : ♀ - Casting Pointer : Key - Casting Array : Key
    [0, 7]: Constant : Key - Array : - Pointer : ♀ - Casting Pointer : Key - Casting Array : Key
    [0, 8]: Constant : Key - Array : - Pointer : ♀ - Casting Pointer : Key - Casting Array : Key
    [0, 9]: Constant : Key - Array : - Pointer : ♀ - Casting Pointer : Key - Casting Array : Key
    [0, 10]: Constant : Key - Array : - Pointer : ♀ - Casting Pointer : Key - Casting Array : Key
    [0, 11]: Constant : Key - Array : - Pointer : ♀ - Casting Pointer : Key - Casting Array : Key
    [0, 12]: Constant : Key - Array : - Pointer : ♀ - Casting Pointer : Key - Casting Array : Key
    [0, 13]: Constant : Key - Array : - Pointer : ♀ - Casting Pointer : Key - Casting Array : Key
    [0, 14]: Constant : Key - Array : - Pointer : ♀ - Casting Pointer : Key - Casting Array : Key
    [0, 15]: Constant : Key - Array : - Pointer : ♀ - Casting Pointer : Key - Casting Array : Key
    [0, 16]: Constant : Key - Array : - Pointer : ♀ - Casting Pointer : Key - Casting Array : Key
    [0, 17]: Constant : Key - Array : - Pointer : ♀ - Casting Pointer : Key - Casting Array : Key
    [0, 18]: Constant : Key - Array : - Pointer : ♀ - Casting Pointer : Key - Casting Array : Key
    [0, 19]: Constant : Key - Array : - Pointer : ♀ - Casting Pointer : Key - Casting Array : Key
    [0, 20]: Constant : Key - Array : - Pointer : ♀ - Casting Pointer : Key - Casting Array : Key
    [0, 21]: Constant : Key - Array : 
    - Pointer : ♀ - Casting Pointer : Key - Casting Array : Key
    [0, 22]: Constant : Key - Array : - Pointer : ♀ - Casting Pointer : Key - Casting Array : Key
    [0, 23]: Constant : Key - Array : - Pointer : ♀ - Casting Pointer : Key - Casting Array : Key
    [0, 24]: Constant : Key - Array : - Pointer : ♀ - Casting Pointer : Key - Casting Array : Key
    [0, 25]: Constant : Key - Array : - Pointer : ♀ - Casting Pointer : Key - Casting Array : Key
    [0, 26]: Constant : Key - Array : - Pointer : ♀ - Casting Pointer : Key - Casting Array : Key
    [0, 27]: Constant : Key - Array : - Pointer : ♀ - Casting Pointer : Key - Casting Array : Key
    [0, 28]: Constant : Key - Array : - Pointer : ♀ - Casting Pointer : Key - Casting Array : Key
    [0, 29]: Constant : Key - Array : - Pointer : ♀ - Casting Pointer : Key - Casting Array : Key
    [0, 30]: Constant : Key - Array : - Pointer : ♀ - Casting Pointer : Key - Casting Array : Key
    [0, 31]: Constant : Key - Array : - Pointer : ♀ - Casting Pointer : Key - Casting Array : Key
    [0, 32]: Constant : Key - Array : - Pointer : ♀ - Casting Pointer : Key - Casting Array : Key
    [0, 33]: Constant : Key - Array : - Pointer : ♀ - Casting Pointer : Key - Casting Array : Key
    [0, 34]: Constant : Key - Array : - Pointer : ♀ - Casting Pointer : Key - Casting Array : Key
3 answers

Take a look at the cuPrintf documentation (the readme is at C/src/simplePrintf/doc/cuPrintf_readme.htm, relative to the base directory where you installed the SDK).

In the "Limitations / Known Issues" section, item 2 answers your question:

Limitations / Known Issues

Currently, the following limitations and restrictions apply to cuPrintf:

  • Buffer size is rounded to the nearest factor of 256
  • Arguments associated with "%s" string format specifiers must be of type (const char *)
  • To print the value of a (const char *) pointer, it must first be converted to (char *). All (const char *) arguments are interpreted as strings.
  • Non-zero return code does not match standard C printf()
  • The printf buffer cannot be output asynchronously (i.e. while the kernel is running)
  • A call to cudaPrintfDisplay implicitly issues a cudaDeviceSynchronize()
  • The restrictions applied by cuPrintfRestrict persist between launches. To clear them from the host side, you must call cudaPrintfEnd(), then cudaPrintfInit() again
  • cuPrintf output is undefined if multiple modules are loaded into a single context
  • Compile with "-arch=sm_11" or better when possible. Buffer usage is much more efficient and register usage is lower
  • Supported format specifiers: "cdiouxXeEfgGaAs"
  • The behavior of format specifiers, especially justification / size specifiers, depends on the host machine's printf implementation
  • cuPrintf requires applications to be created using the CUDA APIs

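In your case, you are not passing const char * arguments: Trie[threadIdx.x].Key has type char[25], which decays to char *. Casting the argument, as in cuPrintf("%s\n", (const char *)Trie[threadIdx.x].Key), satisfies the restriction; the "Casting Array" column in your own test shows this working.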


In your latest update, you need to multiply slenz by sizeof(char) when copying. So it should be:

 gpuAssert( cudaMemcpy(strGPU, str, slenz*sizeof(char), cudaMemcpyHostToDevice)); 

One member of your structure is

  char MasterKey[3]; 

and when you initialize the objects you execute

    //strcpy(NodeArray[i].MasterKey,"MasterKey");
    strcpy(NodeArray[i].MasterKey,"Msk"); /* still too large */

which is a little (!) too much for the available space: "Msk" needs four bytes including the terminating NUL, but MasterKey holds only three.

