CUDA vs. DataParallel: Why the Difference?

I have a simple neural network model, and I use either cuda() or DataParallel() for the model, as shown below.

 model = torch.nn.DataParallel(model).cuda() 

OR

 model = model.cuda() 

When I do not use DataParallel and instead just move my model with cuda() , I also need to explicitly move the batch inputs to cuda() before passing them to the model; otherwise it returns the following error.

torch.index_select received an invalid combination of arguments - got (torch.cuda.FloatTensor, int, torch.LongTensor)
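A minimal sketch of the non-DataParallel case (the Linear model and shapes here are hypothetical, just for illustration). The error above comes from mixing a CUDA model with CPU input tensors, so the inputs have to be moved explicitly; the example guards the transfer with torch.cuda.is_available() so it also runs on CPU-only machines:

```python
import torch
import torch.nn as nn

# Hypothetical model and batch, just to demonstrate the device mismatch.
model = nn.Linear(4, 2)
inputs = torch.randn(8, 4)  # batch starts on the CPU

if torch.cuda.is_available():
    model = model.cuda()
    # Without DataParallel, the inputs must be moved explicitly,
    # or the forward pass raises a CPU/CUDA tensor-type mismatch error.
    inputs = inputs.cuda()

out = model(inputs)
print(out.shape)
```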

But with DataParallel, the code works fine, and everything else is the same. Why is this happening? Why don't I need to explicitly move the batch inputs to cuda() when I use DataParallel?

1 answer

Because DataParallel accepts CPU inputs: the first step of its forward pass is to scatter the inputs to the corresponding GPUs, so the transfer happens for you.
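A sketch of that case (again with a hypothetical Linear model). With DataParallel wrapping the model, the batch can stay on the CPU, since the wrapper's forward pass scatters it to the GPUs itself; the wrapping is guarded so the snippet also runs without a GPU:

```python
import torch
import torch.nn as nn

# Hypothetical model, just to illustrate the DataParallel path.
model = nn.Linear(4, 2)

if torch.cuda.is_available():
    # DataParallel's forward pass scatters the inputs to the GPUs,
    # so no explicit inputs.cuda() call is needed.
    model = nn.DataParallel(model).cuda()

inputs = torch.randn(8, 4)  # stays on the CPU
out = model(inputs)
print(out.shape)
```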

Information source: https://discuss.pytorch.org/t/cuda-vs-dataparallel-why-the-difference/4062/3
