CUDA vs. DataParallel: Why the Difference?

I have a simple neural network model, and I use either cuda() or DataParallel() for the model, as shown below.

 model = torch.nn.DataParallel(model).cuda() 

OR

 model = model.cuda() 

When I do not use DataParallel and instead just move my model with cuda() , I also need to explicitly move the batch inputs to cuda() before passing them to the model; otherwise it returns the following error.

torch.index_select received an invalid combination of arguments - got (torch.cuda.FloatTensor, int, torch.LongTensor)
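A minimal sketch of the non-DataParallel case (the Linear model and shapes here are hypothetical, just for illustration). The error above comes from mixing a CUDA model with CPU input tensors, so the inputs have to be moved explicitly; the example guards the transfer with torch.cuda.is_available() so it also runs on CPU-only machines:

```python
import torch
import torch.nn as nn

# Hypothetical model and batch, just to demonstrate the device mismatch.
model = nn.Linear(4, 2)
inputs = torch.randn(8, 4)  # batch starts on the CPU

if torch.cuda.is_available():
    model = model.cuda()
    # Without DataParallel, the inputs must be moved explicitly,
    # or the forward pass raises a CPU/CUDA tensor-type mismatch error.
    inputs = inputs.cuda()

out = model(inputs)
print(out.shape)
```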

But with DataParallel, the code works fine, and everything else is the same. Why is this happening? Why don't I need to explicitly move the batch inputs to cuda() when I use DataParallel?

1 answer

Because DataParallel accepts CPU inputs: the first step of its forward pass is to scatter the inputs to the corresponding GPUs, so the transfer happens for you.
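A sketch of that case (again with a hypothetical Linear model). With DataParallel wrapping the model, the batch can stay on the CPU, since the wrapper's forward pass scatters it to the GPUs itself; the wrapping is guarded so the snippet also runs without a GPU:

```python
import torch
import torch.nn as nn

# Hypothetical model, just to illustrate the DataParallel path.
model = nn.Linear(4, 2)

if torch.cuda.is_available():
    # DataParallel's forward pass scatters the inputs to the GPUs,
    # so no explicit inputs.cuda() call is needed.
    model = nn.DataParallel(model).cuda()

inputs = torch.randn(8, 4)  # stays on the CPU
out = model(inputs)
print(out.shape)
```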

Information source: https://discuss.pytorch.org/t/cuda-vs-dataparallel-why-the-difference/4062/3
