For the first question: you may want more tasks than strictly necessary (i.e., more partitions than cores) so that less data has to be held in memory at any one time. It also helps with fault tolerance, since less work has to be redone when a single task fails. But this is a tuning parameter; in general the right answer depends on the type of workload (IO-bound, memory-bound, or CPU-bound).
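The trade-off above can be illustrated with a toy calculation (plain Python, not the Spark API; the dataset size is a made-up number):

```python
# Toy illustration of how partition count affects per-task memory
# and the cost of a single task failure. Numbers are hypothetical.
DATASET_MB = 10_000  # a hypothetical 10 GB dataset


def partition_size_mb(num_partitions: int) -> float:
    """MB each task must hold in memory, and work lost if that task fails."""
    return DATASET_MB / num_partitions


# Few, large partitions: each task holds more data at once,
# and a failed task redoes more work on retry.
print(partition_size_mb(10))    # 1000.0 MB per task

# Many, small partitions: less memory per task and cheaper retries,
# at the cost of more scheduling overhead per task.
print(partition_size_mb(1000))  # 10.0 MB per task
```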
Regarding the second: I believe Spark 1.3 has some code for dynamically requesting resources (dynamic allocation), though I'm not sure exactly which release it landed in; older versions simply request the fixed amount of resources you configure on the driver. As for how a partition ends up on one node rather than another: AFAIK, the scheduler tries to place each task on a node that holds a local copy of that task's data block on HDFS. Since HDFS keeps several replicas of each block (three by default), there are several candidate nodes for launching any given task.
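If you want to try dynamic allocation, it is enabled through configuration properties rather than code. A minimal sketch (the executor bounds are example values; dynamic allocation also requires the external shuffle service):

```properties
# spark-defaults.conf — enable dynamic executor allocation
spark.dynamicAllocation.enabled        true
spark.shuffle.service.enabled          true
spark.dynamicAllocation.minExecutors   2
spark.dynamicAllocation.maxExecutors   20
```

With this, Spark grows and shrinks the executor count based on the pending task backlog instead of holding the fixed amount requested at startup.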