Do I need to run shuffle on the old cluster after the vnodes migration?
You don't need to. If you switch from one token per node to 256 (the default), each node will split its range into 256 adjacent ranges of equal size. This does not affect where the data lives. But it does mean that when you bootstrap a new node in the new DC, the cluster will remain balanced throughout the process.
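To see why the split does not move data, here is a minimal sketch of the arithmetic. It is not Cassandra's code: the function name `split_range`, the integer ring and the default `ring_size` are assumptions made for illustration.

```python
def split_range(prev_token, token, parts=256, ring_size=2**127):
    """Split the ring range (prev_token, token] into `parts` adjacent,
    near-equal sub-ranges and return the end token of each one.

    Illustrative only: ring_size is an assumed token-space size.
    """
    width = (token - prev_token) % ring_size  # handles wrap-around
    if width == 0:
        width = ring_size  # a single node owns the whole ring
    return [(prev_token + width * i // parts) % ring_size
            for i in range(1, parts + 1)]


tokens = split_range(0, 2560)
print(len(tokens), tokens[:3], tokens[-1])
```

The last new token is the node's original token, so the node still owns exactly the range it owned before. Only the bookkeeping changes, not the data placement.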
What is the best way to switch to NetworkTopologyStrategy and GossipingPropertyFileSnitch?
The difficulty is that switching the replication strategy is generally unsafe, because data would have to move around the cluster. NetworkTopologyStrategy (NTS) will place data on different nodes if you tell it that the nodes are in different racks. For this reason, you should move to NTS before adding the new nodes.
Here's how to do it after you upgrade your old cluster to vnodes (your step 1 above):
1a. List all existing nodes as being in DC0 in the properties file. List the new nodes as being in DC1 and their correct racks.
1b. Change the replication strategy to NTS with options DC0: 3 (or whatever your current replication factor is) and DC1: 0.
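Step 1b can be sketched as a small helper that builds the CQL3 statement. The helper name `alter_keyspace_nts` and the keyspace name `my_keyspace` are hypothetical; substitute your own keyspace and check the statement against your Cassandra version before running it in cqlsh.

```python
def alter_keyspace_nts(keyspace, dc_replicas):
    """Build an ALTER KEYSPACE statement that switches a keyspace to
    NetworkTopologyStrategy with the given per-DC replication factors.

    `keyspace` and the DC names are placeholders supplied by the caller.
    """
    opts = ", ".join(f"'{dc}': {rf}" for dc, rf in dc_replicas.items())
    return (f"ALTER KEYSPACE {keyspace} WITH replication = "
            f"{{'class': 'NetworkTopologyStrategy', {opts}}};")


# Step 1b: all replicas stay in the old DC, none in the new one yet.
print(alter_keyspace_nts("my_keyspace", {"DC0": 3, "DC1": 0}))
```

Calling the same helper with `{"DC0": 0, "DC1": 3}` produces the statement for the final switch-over to the new DC.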
Then, to add the new nodes, follow these steps: http://www.datastax.com/docs/1.2/operations/add_replace_nodes#adding-a-data-center-to-a-cluster . Remember to set the number of tokens (num_tokens in cassandra.yaml) to 256 on the new nodes, since the default is 1.
In step 5, you should set the replication factor for DC0 to 0, that is, change the replication options to DC0: 0, DC1: 3. The old nodes are no longer in use at that point, so decommissioning them will not stream any data, but you should still decommission them rather than just powering them off, so that they are removed from the ring.
Note that one risk is that writes made at a low consistency level to the old nodes may be lost. To guard against this, you can write at CL.LOCAL_QUORUM after switching to the new DC. There is still a small window in which writes may be lost (between steps 3 and 4). If this matters, you can either run repair before decommissioning the old nodes to guarantee no loss, or write at a high consistency level.
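As a rough illustration of what LOCAL_QUORUM buys you, here is the quorum arithmetic for a single DC. The function names are made up for this sketch. The formula is the usual majority rule: half the replicas, rounded down, plus one.

```python
def quorum(replication_factor):
    """Number of replicas that must acknowledge a quorum write."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor):
    """Replicas that can be down while quorum writes still succeed."""
    return replication_factor - quorum(replication_factor)


# With DC1: 3, a LOCAL_QUORUM write needs 2 acknowledgements from DC1.
print(quorum(3), tolerated_failures(3))
```

So with a replication factor of 3 in the new DC, each LOCAL_QUORUM write is acknowledged by at least two replicas in the new DC and does not depend on the old nodes.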