Suppose I read a file into an RDD and call zipWithIndex on it, so that each line is paired with an index:
([row1, id1001, name, address], 0) ([row2, id1001, name, address], 1) ... ([row100000, id1001, name, address], 99999)
Will I get the same indices in the same order if I reload the file? Since Spark reads the file in parallel, could the lines be split across partitions differently on another run and end up with different indices?
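
To make the question concrete, here is a small pure-Python model (not Spark itself) of how I understand zipWithIndex to number elements: by partition index first, then by position within each partition, which is what the Spark documentation describes. The helper names `zip_with_index` and `split_contiguous` are made up for illustration, and the model assumes that partitions are contiguous slices of the file in file order, which is exactly the part I am unsure about.

```python
from itertools import accumulate


def zip_with_index(partitions):
    """Toy model of RDD.zipWithIndex: number elements by partition order,
    then by position within each partition."""
    sizes = [len(p) for p in partitions]
    # Starting offset of a partition = total size of all earlier partitions.
    offsets = [0] + list(accumulate(sizes))[:-1]
    return [
        (item, offset + i)
        for part, offset in zip(partitions, offsets)
        for i, item in enumerate(part)
    ]


def split_contiguous(lines, boundaries):
    """Cut `lines` into consecutive chunks at hypothetical split points."""
    edges = [0] + list(boundaries) + [len(lines)]
    return [lines[a:b] for a, b in zip(edges, edges[1:])]


lines = [f"row{n}" for n in range(1, 11)]

# Same file, split differently on two hypothetical runs.
run_a = zip_with_index(split_contiguous(lines, [5]))        # 2 partitions
run_b = zip_with_index(split_contiguous(lines, [2, 3, 8]))  # 4 partitions
same_when_contiguous = run_a == run_b

# Same chunks as run_a, but arriving in a different partition order.
run_c = zip_with_index(list(reversed(split_contiguous(lines, [5]))))
same_when_reordered = run_a == run_c
```

In this model the indices do not depend on where the split points fall, only on the partitions staying in file order. If the partition order changed between runs, the indices would change too, so the question is whether Spark guarantees that order when reading a file.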