Nothing is copied to Scala (in the sense of pass-by-value, which you have in C / C ++) during transmission. Most of the basic types are Int, String, Double, etc. They are immutable, therefore their transfer by reference is very safe. (Note: If you pass a mutable object and you modify it, then anyone who refers to this object will see the change).
In addition, RDDs are lazy, distributed, immutable collections. Passing RDD through functions and applying transformation to them (map, filter, etc.) Actually does not pass any data or does not cause any calculations.
All chain conversions are “remembered” and automatically work in the correct order when you perform a forced execution, and action on the RDD, for example, save it or collect it locally from the driver (via collect() , take(n) , etc.)
marios
source share