Treat a Spark RDD as a regular Seq

I have a CLI application that converts JSON. Most of the code is mapping, flatMapping, and iterating with for-comprehensions over Lists of JValues. Now I want to port this application to Spark, but it seems I would have to rewrite every function 1:1, just writing RDD[JValue] instead of List[JValue].

Is there a way (for example, a type class) to write a function that accepts both Lists and RDDs?
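
For reference, a hypothetical sketch of the kind of List-based code I mean (extractNames and the "name" field are made-up examples, assuming json4s):

    import org.json4s._

    // Pull the "name" field out of every JSON object in a list.
    def extractNames(values: List[JValue]): List[String] =
      values.flatMap { v =>
        (v \ "name") match {
          case JString(s) => List(s)
          case _          => Nil
        }
      }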

+5
1 answer

If you want to share the per-element code between the local and the Spark versions, you can move the lambdas/anonymous functions you pass to map/flatMap into named functions and reuse them.
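
A minimal sketch of that, reusing one hypothetical named function for both the List and the RDD version (assuming json4s JValues as in the question):

    import org.json4s._
    import org.apache.spark.rdd.RDD

    // One named transformation instead of two inline lambdas.
    def namesOf(v: JValue): List[String] =
      (v \ "name") match {
        case JString(s) => List(s)
        case _          => Nil
      }

    // The same function is passed to both flatMaps:
    def fromList(values: List[JValue]): List[String] = values.flatMap(namesOf)
    def fromRdd(values: RDD[JValue]): RDD[String]    = values.flatMap(namesOf)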

If you also want to reuse the logic that chains your maps/flatMaps/etc., you can create implicit conversions from both RDD and Seq to a custom trait that exposes only the shared functions. Implicit conversions can get quite confusing, though, and I really don't think this is a good idea (but you could do it if you disagree with me :)).
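
Purely as an illustration of the trait-plus-implicit-conversions idea (all names here are made up, and this is a sketch rather than a recommended design):

    import scala.language.implicitConversions
    import scala.reflect.ClassTag
    import org.apache.spark.rdd.RDD

    object SharedOps {
      // Common trait exposing only the operations both List and RDD support.
      // (Names avoid clashing with the builtin map/flatMap.)
      trait MapLike[A] {
        def mapOver[B: ClassTag](f: A => B): MapLike[B]
        def flatMapOver[B: ClassTag](f: A => TraversableOnce[B]): MapLike[B]
      }

      class ListOps[A](underlying: List[A]) extends MapLike[A] {
        def mapOver[B: ClassTag](f: A => B): MapLike[B] =
          new ListOps(underlying.map(f))
        def flatMapOver[B: ClassTag](f: A => TraversableOnce[B]): MapLike[B] =
          new ListOps(underlying.flatMap(f))
      }

      class RddOps[A](underlying: RDD[A]) extends MapLike[A] {
        def mapOver[B: ClassTag](f: A => B): MapLike[B] =
          new RddOps(underlying.map(f))
        def flatMapOver[B: ClassTag](f: A => TraversableOnce[B]): MapLike[B] =
          new RddOps(underlying.flatMap(f))
      }

      // The implicit conversions this answer warns about:
      implicit def listToMapLike[A](l: List[A]): MapLike[A] = new ListOps(l)
      implicit def rddToMapLike[A](r: RDD[A]): MapLike[A]   = new RddOps(r)

      // Shared logic written once against the common trait:
      def keepEvensTwice(in: MapLike[Int]): MapLike[Int] =
        in.flatMapOver(x => if (x % 2 == 0) List(x, x) else Nil)
    }

With this, keepEvensTwice runs unchanged on a List[Int] or an RDD[Int], at the cost of hiding which conversion fires where.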

+2
