Treat a Spark RDD as a regular Seq

I have a CLI application that converts JSON. Most of the code is mapping, flatMapping, and iterating with for-comprehensions over Lists of JValues. Now I want to port this application to Spark, but it seems I would have to rewrite every function 1:1, just writing RDD[JValue] instead of List[JValue].

Is there a way (for example, a type class) to write a function that accepts both Lists and RDDs?
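
For reference, a hypothetical sketch of the kind of List-based code I mean (extractNames and the "name" field are made-up examples, assuming json4s):

    import org.json4s._

    // Pull the "name" field out of every JSON object in a list.
    def extractNames(values: List[JValue]): List[String] =
      values.flatMap { v =>
        (v \ "name") match {
          case JString(s) => List(s)
          case _          => Nil
        }
      }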

+5
1 answer

If you want to share the per-element code between the local and the Spark versions, you can move the lambdas/anonymous functions you pass to map/flatMap into named functions and reuse them.
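
A minimal sketch of that, reusing one hypothetical named function for both the List and the RDD version (assuming json4s JValues as in the question):

    import org.json4s._
    import org.apache.spark.rdd.RDD

    // One named transformation instead of two inline lambdas.
    def namesOf(v: JValue): List[String] =
      (v \ "name") match {
        case JString(s) => List(s)
        case _          => Nil
      }

    // The same function is passed to both flatMaps:
    def fromList(values: List[JValue]): List[String] = values.flatMap(namesOf)
    def fromRdd(values: RDD[JValue]): RDD[String]    = values.flatMap(namesOf)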

If you also want to reuse the logic that chains your maps/flatMaps/etc., you can create implicit conversions from both RDD and Seq to a custom trait that exposes only the shared functions. Implicit conversions can get quite confusing, though, and I really don't think this is a good idea (but you could do it if you disagree with me :)).
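
Purely as an illustration of the trait-plus-implicit-conversions idea (all names here are made up, and this is a sketch rather than a recommended design):

    import scala.language.implicitConversions
    import scala.reflect.ClassTag
    import org.apache.spark.rdd.RDD

    object SharedOps {
      // Common trait exposing only the operations both List and RDD support.
      // (Names avoid clashing with the builtin map/flatMap.)
      trait MapLike[A] {
        def mapOver[B: ClassTag](f: A => B): MapLike[B]
        def flatMapOver[B: ClassTag](f: A => TraversableOnce[B]): MapLike[B]
      }

      class ListOps[A](underlying: List[A]) extends MapLike[A] {
        def mapOver[B: ClassTag](f: A => B): MapLike[B] =
          new ListOps(underlying.map(f))
        def flatMapOver[B: ClassTag](f: A => TraversableOnce[B]): MapLike[B] =
          new ListOps(underlying.flatMap(f))
      }

      class RddOps[A](underlying: RDD[A]) extends MapLike[A] {
        def mapOver[B: ClassTag](f: A => B): MapLike[B] =
          new RddOps(underlying.map(f))
        def flatMapOver[B: ClassTag](f: A => TraversableOnce[B]): MapLike[B] =
          new RddOps(underlying.flatMap(f))
      }

      // The implicit conversions this answer warns about:
      implicit def listToMapLike[A](l: List[A]): MapLike[A] = new ListOps(l)
      implicit def rddToMapLike[A](r: RDD[A]): MapLike[A]   = new RddOps(r)

      // Shared logic written once against the common trait:
      def keepEvensTwice(in: MapLike[Int]): MapLike[Int] =
        in.flatMapOver(x => if (x % 2 == 0) List(x, x) else Nil)
    }

With this, keepEvensTwice runs unchanged on a List[Int] or an RDD[Int], at the cost of hiding which conversion fires where.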

+2
