Spark: getting the maximum value and its associated key

My question is based on this question. I have a Spark pair RDD of (key, count) tuples: [(a,1), (b,2), (c,1), (d,3)].

How can I find both the key with the highest score and the score itself?

1 answer
# sc is an existing SparkContext; elements are compared by their second field
(sc
    .parallelize([("a", 1), ("b", 5), ("c", 1), ("d", 3)])
    .max(key=lambda x: x[1]))

returns ('b', 5), not just 5. The first parameter of max is the key function used for comparison (passed by name here), but max still returns the whole element, which here is the complete tuple.
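If several keys can share the highest score, a single max call only gives you one of them. Below is a minimal sketch in plain Python, using the built-in max, which accepts a key function in the same way. The list `pairs` and the names `best_key`, `best_score`, and `tied_keys` are made up for illustration. On an RDD the same idea would presumably be a max(key=...) call followed by a filter(...).collect() on the score; I have not run that variant here.

```python
from operator import itemgetter

# Hypothetical sample data; "b" and "d" tie for the top score
pairs = [("a", 1), ("b", 5), ("c", 1), ("d", 5)]

# max with a key function returns the whole winning element,
# so the key and the score can be unpacked together
best_key, best_score = max(pairs, key=itemgetter(1))

# Second pass: collect every key that shares the top score
tied_keys = [k for k, v in pairs if v == best_score]
```

The second pass is only needed when ties matter; otherwise the unpacked tuple already holds both the key and the score.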
