How to sort by value in Spark (Java)

    JavaPairRDD<String, Float> counts = ones.reduceByKey(new Function2<Float, Float, Float>() {
        @Override
        public Float call(Float i1, Float i2) {
            return i1 + i2;
        }
    });

My output is:

    id,value
    100002,23.47
    100003,42.78
    200003,50.45
    190001,30.23

I would like the result to be sorted by value, for example:

    200003,50.45
    100003,42.78
    190001,30.23
    100002,23.47

How can I do this?

2 answers

I think there is no special API for sorting data by value.

You may need to follow these steps:

1) Swap key and value
2) Sort with the sortByKey API (pass false for descending order)
3) Swap key and value back
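The three steps above can be sketched without a cluster. This is only an illustration, not Spark code: plain java.util lists stand in for the RDD, and the class and method names (SwapSortSwapSketch, swap, sortByKeyDescending, sortByValueDescending) are hypothetical. On a real JavaPairRDD the corresponding calls would be mapToPair with Tuple2.swap() and sortByKey(false).

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the RDD pipeline: swap, sort by key, swap back.
class SwapSortSwapSketch {

    // Steps 1 and 3: exchange key and value of every pair.
    static <K, V> List<Map.Entry<V, K>> swap(List<Map.Entry<K, V>> pairs) {
        List<Map.Entry<V, K>> swapped = new ArrayList<>();
        for (Map.Entry<K, V> p : pairs) {
            swapped.add(new SimpleEntry<>(p.getValue(), p.getKey()));
        }
        return swapped;
    }

    // Step 2: sort by key in descending order (the role sortByKey(false) plays on an RDD).
    static <K extends Comparable<K>, V> List<Map.Entry<K, V>> sortByKeyDescending(
            List<Map.Entry<K, V>> pairs) {
        List<Map.Entry<K, V>> sorted = new ArrayList<>(pairs); // leave the input untouched
        sorted.sort(Map.Entry.<K, V>comparingByKey().reversed());
        return sorted;
    }

    // All three steps chained: (id, value) -> (value, id) -> sorted -> (id, value).
    static List<Map.Entry<String, Float>> sortByValueDescending(
            List<Map.Entry<String, Float>> counts) {
        return swap(sortByKeyDescending(swap(counts)));
    }

    public static void main(String[] args) {
        // Sample data taken from the question.
        List<Map.Entry<String, Float>> counts = new ArrayList<>();
        counts.add(new SimpleEntry<>("100002", 23.47f));
        counts.add(new SimpleEntry<>("100003", 42.78f));
        counts.add(new SimpleEntry<>("200003", 50.45f));
        counts.add(new SimpleEntry<>("190001", 30.23f));

        for (Map.Entry<String, Float> e : sortByValueDescending(counts)) {
            System.out.println(e.getKey() + "," + e.getValue());
        }
    }
}
```

Note how the element type flips from (String, Float) to (Float, String) and back; the same flip happens to the type parameters of the JavaPairRDD in the Spark version.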

For more details on sortByKey, see the reference below:
https://spark.apache.org/docs/1.0.0/api/java/org/apache/spark/api/java/JavaPairRDD.html#sortByKey%28boolean%29

To swap, we can use the Scala Tuple2 API:

http://www.scala-lang.org/api/current/index.html#scala.Tuple2

For example, suppose I have a JavaPairRDD produced by the code below.

    JavaPairRDD<String, Integer> counts = ones.reduceByKey(new Function2<Integer, Integer, Integer>() {
        @Override
        public Integer call(Integer i1, Integer i2) {
            return i1 + i2;
        }
    });

Now, to swap the key and value, you can use the code below:

    JavaPairRDD<Integer, String> swappedPair = counts.mapToPair(
        new PairFunction<Tuple2<String, Integer>, Integer, String>() {
            @Override
            public Tuple2<Integer, String> call(Tuple2<String, Integer> item) throws Exception {
                return item.swap();
            }
        });

Hope this helps. Be careful with the data types: swapping turns a JavaPairRDD<String, Integer> into a JavaPairRDD<Integer, String>.


Scala has a nice sortBy method. I could not find a Java equivalent, but this is the Scala implementation:

    def sortBy[K](
        f: (T) => K,
        ascending: Boolean = true,
        numPartitions: Int = this.partitions.size)
        (implicit ord: Ordering[K], ctag: ClassTag[K]): RDD[T] =
      this.keyBy[K](f)
        .sortByKey(ascending, numPartitions)
        .values

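So this is basically the same approach as above, but instead of swapping back and forth it attaches the sort key to each element with keyBy and drops it again with values. I use it as follows: .sortBy(_._2), which sorts by the second element of the tuple. For descending order, pass ascending = false.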
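For Java readers, the keyBy, sortByKey, values pipeline of that Scala implementation can be mirrored in a few lines. This is a minimal sketch on plain java.util lists rather than on an RDD, and the class KeyBySortSketch and its sortBy method are hypothetical names chosen for illustration; it is not part of the Spark API.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper mirroring keyBy(f).sortByKey(ascending).values on a list.
class KeyBySortSketch {

    static <T, K extends Comparable<K>> List<T> sortBy(
            List<T> items, Function<T, K> f, boolean ascending) {
        // keyBy: pair every element with the key computed from it.
        List<Map.Entry<K, T>> keyed = new ArrayList<>();
        for (T item : items) {
            keyed.add(new SimpleEntry<>(f.apply(item), item));
        }

        // sortByKey: order the pairs by the computed key.
        Comparator<Map.Entry<K, T>> byKey = Map.Entry.comparingByKey();
        keyed.sort(ascending ? byKey : byKey.reversed());

        // values: drop the key again and keep the original elements.
        List<T> result = new ArrayList<>();
        for (Map.Entry<K, T> e : keyed) {
            result.add(e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample data taken from the question.
        List<Map.Entry<String, Float>> counts = new ArrayList<>();
        counts.add(new SimpleEntry<>("100002", 23.47f));
        counts.add(new SimpleEntry<>("100003", 42.78f));
        counts.add(new SimpleEntry<>("200003", 50.45f));
        counts.add(new SimpleEntry<>("190001", 30.23f));

        // Counterpart of .sortBy(_._2, ascending = false) in Scala.
        Function<Map.Entry<String, Float>, Float> byValue = Map.Entry::getValue;
        for (Map.Entry<String, Float> e : sortBy(counts, byValue, false)) {
            System.out.println(e.getKey() + "," + e.getValue());
        }
    }
}
```

Because the key is attached and then dropped, the element type never changes, which avoids the type-parameter flip of the swap approach.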


Source: https://habr.com/ru/post/1215204/