How to sort by column in descending order in Spark SQL?

I tried df.orderBy("col1").show(10) but it sorted in ascending order. df.sort("col1").show(10) also sorted in ascending order. I looked on Stack Overflow and the answers I found were all deprecated or related to RDDs. I would like to use the native DataFrame API in Spark.

+100
scala apache-spark apache-spark-sql
May 19 '15 at 17:45
6 answers

It is the sort method on org.apache.spark.sql.DataFrame:

 df.sort($"col1", $"col2".desc) 

Notice the $ and .desc inside sort: the .desc applies only to col2, so the result is sorted by col1 in ascending order and then by col2 in descending order.
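The mixed ascending/descending behavior above can be illustrated outside Spark. This is a plain-Python analogy (not Spark), with made-up sample rows, of sorting by the first column ascending and the second descending:

```python
# Analogy for df.sort($"col1", $"col2".desc): sort rows by col1
# ascending, then col2 descending. Negating the second numeric key
# reverses its order while the first key stays ascending.
rows = [(1, 5), (2, 3), (1, 9), (2, 7)]
result = sorted(rows, key=lambda r: (r[0], -r[1]))
print(result)  # [(1, 9), (1, 5), (2, 7), (2, 3)]
```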

+67
May 19 '15 at 17:48

You can also sort the column by importing the Spark SQL functions:

 import org.apache.spark.sql.functions._
 df.orderBy(asc("col1"))

Or

 import org.apache.spark.sql.functions._
 df.sort(desc("col1"))

Or by importing sqlContext.implicits._:

 import sqlContext.implicits._
 df.orderBy($"col1".desc)

Or

 import sqlContext.implicits._
 df.sort($"col1".desc)
+170
Aug 17 '15 at 14:23

PySpark only

I came across this post when I wanted to do the same in PySpark. The easiest way is to simply add the ascending=False parameter:

 df.orderBy("col1", ascending=False).show(10) 
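As a plain-Python analogy (not Spark, sample values are made up), ascending=False behaves like reverse=True in Python's built-in sorted():

```python
# ascending=False in PySpark's orderBy corresponds to sorting the
# whole result in descending order, like reverse=True here.
values = [3, 1, 4, 1, 5]
print(sorted(values, reverse=True))  # [5, 4, 3, 1, 1]
```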

Link: http://spark.apache.org/docs/2.1.0/api/python/pyspark.sql.html#pyspark.sql.DataFrame.orderBy

+22
Nov 11 '17 at 15:22
 import org.apache.spark.sql.functions.{asc, desc}
 df.orderBy(desc("columnname1"), desc("columnname2"), asc("columnname3"))
+7
Sep 11 '18 at 12:31
 df.sort($"ColumnName".desc).show() 
+6
Nov 09 '17 at 10:38

In the case of Java:

If we use DataFrames and apply a join (here an inner join), we can sort (ascending by default) after selecting the distinct rows in each DataFrame:

 Dataset<Row> d1 = e_data.distinct().join(s_data.distinct(), "e_id").orderBy("salary"); 

where e_id is the column on which the join is applied, and the result is sorted by salary in ascending order.

We can also use Spark SQL as:

 SQLContext sqlCtx = spark.sqlContext();
 sqlCtx.sql("select * from global_temp.salary order by salary desc").show();

Where

  • spark → the SparkSession
  • salary → a global temporary view.
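The ORDER BY ... DESC clause used above is standard SQL, not specific to Spark. A minimal sketch of the same query against an in-memory sqlite3 database (the table name and sample salaries are made up for illustration):

```python
import sqlite3

# Build a tiny stand-in for the global_temp.salary view.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE salary (e_id INTEGER, salary INTEGER)")
conn.executemany("INSERT INTO salary VALUES (?, ?)",
                 [(1, 50000), (2, 90000), (3, 70000)])

# Same ORDER BY ... DESC semantics as the Spark SQL query.
rows = conn.execute("SELECT * FROM salary ORDER BY salary DESC").fetchall()
print(rows)  # [(2, 90000), (3, 70000), (1, 50000)]
```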
+2
Sep 06 '18 at 16:12
