Spark: failure: ``union'' expected but `(' found

I have a dataframe called df with a column called employee_id. I do:

 df.registerTempTable("d_f")
 val query = """SELECT *, ROW_NUMBER() OVER (ORDER BY employee_id) row_number FROM d_f"""
 val result = Spark.getSqlContext().sql(query)

But the following error arises. Any help?

 [1.29] failure: ``union'' expected but `(' found
 SELECT *, ROW_NUMBER() OVER (ORDER BY employee_id) row_number FROM d_f
                             ^
 java.lang.RuntimeException: [1.29] failure: ``union'' expected but `(' found
 SELECT *, ROW_NUMBER() OVER (ORDER BY employee_id) row_number FROM d_f
sql scala dataframe apache-spark apache-spark-sql
2 answers

Spark 2.0+

Spark 2.0 introduced its own implementation of window functions (SPARK-8641), so a HiveContext is no longer required. That said, similar parse errors unrelated to window functions can still come from differences between the two SQL parsers.
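A minimal sketch of the same query on Spark 2.0+, assuming a SparkSession named spark (the 2.0 entry point) and the asker's df; createOrReplaceTempView replaces the older registerTempTable:

 // Spark 2.0+: no HiveContext needed, window functions work in plain Spark SQL
 df.createOrReplaceTempView("d_f")
 val result = spark.sql(
   "SELECT *, ROW_NUMBER() OVER (ORDER BY employee_id) AS row_number FROM d_f")
 result.show()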

Spark <= 1.6

Window functions were introduced in Spark 1.4.0 and require a HiveContext to work; a plain SQLContext will not do.

Make sure you use Spark >= 1.4.0 and create a HiveContext:

 import org.apache.spark.sql.hive.HiveContext
 val sqlContext = new HiveContext(sc)
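With that HiveContext in place, the asker's original query should parse; a sketch, assuming the same df and an existing SparkContext sc:

 // Register the DataFrame and run the window query through the HiveContext
 df.registerTempTable("d_f")
 val result = sqlContext.sql(
   "SELECT *, ROW_NUMBER() OVER (ORDER BY employee_id) row_number FROM d_f")
 result.show()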

Yes, that's true.

I use Spark 1.6.0, and you need a HiveContext to use the dense_rank function.

In other words, from Spark 2.0.0 onwards a HiveContext will no longer be needed for dense_rank.

So, for Spark versions >= 1.4 and < 2.0, you should do it as follows.

The hive_employees table has three fields: location: String, name: String, salary: Int.

 val conf = new SparkConf().setAppName("denseRank test") // .setMaster("local")
 val sc = new SparkContext(conf)
 val sqlContext = new SQLContext(sc)
 val hqlContext = new org.apache.spark.sql.hive.HiveContext(sc)

 val result = hqlContext.sql(
   "select empid, empname, dense_rank() over (partition by empsalary order by empname) as rank from hive_employees")

 result.show()
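The same ranking can also be expressed through the DataFrame API instead of a SQL string; a sketch, assuming Spark 1.6 (where the snake_case dense_rank function exists) and a DataFrame df with the empsalary and empname columns used above:

 import org.apache.spark.sql.expressions.Window
 import org.apache.spark.sql.functions.dense_rank

 // Same window as in the SQL version: partition by empsalary, order by empname
 val w = Window.partitionBy("empsalary").orderBy("empname")
 val ranked = df.withColumn("rank", dense_rank().over(w))
 ranked.show()

This avoids the SQL parser entirely, though on 1.x it still requires a HiveContext to evaluate the window expression.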

