Spark: failure: ``union'' expected but `(' found

I have a dataframe called df with a column called employee_id. I do:

 df.registerTempTable("d_f")
 val query = """SELECT *, ROW_NUMBER() OVER (ORDER BY employee_id) row_number FROM d_f"""
 val result = Spark.getSqlContext().sql(query)

But the following error arises. Any help?

 [1.29] failure: ``union'' expected but `(' found
 SELECT *, ROW_NUMBER() OVER (ORDER BY employee_id) row_number FROM d_f
                             ^
 java.lang.RuntimeException: [1.29] failure: ``union'' expected but `(' found
 SELECT *, ROW_NUMBER() OVER (ORDER BY employee_id) row_number FROM d_f
sql scala dataframe apache-spark apache-spark-sql
2 answers

Spark 2.0+

Spark 2.0 introduced its own implementation of window functions (SPARK-8641), so a HiveContext is no longer required. That said, similar parse errors unrelated to window functions can still come from differences between the two SQL parsers.
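A minimal sketch of the same query on Spark 2.0+, assuming a SparkSession named spark (the 2.0 entry point) and the asker's df; createOrReplaceTempView replaces the older registerTempTable:

 // Spark 2.0+: no HiveContext needed, window functions work in plain Spark SQL
 df.createOrReplaceTempView("d_f")
 val result = spark.sql(
   "SELECT *, ROW_NUMBER() OVER (ORDER BY employee_id) AS row_number FROM d_f")
 result.show()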

Spark <= 1.6

Window functions were introduced in Spark 1.4.0 and require a HiveContext to work; a plain SQLContext will not do.

Make sure you use Spark >= 1.4.0 and create a HiveContext:

 import org.apache.spark.sql.hive.HiveContext
 val sqlContext = new HiveContext(sc)
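With that HiveContext in place, the asker's original query should parse; a sketch, assuming the same df and an existing SparkContext sc:

 // Register the DataFrame and run the window query through the HiveContext
 df.registerTempTable("d_f")
 val result = sqlContext.sql(
   "SELECT *, ROW_NUMBER() OVER (ORDER BY employee_id) row_number FROM d_f")
 result.show()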

Yes, that's true.

I use Spark 1.6.0, and you need a HiveContext to use the dense_rank function.

In other words, from Spark 2.0.0 onwards a HiveContext will no longer be needed for dense_rank.

So, for Spark versions >= 1.4 and < 2.0, you should do it as follows.

The hive_employees table has three fields: location: String, name: String, salary: Int.

 val conf = new SparkConf().setAppName("denseRank test") // .setMaster("local")
 val sc = new SparkContext(conf)
 val sqlContext = new SQLContext(sc)
 val hqlContext = new org.apache.spark.sql.hive.HiveContext(sc)

 val result = hqlContext.sql(
   "select empid, empname, dense_rank() over (partition by empsalary order by empname) as rank from hive_employees")

 result.show()
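The same ranking can also be expressed through the DataFrame API instead of a SQL string; a sketch, assuming Spark 1.6 (where the snake_case dense_rank function exists) and a DataFrame df with the empsalary and empname columns used above:

 import org.apache.spark.sql.expressions.Window
 import org.apache.spark.sql.functions.dense_rank

 // Same window as in the SQL version: partition by empsalary, order by empname
 val w = Window.partitionBy("empsalary").orderBy("empname")
 val ranked = df.withColumn("rank", dense_rank().over(w))
 ranked.show()

This avoids the SQL parser entirely, though on 1.x it still requires a HiveContext to evaluate the window expression.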

