Loading a SparkR data frame in Hive

Question

I need to load a DataFrame created in SparkR into Hive.

    # created a dataframe df_test
    df_test <- createDataFrame(sqlContext, data.frame(mon = c(1,2,3,4,5), year = c(2011,2012,2013,2014,2015)))

    # initialized the Hive context
    sc <- sparkR.init()
    hiveContext <- sparkRHive.init(sc)

    # used the saveAsTable fn to save dataframe "df_test" in hive table named "table_hive"
    saveAsTable(df_test, "table_hive")

    08/16/24 23:08:36 ERROR RBackendHandler: saveAsTable on 13 failed
    Error in invokeJava(isStatic = FALSE, objId$id, methodName, ...) :
      java.lang.RuntimeException: Tables created with SQLContext must be TEMPORARY. Use a HiveContext instead.
        at scala.sys.package$.error(package.scala:27)
        at org.apache.spark.sql.execution.SparkStrategies$DDLStrategy$.apply(SparkStrategies.scala:392)
        at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:58)
        at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:58)
        at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
        at org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:59)
        at org.apache.spark.sql.execution.QueryExecution.sparkPlan$lzycompute(QueryExecution.scala:47)
        at org.apache.spark.sql.execution.QueryExecution.sparkPlan(QueryExecution.scala:45)
        at org.apache.spark.sql.execution.QueryExecution.executedPlan$lzycompute(QueryExecution.scala:52)
        at org.apache.spark.sql.execution.QueryExecution.executedPlan(QueryExecution.scala:52)
        at org.apache.spark.sql.execution

The saveAsTable call throws the error above. Please help.

r apache-spark sparkr

Arun gunalan Aug 24 '16 at 17:43

1 answer

zero323 · Accepted Answer · 2016-08-24T18:24:36+0000

Having a HiveContext in scope is not enough. Each data frame is bound to a specific SQLContext / SparkSession, and df_test was explicitly created with a different context than the HiveContext.

Let's illustrate this with an example:

     Welcome to
        ____              __
       / __/__  ___ _____/ /__
      _\ \/ _ \/ _ `/ __/  '_/
     /___/ .__/\_,_/_/ /_/\_\   version  1.6.1
        /_/

     Spark context is available as sc, SQL context is available as sqlContext
    > library(magrittr)
    > createDataFrame(sqlContext, mtcars) %>% saveAsTable("foo")
    16/08/24 20:22:13 ERROR RBackendHandler: saveAsTable on 22 failed
    Error in invokeJava(isStatic = FALSE, objId$id, methodName, ...) :
      java.lang.RuntimeException: Tables created with SQLContext must be TEMPORARY. Use a HiveContext instead.
        at scala.sys.package$.error(package.scala:27)
        at org.apache.spark.sql.execution.SparkStrategies$DDLStrategy$.apply(SparkStrategies.scala:392)
        at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:58)
        at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:58)
        at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:396)
        at org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:59)
        at org.apache.spark.sql.execution.QueryExecution.sparkPlan$lzycompute(QueryExecution.scala:47)
        at org.apache.spark.sql.execution.QueryExecution.sparkPlan(QueryExecution.scala:45)
        at org.apache.spark.sql.execution.QueryExecution.executedPlan$lzycompute(QueryExecution.scala:52)
        at org.apache.spark.sql.execution.QueryExecution.executedPlan(QueryExecution.scala:52)
        at org.apache.spark.sql.execu
    >
    > hiveContext <- sparkRHive.init(sc)
    > createDataFrame(hiveContext, mtcars) %>% saveAsTable("foo")
    NULL
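
Applied to the code from the question, the same fix might look like the sketch below. It assumes SparkR 1.6.x and a Spark build with Hive support. The only substantive change is that the HiveContext is created first and then passed to createDataFrame in place of sqlContext. The final query is an optional check added for illustration and is not part of the original question.

```r
# Sketch for SparkR 1.6.x; assumes Spark was built with Hive support.
library(SparkR)

# Initialize the Spark context and a HiveContext on top of it.
sc <- sparkR.init()
hiveContext <- sparkRHive.init(sc)

# Create the data frame with hiveContext (not sqlContext), so that it is
# bound to the context that is allowed to create persistent tables.
df_test <- createDataFrame(
  hiveContext,
  data.frame(mon = c(1, 2, 3, 4, 5), year = c(2011, 2012, 2013, 2014, 2015))
)

# Persist the data frame as a Hive table named "table_hive".
saveAsTable(df_test, "table_hive")

# Optional check (illustrative): read the table back through the same context.
head(sql(hiveContext, "SELECT * FROM table_hive"))
```

In Spark 2.x the contexts are replaced by a single session, which as far as I know is created with sparkR.session(enableHiveSupport = TRUE), and createDataFrame no longer takes a context argument. The sketch above does not cover that version.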
