How does createOrReplaceTempView work in Spark?

I am new to Spark and Spark SQL.

How does createOrReplaceTempView work in Spark?

If we register an RDD object as a table, will Spark keep all the data in memory?

+5
2 answers

createOrReplaceTempView creates (or replaces, if that view name already exists) a lazily evaluated "view" that you can then use like a Hive table in Spark SQL. It is not stored in memory unless you cache the dataset underlying the view.

 scala> val s = Seq(1,2,3).toDF("num")
 s: org.apache.spark.sql.DataFrame = [num: int]

 scala> s.createOrReplaceTempView("nums")

 scala> spark.table("nums")
 res22: org.apache.spark.sql.DataFrame = [num: int]

 scala> spark.table("nums").cache
 res23: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] = [num: int]

 scala> spark.table("nums").count
 res24: Long = 3

Data is fully cached only after the .count call. Here is proof that it was cached:

[Screenshot: cached nums temp view / table]
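If you want to check the same thing without a screenshot, here is a minimal sketch in spark-shell / script style. It assumes a local session (the `local[*]` master and the app name are made up for illustration) and uses `spark.catalog.isCached`, which reports whether the view has been marked for caching, not whether the data has already been materialized.

```scala
import org.apache.spark.sql.SparkSession

// Local session for illustration only; in spark-shell `spark` already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("temp-view-cache-sketch")
  .getOrCreate()
import spark.implicits._

val s = Seq(1, 2, 3).toDF("num")
s.createOrReplaceTempView("nums")

// Registering the view only binds a name to the query plan; nothing is cached.
val cachedAfterRegister = spark.catalog.isCached("nums")

// Mark the view for caching. This is still lazy: no data is materialized yet.
spark.catalog.cacheTable("nums")
val markedForCache = spark.catalog.isCached("nums")

// An action such as count() runs the plan and fills the cache.
val rowCount = spark.table("nums").count()

// Release the cached data once it is no longer needed.
spark.catalog.uncacheTable("nums")
val cachedAfterUncache = spark.catalog.isCached("nums")
```

So the view itself costs no memory; only the explicit cache call plus an action does.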

Related SO question: spark createOrReplaceTempView vs createGlobalTempView

Relevant quote (comparing to a persistent table): "Unlike the createOrReplaceTempView command, saveAsTable will materialize the contents of the DataFrame and create a pointer to the data in the Hive metastore." from https://spark.apache.org/docs/latest/sql-programming-guide.html#saving-to-persistent-tables
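To make the contrast in that quote concrete, here is a rough sketch, not a definitive recipe. The names `nums_view` and `nums_table` are hypothetical, and the warehouse directory is a throwaway temp folder chosen only so the example cleans up easily; a real cluster would normally have its own warehouse or metastore configuration.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

// Throwaway warehouse directory (an assumption for this example only).
val warehouseDir = Files.createTempDirectory("spark-warehouse-sketch").toString

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("view-vs-table-sketch")
  .config("spark.sql.warehouse.dir", warehouseDir)
  .getOrCreate()
import spark.implicits._

val df = Seq(1, 2, 3).toDF("num")

// Temp view: only a name bound to the query plan, scoped to this session.
df.createOrReplaceTempView("nums_view")

// saveAsTable: runs the plan and writes the rows out as a managed table.
df.write.mode("overwrite").saveAsTable("nums_table")

// The catalog flags the view as temporary and the saved table as not temporary.
val tables = spark.catalog.listTables().collect()
val viewIsTemporary = tables.exists(t => t.name == "nums_view" && t.isTemporary)
val tableIsTemporary = tables.exists(t => t.name == "nums_table" && t.isTemporary)

// The saved table can be read back by name.
val savedRows = spark.table("nums_table").count()
```

The view writes nothing anywhere, while saveAsTable leaves data files under the warehouse directory.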

Note: createOrReplaceTempView was formerly registerTempTable.

+11

createOrReplaceTempView creates a temporary view of the table in memory. It is not persistent at this point, but you can run SQL queries on top of it. If you want to keep it, you can either persist it or use saveAsTable to save it.
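As a small illustration of what "temporary" means here, the sketch below (assuming a local session; the view name `nums` is just an example) shows that the view is visible only in the session that created it and disappears once it is dropped.

```scala
import org.apache.spark.sql.SparkSession

// Local session for illustration only; in spark-shell `spark` already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("temp-view-scope-sketch")
  .getOrCreate()
import spark.implicits._

Seq(1, 2, 3).toDF("num").createOrReplaceTempView("nums")

// The view is visible in the session that registered it ...
val visibleHere = spark.catalog.tableExists("nums")

// ... but not in a sibling session, even though both share one SparkContext.
val visibleInNewSession = spark.newSession().catalog.tableExists("nums")

// Dropping the view removes the name; dropTempView returns true if it existed.
val dropped = spark.catalog.dropTempView("nums")
val visibleAfterDrop = spark.catalog.tableExists("nums")
```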

First we read the data in CSV format, which gives us a DataFrame, and then we create a temporary view from it.

Reading data in CSV format

 val data = spark.read.format("csv").option("header","true").option("inferSchema","true").load("FileStore/tables/pzufk5ib1500654887654/campaign.csv") 

Printing the schema

data.printSchema

[Image: schema of the table]

 data.createOrReplaceTempView("Data") 

Now we can run SQL queries on top of the view we just created

 %sql select Week as Date, `Campaign Type`, Engagements, Country from Data order by Date asc
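Outside a notebook there is no %sql cell, so the same query can be sent through `spark.sql`. The sketch below is only an approximation of the example above: the two rows are invented sample data standing in for campaign.csv, and the local session settings are assumptions. Note the backticks around `Campaign Type`, which are needed because the column name contains a space.

```scala
import org.apache.spark.sql.SparkSession

// Local session for illustration only; in spark-shell `spark` already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("temp-view-sql-sketch")
  .getOrCreate()
import spark.implicits._

// Made-up rows standing in for the CSV file from the answer.
val data = Seq(
  ("2017-07-10", "Email", 120, "US"),
  ("2017-07-03", "Social", 80, "IN")
).toDF("Week", "Campaign Type", "Engagements", "Country")

data.createOrReplaceTempView("Data")

// Same query as the %sql cell, issued programmatically.
val result = spark.sql(
  """SELECT Week AS Date, `Campaign Type`, Engagements, Country
    |FROM Data
    |ORDER BY Date ASC""".stripMargin)

val resultColumns = result.columns.toSeq
val dates = result.collect().map(_.getString(0)).toSeq
```

Calling `result.show()` would print the rows in the same order the notebook cell displays them.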


+3