How to specify multiple tables in Spark SQL?

I have code where I need to register three tables. To do this, I call the jdbc function three times, once per table. See the code below:

    import java.util.Properties

    val props = new Properties()
    props.setProperty("user", "root")
    props.setProperty("password", "pass")

    val df0 = sqlContext.read.jdbc("jdbc:mysql://127.0.0.1:3306/Firm42", "company", props)
    val df1 = sqlContext.read.jdbc("jdbc:mysql://127.0.0.1:3306/Firm42", "employee", props)
    val df2 = sqlContext.read.jdbc("jdbc:mysql://127.0.0.1:3306/Firm42", "company_employee", props)

    df0.registerTempTable("company")
    df1.registerTempTable("employee")
    df2.registerTempTable("company_employee")

    val rdf = sqlContext.sql(
      """some_sql_query_with_joins_of_various_tables""".stripMargin)
    rdf.show

Is it possible to simplify my code? Or is there a way to specify multiple tables somewhere in the Spark SQL configuration?

1 answer

DRY:

    val url = "jdbc:mysql://127.0.0.1:3306/Firm42"
    val tables = List("company", "employee", "company_employee")

    val dfs = for { table <- tables }
      yield (table, sqlContext.read.jdbc(url, table, props))

    for { (name, df) <- dfs } df.registerTempTable(name)

Don't need the DataFrame references? Skip the first loop:

    for { table <- tables }
      sqlContext.read.jdbc(url, table, props).registerTempTable(table)
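As a side note, `registerTempTable` was deprecated in Spark 2.0 in favor of `createOrReplaceTempView`. A minimal sketch of the same loop against the newer API; `jdbcUrl` is a hypothetical helper (not part of Spark), and `spark` refers to a `SparkSession` you would already have in scope:

```scala
// Hypothetical helper: builds the MySQL JDBC URL from its parts.
def jdbcUrl(host: String, port: Int, db: String): String =
  s"jdbc:mysql://$host:$port/$db"

// With a SparkSession in scope (Spark 2.x+), the registration loop becomes:
// val url = jdbcUrl("127.0.0.1", 3306, "Firm42")
// for (table <- List("company", "employee", "company_employee"))
//   spark.read.jdbc(url, table, props).createOrReplaceTempView(table)
```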
