Spark built-in Hive: MySQL metastore not used

I am using Apache Spark 2.1.1, and I placed the following hive-site.xml file in the $SPARK_HOME/conf folder:

 <?xml version="1.0"?> <configuration> <property> <name>javax.jdo.option.ConnectionURL</name> <value>jdbc:mysql://mysql_server:3306/hive_metastore?createDatabaseIfNotExist=true</value> <description>JDBC connect string for a JDBC metastore</description> </property> <property> <name>javax.jdo.option.ConnectionDriverName</name> <value>com.mysql.jdbc.Driver</value> <description>Driver class name for a JDBC metastore</description> </property> <property> <name>javax.jdo.option.ConnectionUserName</name> <value>hive</value> <description>username to use against metastore database</description> </property> <property> <name>javax.jdo.option.ConnectionPassword</name> <value>password</value> <description>password to use against metastore database</description> </property> <property> <name>hive.metastore.schema.verification</name> <value>false</value> <description>password to use against metastore database</description> </property> <property> <name>hadoop.tmp.dir</name> <value>${test.tmp.dir}/hadoop-tmp</value> <description>A base for other temporary directories.</description> </property> <property> <name>hive.metastore.warehouse.dir</name> <value>hdfs://hadoop_namenode:9000/value_iq/hive_warehouse/</value> <description>Warehouse Location</description> </property> </configuration> 

When I start the Thrift server, the metastore schema is created in my MySQL database, but it is not used; Derby is used instead.

I could not find any error in the Thrift server log file. The only thing that catches my attention is that it first appears to use MySQL ( INFO MetaStoreDirectSql: Using direct SQL, underlying DB is MYSQL ), but then, without any error, it uses Derby instead ( INFO MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY ). Here is the Thrift server log: https://www.dropbox.com/s/rxfwgjm9bdccaju/spark-root-org.apache.spark.sql.hive.thriftserver.HiveThriftServer2-1-s-master.value-iq.com.out?dl=0

I do not have Hive installed on my system; I just intend to use the Hive support built into Apache Spark.

I am using mysql-connector-java-5.1.23-bin.jar, which is located in the $SPARK_HOME/jars folder.
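
To double-check which backend a session is really talking to, here is a small probe I can run from spark-shell (just a sketch; the table name metastore_probe is made up): create a table, then look for it in the TBLS table of the hive_metastore database in MySQL. If it never shows up there while a local metastore_db directory keeps growing, the MySQL settings are not being picked up.

    // Sketch: run in spark-shell started with the same $SPARK_HOME/conf setup.
    // Afterwards, check MySQL with: SELECT TBL_NAME FROM hive_metastore.TBLS;
    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("metastore-probe")
      .enableHiveSupport()              // use the external (Hive) catalog
      .getOrCreate()

    spark.sql("CREATE TABLE IF NOT EXISTS metastore_probe (id INT)")
    spark.sql("SHOW TABLES").show()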

mysql hive apache-spark spark-thriftserver
1 answer

As your hive-site.xml shows, you did not configure a connection to a standalone metastore service. Spark therefore falls back to the default, which is a local metastore with a Derby backend.
In order to use a metastore service backed by a MySQL database, you should:
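
A quick way to confirm the fallback (a diagnostic sketch only, not part of the fix): an embedded Derby metastore leaves a metastore_db directory and a derby.log file in the working directory of the process that started it (spark-shell or the Thrift server), so you can simply check for those artifacts:

    // Sketch: look for the artifacts of an embedded Derby metastore in the
    // current working directory. Their presence suggests Spark fell back to Derby.
    import java.nio.file.{Files, Paths}

    Seq("metastore_db", "derby.log").foreach { name =>
      val exists = Files.exists(Paths.get(name))
      println(s"$name present: $exists")
    }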

  • Start the metastore service. You can see how to do this in the Hive metastore administration guide. Start your metastore service with the MySQL database as its backend using the same hive-site.xml file, and add the following lines so that the metastore service is reachable at METASTORESERVER on port XXXX:

     <property>
       <name>hive.metastore.uris</name>
       <value>thrift://METASTORESERVER:XXXX</value>
     </property>
  • Let Spark know where the metastore service is running. This can be done with the same hive-site.xml you used when starting the metastore service (with the lines above added): copy this file to the Spark configuration path, then restart your Thrift server (see the sketch after this list).
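
As an alternative sketch (not needed if hive-site.xml is already in Spark's conf path), the metastore URI can also be passed directly when the session is built; METASTORESERVER:XXXX is the same placeholder as above:

    // Sketch: point Spark at the remote, MySQL-backed metastore service directly.
    // METASTORESERVER:XXXX is a placeholder for the real host and port.
    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("remote-metastore-check")
      .config("hive.metastore.uris", "thrift://METASTORESERVER:XXXX")
      .enableHiveSupport()
      .getOrCreate()

    // Databases should now come from the MySQL-backed metastore, not Derby.
    spark.sql("SHOW DATABASES").show()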

