I am running a standalone Spark cluster on my local Windows machine and am trying to load data from one of our PostgreSQL servers using the following code:
from pyspark.sql import SQLContext
sqlContext = SQLContext(sc)
df = sqlContext.load(source="jdbc", url="jdbc:postgresql://host/dbname", dbtable="schema.tablename")
I set SPARK_CLASSPATH as follows:
os.environ['SPARK_CLASSPATH'] = "C:\Users\ACERNEW3\Desktop\Spark\spark-1.3.0-bin-hadoop2.4\postgresql-9.2-1002.jdbc3.jar"
When executing sqlContext.load, it fails with the error "No suitable driver found for jdbc:postgresql". I have searched the internet but could not find a solution.
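For completeness, here is a minimal sketch of how the pieces fit together in my script; the host, database, table name, master URL, and app name are placeholders for my real values:

import os

# Point Spark at the PostgreSQL JDBC driver jar before the SparkContext (and its JVM) starts.
# Raw string so the Windows backslashes are not treated as escape sequences.
os.environ['SPARK_CLASSPATH'] = r"C:\Users\ACERNEW3\Desktop\Spark\spark-1.3.0-bin-hadoop2.4\postgresql-9.2-1002.jdbc3.jar"

from pyspark import SparkContext
from pyspark.sql import SQLContext

# Master URL is a placeholder for my local standalone setup on Windows.
sc = SparkContext("local[*]", "postgres-jdbc-test")
sqlContext = SQLContext(sc)

# This is the call that fails with "No suitable driver found for jdbc:postgresql".
df = sqlContext.load(source="jdbc",
                     url="jdbc:postgresql://host/dbname",
                     dbtable="schema.tablename")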
postgresql jdbc apache-spark pyspark apache-spark-sql