Configuring Spark to Use JDBC
You can use Spark to map tables from other databases into Vector through JDBC.
Follow this process to configure Spark to use JDBC (a consolidated shell sketch follows the procedure):
1. Download the JDBC driver JAR files for the database you want to connect to. Examples:
Vector: iijdbc.jar
Oracle: ojdbc8.jar or another ojdbc JAR, depending on the Oracle version
Zen database: pvjdbc2.jar, pvjdbc2x.jar, and jpscs.jar
2. Set the II_SPARKJARS environment variable to the path where the JDBC JARs reside.
Either issue the command ingsetenv II_SPARKJARS "/path/to/jars" or add the line export II_SPARKJARS=/path/to/jars to the .ingXXsh environment file.
3. Source .ingXXsh, where XX is the Vector instance ID.
4. Check that II_SPARKJARS has been updated by using the following command:
echo $II_SPARKJARS
5. Run the following command to configure JDBC:
iisuspark -jdbc
Lines like the following are added to $II_SYSTEM/ingres/files/spark-provider/spark_provider.conf:
spark.executor.extraClassPath /somepath/iijdbc.jar:/somepath/ojdbc8.jar
spark.driver.extraClassPath /somepath/iijdbc.jar:/somepath/ojdbc8.jar
6. Restart the Spark-Vector Provider and verify that the new JAR files have been found. If the JARs are not found, you will see errors like the following:
ERROR SparkContext: Jar not found at file:/somepath/iijdbc.jar
ERROR SparkContext: Jar not found at file:/somepath/ojdbc8.jar
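For reference, the following shell sketch consolidates steps 2 through 5. It assumes a hypothetical instance ID VW, a hypothetical JAR directory /opt/jdbc-jars, and an environment file in the installation owner's home directory; substitute your own instance ID and paths:

# Register the JAR directory with the installation (step 2, ingsetenv variant)
ingsetenv II_SPARKJARS "/opt/jdbc-jars"

# Or add the export to the environment file and source it (steps 2 and 3)
echo 'export II_SPARKJARS=/opt/jdbc-jars' >> ~/.ingVWsh
. ~/.ingVWsh

# Verify the variable (step 4), then regenerate the JDBC configuration (step 5)
echo $II_SPARKJARS
iisuspark -jdbc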
After Spark is configured, you can connect to a database and create an external table using FORMAT='jdbc':
CREATE EXTERNAL TABLE ext_jdbc_hello_ingres
(id INTEGER NOT NULL,
txt VARCHAR(20) NOT NULL)
USING SPARK WITH REFERENCE='dummy',
FORMAT='jdbc',
OPTIONS=('url' = 'db_connection_url',
'dbtable' = 'table_name',
'user' = '<username>',
'password' = '<password>')
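For example, assuming the source is a Vector or Ingres database reached through iijdbc.jar on a hypothetical host dbhost, DAS port VW7, database testdb, and source table hello_ingres (all placeholder names and credentials), the filled-in statement might look like this; the connection URL format differs for other JDBC drivers:

CREATE EXTERNAL TABLE ext_jdbc_hello_ingres
(id INTEGER NOT NULL,
txt VARCHAR(20) NOT NULL)
USING SPARK WITH REFERENCE='dummy',
FORMAT='jdbc',
OPTIONS=('url' = 'jdbc:ingres://dbhost:VW7/testdb',
'dbtable' = 'hello_ingres',
'user' = 'sparkuser',
'password' = 'sparkpass')

Once created, the external table can be queried like any other table:

SELECT id, txt FROM ext_jdbc_hello_ingres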