How to Add Extra Packages
By default, the Spark Provider supports only the data sources built into Spark (such as JDBC and JSON) and CSV data sources. (The Spark Provider is bundled with spark-csv 2.10.)
Follow this process to add extra data sources (packages):
1. Modify $II_SYSTEM/ingres/files/spark-provider/spark_provider.conf (as shown in the following examples).
2. Stop and start the Spark Provider as the installation owner to put the changes into effect, as follows:
ingstop -spark_provider
ingstart -spark_provider
Here are examples of modifying $II_SYSTEM/ingres/files/spark-provider/spark_provider.conf to add extra data sources:
• To add extra jars, add the line:
spark.jars comma-separated-list-of-jars
• To add extra packages, add the line:
spark.jars.packages comma-separated-list-of-packages
For example, to enable support for Cassandra (spark-cassandra) and Redshift (spark-redshift), add the line:
spark.jars.packages datastax:spark-cassandra-connector:1.4.4-s_2.10,com.databricks:spark-redshift_2.10:0.6.0
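The edit described above can also be scripted. The following is a minimal sketch, not part of the product: set_conf_property is a hypothetical helper name, and the sketch assumes the file uses the whitespace-separated "key value" form shown in the examples and that keeping a .bak copy next to the file is acceptable. It removes any existing line for the given key and appends the new one, so the key is set only once.

```shell
#!/bin/sh
# Sketch only: set_conf_property is a hypothetical helper, not an Ingres or Spark tool.
# Usage: set_conf_property FILE KEY VALUE
# Assumes FILE holds whitespace-separated "key value" lines.
set_conf_property() {
    conf_file=$1
    key=$2
    value=$3

    # Refuse to run against a file that does not exist.
    [ -f "$conf_file" ] || return 1

    # Keep the first backup only; later runs do not overwrite it.
    [ -f "$conf_file.bak" ] || cp "$conf_file" "$conf_file.bak"

    tmp_file=$(mktemp) || return 1

    # Copy every line whose first field is not KEY, then append "KEY VALUE".
    awk -v k="$key" '$1 != k' "$conf_file" > "$tmp_file"
    printf '%s %s\n' "$key" "$value" >> "$tmp_file"

    # Write the content back in place so the file keeps its owner and permissions.
    cat "$tmp_file" > "$conf_file"
    rm -f "$tmp_file"
}

# Example (commented out; needs an installation with II_SYSTEM set, run as the installation owner):
# set_conf_property "$II_SYSTEM/ingres/files/spark-provider/spark_provider.conf" \
#     spark.jars.packages \
#     "datastax:spark-cassandra-connector:1.4.4-s_2.10,com.databricks:spark-redshift_2.10:0.6.0"
# ingstop -spark_provider
# ingstart -spark_provider
```

A line written in key=value form would not be recognized as an existing entry by this sketch, so check the file by hand if it mixes both forms.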
Note: For Spark 1.5, to preserve a default Spark configuration (for example, /etc/spark/conf/spark-defaults.conf), you must include it in $II_SYSTEM/ingres/files/spark-provider/spark_provider.conf.
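One reading of this note is that the settings from the default configuration file have to be copied into spark_provider.conf. Under that assumption, the sketch below appends each setting from a defaults file that is not already present in the target file. The name merge_spark_defaults is hypothetical, and the sketch assumes both files use whitespace-separated "key value" lines with # comments; settings already in the target file take precedence.

```shell
#!/bin/sh
# Sketch only: merge_spark_defaults is a hypothetical helper, not an Ingres or Spark tool.
# Usage: merge_spark_defaults DEFAULTS_FILE CONF_FILE
# Appends to CONF_FILE every setting from DEFAULTS_FILE whose key CONF_FILE does not set yet.
merge_spark_defaults() {
    defaults_file=$1
    conf_file=$2

    # Both files must already exist.
    [ -f "$defaults_file" ] && [ -f "$conf_file" ] || return 1

    tmp_file=$(mktemp) || return 1

    # While reading CONF_FILE (the first argument), remember the keys it sets.
    # While reading DEFAULTS_FILE, print lines that are not blank, not comments,
    # and whose key was not seen in CONF_FILE.
    awk 'FILENAME == ARGV[1] { if (NF && $1 !~ /^#/) seen[$1] = 1; next }
         NF && $1 !~ /^#/ && !($1 in seen)' "$conf_file" "$defaults_file" > "$tmp_file"

    # Append the collected lines only after awk has finished reading CONF_FILE.
    cat "$tmp_file" >> "$conf_file"
    rm -f "$tmp_file"
}

# Example (commented out; paths as given in the note above):
# merge_spark_defaults /etc/spark/conf/spark-defaults.conf \
#     "$II_SYSTEM/ingres/files/spark-provider/spark_provider.conf"
```

As with any change to spark_provider.conf, the Spark Provider has to be stopped and started again before the merged settings take effect.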
Last modified date: 08/14/2024