How to Add Extra Packages
By default, the Spark-Vector Provider supports only the data sources integrated into Spark (such as JDBC and JSON) and CSV data sources. (The Spark-Vector Provider is bundled with spark-csv for Scala 2.10.)
Follow this process to add extra data sources (packages):
1. Modify $II_SYSTEM/ingres/files/spark-provider/spark_provider.conf (as shown in the following examples).
2. Stop and start the Spark-Vector Provider to put the changes into effect, as follows:
ingstop -spark_provider
ingstart -spark_provider
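The specific configuration lines to add are shown in the examples that follow. As an end-to-end illustration, the edit and restart can be done from a single shell session like the sketch below (the echoed setting is only a placeholder; use the jars or packages your installation actually needs):
# Append a data source setting to the provider configuration (placeholder value)
echo "spark.jars.packages com.databricks:spark-redshift_2.10:0.6.0" >> $II_SYSTEM/ingres/files/spark-provider/spark_provider.conf
# Restart the Spark-Vector Provider so the change takes effect
ingstop -spark_provider
ingstart -spark_provider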
Here are examples of modifying $II_SYSTEM/ingres/files/spark-provider/spark_provider.conf to add extra data sources:
To add extra jars, add the line:
spark.jars comma-separated-list-of-jars
To add extra packages, add the line:
spark.jars.packages comma-separated-list-of-packages
For example, to enable support for Cassandra (spark-cassandra-connector) and Redshift (spark-redshift), add the line:
spark.jars.packages datastax:spark-cassandra-connector:1.4.4-s_2.10,com.databricks:spark-redshift_2.10:0.6.0
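Taken together, a spark_provider.conf that adds both local jars and downloadable packages might contain lines like the following sketch (the jar paths are hypothetical placeholders; the package coordinates are the ones from the example above):
spark.jars /opt/connectors/my-connector.jar,/opt/connectors/my-connector-deps.jar
spark.jars.packages datastax:spark-cassandra-connector:1.4.4-s_2.10,com.databricks:spark-redshift_2.10:0.6.0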
Note:  For Spark 1.5, to preserve the settings from a default Spark configuration file (for example, /etc/spark/conf/spark-defaults.conf), those settings must be included in $II_SYSTEM/ingres/files/spark-provider/spark_provider.conf.
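One straightforward way to include those defaults is to append the existing file to the provider configuration before restarting the provider, as in this sketch (adjust the paths to your installation):
cat /etc/spark/conf/spark-defaults.conf >> $II_SYSTEM/ingres/files/spark-provider/spark_provider.conf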