15. Using External Tables
 
 
Introduction to External Tables
How External Tables Work
External Table Requirements
Starting and Stopping the Spark-Vector Provider
Spark-Vector Provider Log
Spark-Vector Provider Configuration
Syntax for Defining an External Table
Reading and Writing to an External Table
How to Add Extra Packages
External Table Limitations
External Table Usage Notes
Cohabitation of Vector and Spark-Vector under YARN
Introduction to External Tables
The External Tables feature lets you read from and write to data sources stored outside of Vector. The data source must be one that Apache Spark can read from and write to, such as files in HDFS stored in formats like Parquet, ORC, or JSON, or tables in external database systems.
The CREATE EXTERNAL TABLE statement creates a Vector table that points to existing data files in locations outside the Vector data directories. This eliminates the need to import the data into a new table when the data files are already in a known location and in the desired file format.
After the data file structure is mapped to Vector format using the CREATE EXTERNAL TABLE statement, you can:
Select, join, or sort external table data
Create views for external tables
Insert data into external tables
Import and store the data into a Vector database
The data is queried from its original locations and Vector leaves the data files in place when you drop the table.
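As a minimal sketch of the operations listed above, the following statements define an external table over a Parquet file and then query it, create a view on it, insert into it, and import it. The table, view, and column names and the HDFS path are hypothetical placeholders. The USING SPARK WITH REFERENCE form is assumed to match the syntax described under Syntax for Defining an External Table, so check it against that section before use.

```sql
-- Hypothetical names and path, for illustration only.
-- Map an existing Parquet file to a Vector external table.
CREATE EXTERNAL TABLE ext_sales (
    sale_id INTEGER NOT NULL,
    region  VARCHAR(20),
    amount  FLOAT
) USING SPARK
WITH REFERENCE='hdfs://namenode:8020/data/sales.parquet',
     FORMAT='parquet';

-- Select and sort external table data.
SELECT region, SUM(amount) AS total_amount
FROM ext_sales
GROUP BY region
ORDER BY region;

-- Create a view over the external table.
CREATE VIEW v_sales_emea AS
SELECT sale_id, amount
FROM ext_sales
WHERE region = 'EMEA';

-- Insert a row into the external data source.
INSERT INTO ext_sales VALUES (1001, 'EMEA', 250.0);

-- Import the external data into a regular Vector table.
CREATE TABLE sales_local AS
SELECT * FROM ext_sales;

-- Dropping the external table removes only the table definition;
-- the Parquet file stays where it is.
DROP VIEW v_sales_emea;
DROP TABLE ext_sales;
```

After the final DROP TABLE, sales_local still holds the imported copy, and the file under the hypothetical HDFS path is left untouched.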
The following statements cannot be performed against external tables:
MODIFY
CREATE INDEX