How External Tables Work
For the external table functionality, Vector leverages Apache Spark's extensive connectivity through the Spark-Vector Connector.
The External Tables architecture consists of two main components:
• Spark-Vector Provider
• Vector
Vector receives queries that operate on external tables from the user and rewrites them into JSON requests for external data, which are sent to the Spark-Vector Provider. The Spark-Vector Provider is a Spark application that behaves as a multi-threaded Spark server: it receives requests from Vector, translates them into Spark jobs, and launches them. These jobs typically issue SparkSQL queries such as "INSERT INTO vector_table SELECT * FROM external_resource" to read external data, or "INSERT INTO external_resource SELECT * FROM vector_table" to write to external systems. Finally, the jobs use the Spark-Vector Connector to push and pull data into and out of Vector.
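The read and write paths above can be sketched in SQL. The table names, column definitions, file path, and options below are illustrative assumptions, not part of this document; the general CREATE EXTERNAL TABLE ... USING SPARK form is the mechanism Vector uses to declare an external resource.

```sql
-- Hypothetical external table over a CSV file in HDFS
-- (table, column, path, and option values are illustrative)
CREATE EXTERNAL TABLE ext_orders (
    order_id  INTEGER,
    amount    DECIMAL(10,2)
) USING SPARK
WITH REFERENCE='hdfs://namenode:8020/data/orders.csv',
     FORMAT='csv',
     OPTIONS=('header'='true', 'delimiter'=',');

-- Read path: Vector rewrites this into a JSON request for the
-- Spark-Vector Provider, which launches a Spark job of the form
-- "INSERT INTO vector_table SELECT * FROM external_resource".
INSERT INTO orders SELECT * FROM ext_orders;

-- Write path: the reverse direction, pushing Vector data out
-- to the external system.
INSERT INTO ext_orders SELECT * FROM orders;
```

In both directions the data movement between Spark and Vector itself is handled by the Spark-Vector Connector.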
The DBMS configuration parameter insert_external in config.dat controls whether inserts into X100 external tables are allowed. The default is ON, which allows inserts; setting the parameter to OFF blocks them.
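A sketch of the corresponding config.dat entry follows. The exact key prefix depends on the installation; the hostname placeholder and key path shown here are assumptions, only the insert_external parameter name and its ON/OFF values come from this document.

```
# Block inserts into X100 external tables (default is ON).
# <hostname> and the key path are placeholders; verify against
# the actual config.dat of the installation.
ii.<hostname>.x100.insert_external: OFF
```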
Last modified date: 06/28/2024