Query Processing
A typical interaction with Vector consists of at least three processes:
• Vector tool or application
• DBMS Server process
• X100 Engine process
The Tool or Application Process
The Vector tool or application performs the following actions:
• Takes user input and issues a query that is sent to the DBMS Server.
• Formats, optimizes, and executes the query on behalf of the user.
• Displays the resulting data returned by the DBMS Server.
DBMS Server Process
The DBMS Server is a multi-threaded process. It can execute queries for many users, each running a Vector application. Even though it is a single process, the DBMS Server can execute queries as multiple “sessions” on behalf of multiple users. You can view which sessions are running in the DBMS Server at any moment by using the ipm and iimonitor tools.
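As a rough illustration of one process serving several sessions at once, the following Python sketch uses a thread pool and a session table. It is a conceptual model only: the ToyServer class and all of its methods are hypothetical names invented for this example and are not part of Vector, ipm, or iimonitor.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ToyServer:
    """Hypothetical stand-in for a multi-threaded server (not Vector's API)."""

    def __init__(self, max_threads=4):
        # One process, several worker threads shared by all sessions.
        self._pool = ThreadPoolExecutor(max_workers=max_threads)
        self._lock = threading.Lock()
        self._sessions = {}  # session id -> user name
        self._next_id = 1

    def open_session(self, user):
        # Register a new session and return its id.
        with self._lock:
            sid = self._next_id
            self._next_id += 1
            self._sessions[sid] = user
        return sid

    def execute(self, sid, query):
        # The query runs on a pool thread inside the same process.
        return self._pool.submit(self._run, sid, query)

    def _run(self, sid, query):
        with self._lock:
            user = self._sessions[sid]
        return f"session {sid} ({user}): ran {query!r}"

    def active_sessions(self):
        # Snapshot of the sessions currently registered.
        with self._lock:
            return dict(self._sessions)

    def close_session(self, sid):
        with self._lock:
            self._sessions.pop(sid, None)

    def shutdown(self):
        self._pool.shutdown(wait=True)


server = ToyServer()
alice = server.open_session("alice")
bob = server.open_session("bob")
results = [server.execute(alice, "SELECT 1"), server.execute(bob, "SELECT 2")]
for future in results:
    print(future.result())
print(server.active_sessions())
server.shutdown()
```

The session table plays the role of the view that a monitoring tool would present: it lists who is connected at a given moment, independently of which thread happens to run each query.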
Relationship Between DBMS Server and X100 Engine
The DBMS Server handles client connections and all SQL processing, including parsing SQL queries, and generates detailed query plans in the form of “X100 algebra” for the X100 Engine. The X100 Engine executes the translated query, using column storage, vector processing, and other advanced technologies to maximize query performance.
VectorH includes a full Vector installation on a single node (referred to as the master node), including the complete Vector toolset, the Ingres front-end, and the X100 engine. Client applications connect to the Ingres front-end on this node.
The engine creates a distributed query plan that is communicated to all nodes participating in the query execution (referred to as the slave nodes). These nodes run the engine, which is installed automatically during VectorH setup.
All X100 engines on the master node and slave nodes participate in query execution. The engines may exchange intermediate query results with each other when needed. The slave engines send their partial query results to the master engine, which aggregates them and sends the final result back to the client. Because of this architecture, the VectorH master node and all the slave nodes are installed on Hadoop DataNodes to ensure data locality and local access to the database files.
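The flow of partial results from the nodes to the master can be illustrated with a small Python sketch that computes an average. Each node reduces its local rows to a partial result, and the master merges the partials. This is a conceptual model under simplifying assumptions, not the X100 engine's implementation; the names Partial, local_aggregate, and merge_partials and the sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Partial:
    """Partial result one node sends to the master (hypothetical shape)."""
    total: float
    count: int


def local_aggregate(rows):
    # Work done by one node's engine on the rows stored locally.
    return Partial(total=sum(rows), count=len(rows))


def merge_partials(partials):
    # Work done on the master: combine partial results into the final answer.
    total = sum(p.total for p in partials)
    count = sum(p.count for p in partials)
    return total / count if count else None


# Illustrative data: each node holds a different slice of one column.
node_data = {
    "master": [10.0, 20.0],
    "slave1": [30.0],
    "slave2": [40.0, 50.0, 60.0],
}

partials = [local_aggregate(rows) for rows in node_data.values()]
print(merge_partials(partials))  # average over all six values
```

Sending a sum and a count instead of the raw rows keeps the data exchanged between nodes small, which is one reason partial aggregation suits a layout where each node reads its own local files.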
Query Environment
When a thread or session executes a query inside the DBMS Server, it does so in a query environment. The query environment consists of:
• A quantity of resources available from the operating system for use by the session.
• The rules under which the query is executed. These rules reflect which query language is used, which locking strategy is employed, which diagnostic information is returned, which default behavior Vector adopts for various query language statements, and so on.