Vector Concepts
Table Structures
In broad terms, Vector uses columnar storage. Data is presented and queried as familiar relational rows, but it is stored differently internally: instead of keeping all column values for a single row together, Vector stores the values of a single column across all rows together. This format benefits data warehouse applications through reduced I/O and improved processing efficiency.
A relational table may have dozens or hundreds of columns. In traditional row-oriented relational storage, the entire row must be read even if a query requests only a few columns. With column-oriented storage, a query that needs only a few columns will not read the remaining columns from disk and will not waste memory storing unnecessary values. A side benefit of this columnar storage is that compression (see Data Compression) can be more effective.
While Vector is fundamentally a column store, it is not strictly so. For certain types of tables, Vector stores data from more than one column together in a data block on disk. Within a data block the values for each column are kept together so that data for a column can be processed efficiently in vectors. Vector also provides a storage type that lets you indicate you want to store all columns for a table in a single block.
In VectorH, tables are also partitioned horizontally, based on a hash value of the partition key column(s). A single data file stored in HDFS will contain data for a single column (with the exceptions mentioned above) and a single partition.
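For example, a VectorH table partitioned on a hash of its key might be created as follows (a sketch; the table, columns, and partition count are hypothetical):

CREATE TABLE lineitem (
    l_orderkey  INTEGER NOT NULL,
    l_quantity  DECIMAL(15,2) NOT NULL
) WITH PARTITION = (HASH ON l_orderkey 8 PARTITIONS);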
Data Storage Format
A database consists of multiple files and can reside in multiple locations. Tables can also be spread across multiple locations.
Updates, inserts, and deletes to the data, as well as changes to the layout of the data in the data files, are stored as log information on HDFS in II_HDFSDATA/ingres/data/vectorwise/dbname/wal. The data files and the wal directory together represent the database and must be considered as a unit.
A data file consists of multiple blocks, which can be seen as the equivalent of pages. Each block contains (possibly compressed) data from one or more attributes. The block size can be configured with the [cbm] block_size parameter before creating a database.
Also, for better sequential access performance, multiple blocks from the same column can be stored contiguously in a block group. The size of this group can be configured with the [cbm] group_size parameter.
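For illustration, a sketch of how the two parameters might appear in the [cbm] section of the instance configuration file (the file layout and the values shown are illustrative assumptions, not recommendations):

[cbm]
block_size = 512K
group_size = 8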
Database size is unlimited because data can be spread across multiple disks. If a table or a column is dropped, its corresponding file or files are deleted, and the disk space is reclaimed by the operating system.
Note:  Due to concurrent transactions, the system can internally still contain references to a table or a column even when it is dropped, so the effect of dropping a table or column may not be seen immediately but only after all concurrent transactions have ended.
Raw Table Storage Format
The default storage format for a Vector table (VECTORWISE) is also known as a RAW table. This format stores the data on disk in columns in the order in which it is inserted, and the data is compressed.
In addition, for each column the system automatically maintains simple statistics about the data distribution by storing the minimum and maximum value for ranges of tuples. If a query includes a filter on the column data value, the execution engine uses these min-max data values to decide whether to examine the data in a given block. Since min-max information is maintained in memory outside of the data block, this simple form of automatic indexing can dramatically reduce disk I/O and memory usage. You can configure the granularity of the min-max index with the [cbm] minmax_maxsize parameter.
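For example, if a (hypothetical) sales table is loaded roughly in date order, a range filter on the date column lets the execution engine skip every block whose min-max range falls outside the predicate:

SELECT SUM(amount)
FROM   sales
WHERE  sale_date BETWEEN DATE '2023-01-01' AND DATE '2023-01-31';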
Data Compression
Columnar storage inherently makes compression (and decompression) more efficient than does row-oriented storage.
For row-oriented data, choosing a compression method that works well for the variety of data types in a row can be challenging, because text data and numeric data compress best with different algorithms.
Column storage allows the algorithm to be chosen according to the data type and the data domain and range, even where the domain and range are not declared explicitly. For example, an alphabetic column GENDER defined as CHAR(1) will have only two actual values (M and F), and rather than storing an eight-bit byte, the value can be compressed to a single bit, and then the bit string can be further compressed.
Vector uses different compression algorithms from those found in most other products. Because Vector processes data so quickly, its compression, and especially its decompression, is designed to use little CPU while still reducing disk I/O. On-disk compression ratios may therefore be slightly lower than in other products, but overall performance is higher.
Compression in Vector is automatic, requiring no user intervention. Vector chooses a compression method for each column, per data block, according to its data type and distribution.
Data Type Storage Format and Compression Type
Data types are stored internally in a specific format. The type of compression used depends on the data type and distribution.
Vector can use any of the following compression methods:
RLE (Run Length Encoding) – This method is efficient if many duplicate adjacent tuple values are present (such as in ordered columns with few unique values).
PFOR (Patched Frame Of Reference) – This method encodes values as a small difference from a page-wide base value. The term “Patched” indicates that FOR is enhanced with a highly efficient way to handle values that fall outside the common frame of reference. PFOR is effective on any data distribution with some value distribution locality.
PFOR-DELTA (delta encoding on top of PFOR) – This method makes integers smaller by encoding the differences between subsequent values. For example, the ordered sequence 1000, 1002, 1003, 1007 can be stored as the base value 1000 followed by the small deltas 2, 1, and 4. PFOR-DELTA is highly effective on ordered data.
PDICT (dictionary encoding; offsets into a dictionary of unique values) – This method is efficient if the value distribution is dominated by a limited number of frequent values.
LZ4 – This algorithm detects and encodes common fragments of different string values. It is particularly efficient for medium and long strings.
Most INTEGER, DECIMAL, DATE, and TIME types are compressed internally using any of the first four compression methods.
FLOAT and FLOAT4 types are stored without compression in Vector tables.
Character types (CHAR, VARCHAR, NCHAR, NVARCHAR) of lengths larger than one are stored internally as variable-width strings. This data can be automatically compressed using either a per-block dictionary or the LZ4 algorithm.
The null indicator, if needed, is represented internally as a separate single-byte column, which is compressed using the RLE method. Loading and processing nullable columns can be slower than for non-nullable columns.
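Because of this overhead, declare columns NOT NULL where the data allows it. A minimal sketch (the table is hypothetical):

CREATE TABLE events (
    event_id  INTEGER NOT NULL,   -- no null-indicator column needed
    note      VARCHAR(100)        -- nullable: stored with a separate null-indicator column
);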
Query Optimization
Analytical queries are the focus of Vector. Because such queries can access large amounts of data, the optimizer needs accurate information about the characteristics of the underlying data: the sizes of intermediate processing stages must be estimated properly to produce optimal query plans.
To allow the system to generate optimal query plans, we highly recommend running optimizedb or CREATE STATISTICS if 10% or more of the data in a table has changed since the last optimization. Vector will automatically construct histograms from live data for those columns that do not already have histograms stored in the catalog.
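For example, after changing a substantial portion of a (hypothetical) sales table:

CREATE STATISTICS FOR sales;

The optimizedb command-line utility can be used to the same effect.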
Data Manipulation
Vector provides a rich set of data updating capabilities.
Granularity of DML Operations
Vector distinguishes between two granularities of DML operations:
Batch – Operations that involve a relatively small number (for example, thousands or tens of thousands) of records. Such updates are buffered in the system memory and merged with the data from disk on the fly.
INSERT, DELETE, and UPDATE are always executed as batch operations. COPY FROM and INSERT...SELECT are executed as batch operations if the destination table has an index declared and is not empty.
Bulk – Operations that involve many records (for example, one million or more), in which the data changes do not necessarily fit in the system memory.
CREATE TABLE AS SELECT is always executed as a bulk operation. Data load through vwload, COPY FROM and INSERT...SELECT are executed as bulk operations if the destination table has no index declared or if it is empty.
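A minimal sketch of the distinction, assuming a hypothetical sales table with no index:

-- Bulk: CREATE TABLE AS SELECT is always executed as a bulk operation.
CREATE TABLE sales_2023 AS
    SELECT * FROM sales WHERE sale_year = 2023;

-- Batch: plain DML statements are buffered in memory and merged on the fly.
UPDATE sales_2023 SET status = 'closed' WHERE sale_id = 42;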
For more information, see Batch and Bulk Updates.
Persistence and Recovery
A COMMIT statement makes all changes from a transaction become persistent (including batch operations). In addition, Vector provides full automatic recovery. If a system crashes during a transaction, any changes made by that transaction will not be visible after the system restart, but all successfully committed transactions will be restored.
Management of In-memory Updates
Batch updates are stored in a special highly-optimized differential data structure, called the Positional Delta Tree (PDT), in main memory. Data is also written to disk immediately to ensure recoverability.
Because batch updates held in memory can consume a significant percentage of useful system memory, Vector automatically propagates the changes buffered in memory to the disk-resident table. Propagation is triggered by factors such as table size and the number of buffered DML operations.
The volume of batch updates in such a propagation cycle should fit in main memory.
Update propagation can be a relatively expensive operation if there are many modified rows, especially when the updates are distributed evenly across the table. Vector tries to rewrite only the affected blocks unless doing so would result in excessive table fragmentation.
The best practice is to avoid propagation to large tables (see How to Avoid Propagation) and use appends where possible.
The amount of memory reserved for Positional Delta Trees and the amount of differences after which propagation is triggered can be configured with system configuration parameters. For more information, see PDT Parameters in the System Administrator Guide. In cases with high update rates, the memory required for PDTs can become too high, in which case you will have to work with raw tables or use bulk operations, such as the MODIFY...TO COMBINE statement.
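For example, to force propagation of the buffered updates for a single (hypothetical) table:

MODIFY sales TO COMBINE;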
Transaction Isolation Model
Vector uses a snapshot isolation model with optimistic concurrency control.
Snapshot Isolation
Snapshot isolation is a transaction isolation model in which a given transaction always sees the same consistent state of the system: the state at the moment the transaction started. As a result, phenomena seen in certain other isolation levels (dirty, non-repeatable, and phantom reads) are avoided. Snapshot isolation is closely related to multiversion concurrency control.
Snapshot isolation is slightly weaker than the serializable isolation level available in some systems.
The main benefit of snapshot isolation is that, while providing a high transaction isolation level, it avoids locking for read operations; such locks can be detrimental to database performance. Combined with optimistic concurrency control, it allows Vector to achieve high performance for update operations without affecting the performance of read-only queries.
For more information, see:
http://en.wikipedia.org/wiki/Snapshot_isolation
http://en.wikipedia.org/wiki/Multiversion_concurrency_control
http://en.wikipedia.org/wiki/Isolation_%28database_systems%29
Optimistic Concurrency Control
Unlike some systems, Vector uses the optimistic concurrency control model. In this model, transactions can make changes to the data without checking for conflicts with other transactions that may have committed in the meantime. Conflicts are checked only when a transaction commits. This applies not only to conflicts caused by changing the same data, but also to constraint violations that arise only because two transactions both commit (for example, two concurrent transactions inserting a record with the same UNIQUE field).
Compared to other approaches, optimistic concurrency control avoids locking records for concurrently working transactions. This results in superior performance in cases where transactions typically do not conflict with each other.
The most visible aspect of optimistic concurrency control, and one that sometimes causes problems for users accustomed to other solutions, is that conflicts are detected only at commit time. The application logic must take this into account.
For more information, see http://en.wikipedia.org/wiki/Optimistic_concurrency_control.
Transaction Examples
Non-dirty Read Example
This example demonstrates that changes committed by one transaction are not visible to concurrently running transactions. Transaction A inserts a value, but Transaction B does not see it until Transaction B starts a new transaction.
Transaction A                                    Transaction B
BEGIN TRANSACTION;
SELECT A FROM T; --sees the old state
                                                 BEGIN TRANSACTION;
                                                 SELECT A FROM T; --sees the same old state
INSERT INTO T VALUES (7);
SELECT A FROM T; --sees the old state + value 7
                                                 SELECT A FROM T; --sees the old state without 7
COMMIT;
                                                 SELECT A FROM T; --sees the old state without 7
                                                 COMMIT;          --new transaction starts
                                                 SELECT A FROM T; --sees the new state including 7
Conflict Resolution Example
In this example, two transactions concurrently insert a record with the same value in a primary key attribute. This causes a conflict for the second committing transaction due to a constraint violation.
--Empty Table T has a PRIMARY KEY on attribute PK

Transaction A                                    Transaction B
BEGIN TRANSACTION;
SELECT A FROM T; --sees an empty table
INSERT INTO T VALUES(6);
INSERT INTO T VALUES(7);
SELECT A FROM T; --sees 6 and 7
                                                 BEGIN TRANSACTION;
                                                 SELECT A FROM T; --sees an empty table
                                                 INSERT INTO T VALUES(5);
                                                 INSERT INTO T VALUES(7);
SELECT A FROM T; --sees 6 and 7
                                                 SELECT A FROM T; --sees 5 and 7
COMMIT;          --success
                                                 SELECT A FROM T; --sees 5 and 7
                                                 COMMIT;          --fails due to a primary key violation on 7
Note:  Concurrent updates on a table with a clustered index on foreign key columns will conflict; only one of the transactions will be allowed to commit. Others will return an error and be aborted.
Transactions and DDL Operations
Snapshot isolation and optimistic concurrency control are applied to operations on data (DML).
For operations that change the database schema (DDL), a different isolation model is used: a transaction sees DDL changes performed by other transactions only after those transactions have committed. If two transactions issue conflicting DDL operations (for example, both create a table with the same name), the system blocks the second transaction until the first either commits or aborts.
SQL Access
Vector is a relational database with SQL access based on Ingres Database. The higher software layers that database administrators, database users, and database applications interact with, such as the management tools and the JDBC API, are identical to those of traditional Ingres Database.
The difference from traditional Ingres is that the data is stored in Vector tables. This data is processed using the specialized Vector engine, allowing much higher performance for analytical database tasks.
Although it is possible to have both Vector and Ingres Database installations on the same server, this is not desirable. We recommend that Vector be installed on a dedicated server.
While a database can contain a mixture of both Vector and traditional Ingres tables, Vector does not support mixing both types of tables in the same query. You can, however, easily move data between Ingres and Vector tables directly and create tables in traditional Ingres or Vector using CREATE TABLE AS SELECT or INSERT...SELECT statements.
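For example, because VECTORWISE is the default storage structure (see VECTORWISE Storage Type), a plain CREATE TABLE AS SELECT copies an existing Ingres table into a new Vector table (the table names are hypothetical):

CREATE TABLE sales_vector AS
    SELECT * FROM sales_ingres;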
VECTORWISE Storage Type
By default, Vector uses the VECTORWISE storage structure.
The VECTORWISE storage type differs from traditional Ingres storage types in the following ways:
Queries accessing VECTORWISE tables use an alternative execution module that is optimized for analytical processing.
A query that accesses tables using VECTORWISE storage cannot access tables using other Ingres storage types.
Some operations are not available when using the VECTORWISE storage type.
Storage Structures
A Vector table can be created with one of two storage types:
VECTORWISE
(Default) Stores data in columns. Puts as few columns as possible in one disk block (in most cases only one).
VECTORWISE_ROW
Stores data in rows. Puts as many columns as possible in one disk block.
Within a block, values are kept in compressed vectors; potentially, all columns of a table can be stored in a single disk block. This type is useful for small tables, especially those with many columns, because it saves disk space and disk cache memory.
The result_structure parameter in config.dat sets the default storage structure for the Vector instance. When a table is created, the default storage structure is assumed.
To override the default, specify SET RESULT_STRUCTURE for the session or use WITH STRUCTURE=vw_structure in the CREATE TABLE statement.
You can create a traditional Ingres table in a Vector instance by overriding the default storage setting on a per-table basis, using WITH STRUCTURE = HEAP in the CREATE TABLE statement.
Note:  You must be licensed to use HEAP tables in Vector.
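For example, a small lookup table might be created with the row-oriented structure (a sketch with hypothetical names):

CREATE TABLE region_codes (
    code  CHAR(2)     NOT NULL,
    name  VARCHAR(64) NOT NULL
) WITH STRUCTURE = VECTORWISE_ROW;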
Default Database Characteristics
Vector stores CHAR and VARCHAR data types as UTF-8, using UCS-2 with the UCS_BASIC collation. The default normalization form is NFC, but it can be changed to the less common NFD by using createdb -n.
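For example (the database name is hypothetical), the first command creates a database with the default NFC normalization and the second uses NFD:

createdb mydb
createdb -n mydb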
Transaction Management
By default, autocommit is turned off, which means a transaction can consist of multiple statements and is ended by either a COMMIT or a ROLLBACK statement.
If an error occurs during the execution of an SQL statement, the default is to roll back the statement. You can change this behavior and roll back the entire transaction if an error occurs by issuing SET SESSION WITH ON_ERROR = ROLLBACK TRANSACTION.
To determine the status of transaction error handling, issue the SELECT DBMSINFO('ON_ERROR_STATE') statement.
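For example, to switch a session to transaction-level rollback and then verify the setting:

SET SESSION WITH ON_ERROR = ROLLBACK TRANSACTION;
SELECT DBMSINFO('ON_ERROR_STATE');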