Storage Innovations: Beating the Disk Bottleneck
Any database system with such a high computational speed runs the risk of becoming I/O bound. For this reason, the second major component of Vector consists of storage innovations designed for high I/O throughput. These innovations include:
• Columnar data layout
• Advanced compression
• Storage indexes
The Vector storage mechanism uses columnar data layout, which allows analytical queries to avoid disk access for columns not involved in a query. While you can generally think of Vector storage as a column store, Vector can mix columnar and row-based storage so that certain columns that are always accessed together get stored in the same disk block. Layout decisions are handled automatically by the system, but can also be controlled by the user.
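The benefit of a columnar layout can be sketched with a toy example. The code below is an illustration only and says nothing about Vector's on-disk format: the table, the column names, and both helper functions are hypothetical. It counts how many stored values a single-column aggregate has to read under each layout.

```python
# Illustrative only: a toy table stored two ways. The table contents and
# column names (id, name, amount) are made up for this sketch.
rows = [
    (1, "alice", 120.0),
    (2, "bob", 80.5),
    (3, "carol", 42.0),
]

# Row layout: all fields of a record are stored together.
row_store = list(rows)

# Columnar layout: each column is stored as its own contiguous array.
column_store = {
    "id": [r[0] for r in rows],
    "name": [r[1] for r in rows],
    "amount": [r[2] for r in rows],
}


def sum_amount_row_store(store):
    """Return (total, values_read); every field of every record is read."""
    total = 0.0
    values_read = 0
    for record in store:
        values_read += len(record)  # the whole record comes along
        total += record[2]
    return total, values_read


def sum_amount_column_store(store):
    """Return (total, values_read); only the 'amount' column is read."""
    column = store["amount"]
    return sum(column), len(column)
```

In this sketch the row layout reads every field of every record to sum one column, while the columnar layout reads only the values of that column. The gap widens as the table gets more columns.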
To further keep I/O from becoming a performance bottleneck, Vector uses a number of advanced compression schemes that are designed for fast decompression. As a result, reading compressed data means that less data needs to come from disk, yet queries do not slow down due to decompression.
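This section does not name the compression schemes Vector uses, so the following sketch substitutes run-length encoding purely as an assumed stand-in for a lightweight scheme that is cheap to decompress. The function names and the sample column are hypothetical.

```python
# Illustrative only: run-length encoding as an example of a lightweight
# scheme with cheap decompression. This is NOT claimed to be one of
# Vector's schemes; the sample column is made up.
def rle_encode(values):
    """Collapse runs of equal values into [value, run_length] pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1  # extend the current run
        else:
            runs.append([v, 1])  # start a new run
    return runs


def rle_decode(runs):
    """Expand [value, run_length] pairs back into the original values."""
    out = []
    for value, length in runs:
        out.extend([value] * length)
    return out


country = ["DE"] * 4 + ["FR"] * 2 + ["DE"]
encoded = rle_encode(country)
```

Here seven values shrink to three runs, and decoding is a single pass that only repeats values, which is the kind of trade-off the paragraph above describes: fewer bytes to read, little work to restore them.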
Finally, Vector uses storage indexes. A storage index is small: it stores only the minimum and maximum value per data block. It is created and maintained automatically, and it enables the execution engine to rapidly identify candidate data blocks, that is, blocks whose value range could satisfy a query predicate, so that the remaining blocks need not be read.
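A minimum/maximum index of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not Vector's implementation: the block size, the sample values, and the function names are hypothetical, and the predicate is assumed to be a simple range filter.

```python
# Illustrative only: a per-block min/max index used to skip blocks for a
# range predicate. Block size and data are made up for this sketch.
def build_storage_index(values, block_size):
    """Return one (min, max) pair per block of block_size values."""
    index = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        index.append((min(block), max(block)))
    return index


def candidate_blocks(index, low, high):
    """Return block numbers whose [min, max] range overlaps [low, high]."""
    return [
        block_no
        for block_no, (block_min, block_max) in enumerate(index)
        if block_max >= low and block_min <= high
    ]


values = [1, 3, 5, 10, 12, 14, 20, 25, 30, 31, 40]
index = build_storage_index(values, block_size=3)
hits = candidate_blocks(index, low=12, high=22)
```

For a filter such as "value between 12 and 22", only the blocks listed in `hits` can contain matching rows; the other blocks are ruled out from the index alone. A candidate block may still contain no match, so it has to be scanned to confirm.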