Vector 7.0 | New Features in Vector Version 7.0

Currently, external tables can connect to a single data source, such as a CSV file or a remote Vector table, with a table representation on the Vector side and allowing users to perform basic ETL tasks.

This new feature includes the following:

• enables the specification of multiple SparkSQL statements or full-fledged Scala scripts

• extends the external table logic to allow for referencing and combining multiple data sources and

• supports all functionalities and libraries that Spark offers, including Machine Learning, in order to perform complex ETL tasks.

For more information see, CREATE EXTERNAL TABLE in the SQL Language Guide.

Ad hoc Auto Partitioning

This feature alters the behavior of the x100_partition_scheme “Auto” to use different partitioning schemes. The x100_partition_scheme dbms parameter determines the default partition scheme to be used if the WITH PARTITION clause is not specified when an X100 table is created.

Automatic Cleanup of Unused Files and Enhancements to Manual Cleanup Process

Any unused files generated in the X100 server are now automatically cleaned up when the X100 server starts. For more information on the new configuration parameters added for this feature, see [cbm] Settings in the System Administrator guide.

The existing process of manual cleanup of unused files is also enhanced. For more information, see CLEANUP_UNUSED_FILES Command in the SQL Language Guide for the new options added to the command.

BYTE and VARBYTE Support for X100 Tables

BYTE and VARBYTE binary data types are now supported for X100 tables. These data types store binary data such as graphics, which cannot be easily stored using character or numeric data types.

Additionally, new SQL functions are added and existing functions are updated to support BYTE and VARBYTE values on X100 tables.

For more information, see Data Types, Binary Functions, Conversion Functions and Bitwise Aggregate Functions in the SQL Language Guide.

Cloud Checkpoint Locations

This feature adds the ability to create and store a checkpoint on a cloud provider. Cloud checkpoints are created using the open source rclone utility. See Cloud Checkpoint Locations in the Database Administrator Guide.

For more information, see Cloud Checkpoint Locations in the User Guide.

Containerized UDFs

Containerized Python and Java UDFs are release features that can be used in a production environment. You can choose to install them automatically while installing the product.

COPY VWLOAD Load Statistics and Results Statistics

During execution of a COPY VWLOAD query, the COPY VWLOAD Load Statistics feature shows the current progress status of the loading process or shows if the load has hung up. After the load job completes, the Results Statistics can be queried after executing the COPY VWLOAD query to obtain the load statistics for the particular query.

Credential Management

A new management of encryption keys for database encryption supports the following new options for the SQL statement ALTER PASSPHRASE:

• The option NEWKEY now is truly optional. Not specifying WITH NEWKEY changes just the database passphrase, but not the database encryption key that is protected by the passphrase.

• The new option STASH allows the database passphrase to be internally stashed. This avoids the need to unlock the database, e.g. after a system restart. A database with a stashed passphrase cannot be locked.

• The new option PBKDF allows the specification of parameters for the Password Based Key Derivation Function (PBKDF). These parameters help the administrator to adapt the security to new requirements.

For more information, see Encryption Key Management in the Security Guide.

Download UDF Container Image from Repository

You can download the UDF container image from the Docker repository during or after installation.

For more information, see iisuudfc Command--Download UDF Container Image from Repository in the User Guide.

ELEMENT and ELEMENT_COUNT Functions

The ELEMENT function splits the input string into elements separated by a specified separator and returns the specified number of elements concatenated by separators starting or ending at an element of a given index.

The ELEMENT_COUNT function splits the source string into elements separated by a specified separator and returns the number of elements within input source string.

For more information, see ELEMENT Function and ELEMENT_COUNT Function in the SQL Language Guide.

Extend Pattern-Matching Capabilities

This release introduces the implementation of LIKE_REGEX SQL standard predicate on the X100 engine, enabling more powerful query filtering. Furthermore, the X100 engine is also provided with the support for SUBSTRING functionality with pattern-matching predicates such as LIKE, SIMILAR TO, BEGINNING, ENDING and LIKE_REGEX, allowing for a more precise data retrieval and manipulation.

For more information, see Pattern-matching Predicates in the SQL Language Guide.

Federated Identity Management

This feature allows users to access multiple networks or domains by using a single set of credentials. By using this single set of credentials, users can be authorized between different services with ease.

For more information, see Implementing FIM in the Security Guide.

FIRST N AFTER M

This feature enables the ability of the FIRST n qualifier to be used in sub-selects as well as the main query and it is widened to include the offset capability using FIRST n AFTER m as a SELECT qualifier.

For more information, see First N After M Clause in the SQL Language Guide.

iilogstat and iilockstat Commands

The iilogstat and iilockstat commands are identical to the existing logstat and lockstat commands and display information from the locking and logging system. The only difference is that the iilogstat and iilockstat commands can be run only by the installation owner and do not require the database server to be up and running.

For more information, see lockstat/iilockstat Command--Display Locking Status and logstat/iilogstat Command--Display Logging Status in the User Guide.

Inbound/Outbound Encryption Mode ‘ON’ By Default

For all new installations of Vector 7.0, the inbound and outbound encryption mode (ib_encrypt_mode and ob_encrypt_mode) is set to ON by default. There are no configuration changes for existing Vector installations.

For more information, see Configure AES Encryption (Ingres Net and GCA Remote Access) in the Security Guide.

IS BOOLEAN Predicate

The IS BOOLEAN predicate can be used to determine if a value can be converted into a Boolean data type.

For more information, see IS BOOLEAN Predicate in the SQL Language Guide.

Lock Encrypted Database without Passphrase

This feature adds the ability to lock an encrypted database in an emergency situation. The database owner can run the DISABLE PASSPHRASE statement without specifying the passphrase.

For more information, see DISABLE PASSPHRASE in the SQL Language Guide.

Machine Learning Model (MLM)

This feature captures the integration of popular ML Libraries with the goal to perform model scoring/inference in X100. A Machine Learning Model can be created using the Create model in Tensorflow and it is available only for UNIX installations.

For more information, see Configuring Machine Learning Models (MLM) in the System Administrator Guide.

Password Policy

This feature applies password policy rules regarding password length, characters, and patterns employed.

For more information, see Password Policy Configuration in the Security Guide.

Performance Improvements

Vector 7.0 has been evaluated for performance against TPC-H and TPC-DS workloads. The performance validation for TPC-DS shows an improved runtime of about twenty five percent as compared to Vector 6.3. The performance validation for TPC-H shows comparable runtimes across Vector 6.3 and Vector 7.0.

The new "Smart MinMax" feature provides the same expected query performance while reducing the memory pressure of MinMax indices significantly. The "WAL Reply Optimizations" have revealed significantly reduced startup times for the Vector 7.0 X100 server as compared to Vector 6.3.

GCC Het-Net processing:

The performance enhancements for GCC Het-Net Processing include:

• Simplified single-byte charset processing by separating from double-byte and multi-byte processing.

• Negotiate transport format for integers and floats to utilize server format where possible.

• Charset transliteration is now performed on the client in most cases.

• Optimized varchar padding compression for query parameters on client to eliminate need for heterogenous data processing on server.

• Enable varchar padding compression for repeated queries as is done for regular queries.

• Eliminate heterogeneous data processing on server when the processing can be done entirely on the client and the negotiated transport formats and charset match the server instance.

GCC & GCD Stacks:

The Net Server functionality has been incorporated into the Data Access Server (DAS). DAS can now handle both Ingres/Net and thin client (JDBC/.Net) connections, either on a single port or can be configured to support multiple ports including standard Net and DAS ports.

Protected_User Privilege

This feature allows a user to have the “Protected_User” privilege. This privilege is assigned to the installation owner and root user (in Linux) during installation. A user with the “Protected_User” privilege cannot be dropped, altered or have its password changed by other users, except the installation owner and self.

For more information, see Protected_user Privilege in the Security Guide.

Remote File System Support

This feature adds support for different URL schemes, such as gs, adl, adls, abfs, abfss, s3a, s3n, http, https, hdfs, maprfs, and viewfs, to COPY VWLOAD (SQL load), vwload (command line load in cluster and non-cluster mode) and INSERT INTO EXTERNAL CSV (SQL export) by enabling and deploying Apache Hadoop file system drivers for Vector.

Security Improvements

Enhanced security mechanisms for more secure access and connectivity to the database.

Smart MinMax

The SmartMinmax feature permits the database to actively select a current subset of MinMax index columns. Vector drops unnecessary index columns and adds necessary columns to the MinMax index according to their predicted selectivity, with respect to upcoming filter queries, based on the set of filter queries seen so far.

This feature reduces memory utilization, while attempting to maintain run time performance, by eliminating index columns that are rarely queried or do not have a high (predicted) selectivity.

For more information, see Smart Minmax in the SQL Language Guide.

Spark UDFs

The Spark UDFs feature adds Scala as a UDF language to Vector. Scala UDFs are executed through the Spark Provider container which can be downloaded using the iisuspark command.

For more information, see iisuspark Command--Set up and Configure Spark for Vector in the User Guide.

String Truncation Warn for X100

X100 table structures now support String Truncation Warn.

For more information, see SET Options in the SQL Language Guide.

Support Rename Column on Partitioning Keys

Renaming partitioning key columns in the partitioned tables is now supported for X100 tables. When an X100 table is created with "hash" partitions, the columns on which the hash partition is set can be renamed.

For more information, see Rules and Restrictions on Renaming Columns in the SQL Language Guide.

Table Cloning

Cloning a table is functionally equivalent to creating a full copy of an X100 table. Once the clone is created, they are two separate, independent tables. Subsequent modifications or alterations to one do not affect the other.

For more information, see Table Cloning in the SQL Language Guide.

Table Synonym

Table Synonyms can now be used to load data into the table.

Update Propagation Enhancements

Update propagation (UP) is enhanced to prioritize Insert-Only Update Propagation (IOUP), improve UP trigger thresholds and retry mechanisms when an UP fails, introduce trigger UP mechanism on shutdown, and execute IOUP as the sole write transaction.

For more information on the new configuration parameters added for these enhancements, see PDT Parameters in the System Administrator guide.

Usability Improvements

Improved error messaging for Syscall Setconf.

UUID Support in the .Net Provider

This feature adds the ability to support the IngresType “UUID” to the .NET Types.

UUID Support in the ODBC Driver

This feature adds the ability to support the “UUID” data type to the ODBC driver.

Vector Installation for Linux Enhancements

While installing Vector for Linux, you can now download and configure Google Cloud Storage Connector, download the container image, downlaod the Spark container, download the Tensorflow container, and activate configuration for containerized user-defined functions (UDFs).

For more information, see install.sh Command--Install Vector in the Getting Started guide for the new options added to the command.

Last modified date: 12/19/2024