User Guide : 7. Working with Tables : Loading Data into Actian Vector : Import Data into an Actian Vector Table with the Import Wizard
 
Share this page                  
Import Data into an Actian Vector Table with the Import Wizard
Note:  This option is not available for Ingres databases.
The Import Wizard lets you load data from delimited flat files into an Actian Vector table using a guided process. The wizard prompts you to select one or more data files, map the source files to the destination, and initiate the import.
Data source files may reside:
Locally (on the machine where Director is running)
On the remote server where Actian Vector is running
In a Hadoop Distributed File System (HDFS) (for a Vector in Hadoop installation)
For this option to be available:
An Actian Vector installation must be connected (Connecting to an Ingres or Actian Vector Installation).
An Actian Vector database must be selected in the Instance Explorer.
The Import Wizard consists of the following seven pages, which guide you through the process of loading data into an Actian Vector database:
Wizard Page
Description
Welcome
Welcomes you to the Import Wizard and provides a high level overview of the available functionality
Choose a Data Source
Lets you select and configure files to import
Configure Columns
Lets you choose row and column delimiters for the input files and provides a preview
Destination
Prompts you to enter the destination for the data
Data Mappings
Guides you to configure mappings from the input files to the destination table
Complete the Wizard
Lets you configure the runtime options and confirm the tasks to perform
Execution
Imports the data and displays the success status after completion
To get started
Go to Start the Import Wizard.
Start the Import Wizard
You can access the Import Wizard the following ways:
From the Start page (Import Data into Actian Vector)
From the Database ribbon tab (Import Data)
From an Actian Vector database context menu (Tasks, Import Data…)
Depending on how you start the wizard, the wizard uses any available contextual information from the Instance Explorer.
To start the Import Wizard
1. Open the wizard in one of the ways explained previously.
2. (Optional) On the Welcome page, if shown, choose whether you want to display this page when you subsequently use the wizard.
For more information, see Environment Options (Options Dialog) and Set Connection and Startup Options.
Next step
Go to Choose a Data Source.
Choose a Data Source
On the Choose a Data Source page, you select and configure the file or files to be imported.
This page displays the instance on which the Import Wizard is being run. This field is read-only, obtained from the context from which the Import Wizard was invoked. The files you may select must reside locally in the target installation.
If you want to use the wizard to import data to a different Actian Vector instance, you first must connect to that instance and start the wizard again in one of the ways explained previously (see Start the Import Wizard).
The files you may select can reside locally on the machine where Director is running, or on the remote machine where the target installation is located. For Vector in Hadoop installations, the files may also reside in a Hadoop Distributed File System (HDFS) either on the cluster with Vector in Hadoop or another remote HDFS location. Director supports loading multiple files from any of these supported locations at the same time. The connection string for an HDFS file follows this format:
hdfs://hostname:port/full_pathname_to_file
To choose a data source
1. Click one of the Add buttons to select one or more data source files on the machine where the Actian Vector instance is installed or on which the HDFS resides.
Note:  If you specify multiple data files, they all must have the same number of columns since they will be imported to one target table. (If you need to import data to multiple target tables, you must run the wizard once for each target table.)
To remove a listed data source, select it and then click Remove.
2. If you specified more than one data source, select whether you want to load the data in parallel mode. (It is selected by default.)
3. Select the input character set. The default is the character set of the target installation.
4. For Linux only:
a. Optionally, specify a character to be substituted for any invalid input characters encountered during character set conversion.
Specifying a substitute character allows records that contain invalid characters to be loaded successfully. If you specify no substitute character, an invalid input character generates an error on import.
b. Optionally, specify if you want to open input files in text mode with no newline conversion.
Next step
Go to Configure Columns.
Configure Columns
On the Configure Columns page, you indicate the format of the delimited file by specifying row and column delimiters used in the data file.
The Import Wizard attempts to determine options automatically. If the default settings are acceptable, you may continue to the next page.
A preview is displayed, reflecting the options specified. Changing options triggers reparsing. If the preview requires a manual refresh, “Click here to attempt to parse the file(s)” is displayed; click this message to regenerate the preview. If you specified more than one input file, each preview table is displayed on its own tab.
If any problems are encountered during the parsing of the data file, a link to the exception is provided. Clicking it displays the exception message.
Verify that the settings are correct in the preview, which shows the first 100 rows of data. You cannot continue beyond this page until the file is successfully parsed.
Input files contain a header row
Specifies whether the input file(s) contain a header row.
Header row contains column names
Specifies whether the header row contains column names. This option is available only if “Input files contain column names” is selected.
Column Delimiter
Specifies the column delimiter used in the data file(s). If you select Other, you may specify your own delimiter, which must be a single character or valid escape sequence.
Row Delimiter
Specifies the row delimiter used in the data file(s). If you select Other, you may specify your own quote character, which must be a single character or valid escape sequence.
Quote Character
Specifies a quote character for the data file. If you select Other, you may specify your own delimiter, which must be a single character or valid escape sequence.
Escape Character
Specifies an escape character for the data file. If you select Other, you may specify your own escape sequence. For more information, see vwload Escape Sequences.
Null Value
Specifies a string that identifies null values. If you select Other, you may specify your own null character, which must be a single character or valid escape sequence.
Skip Records
Specifies the number of records to skip. If you select Other (by clicking down from 0), you may specify your own value.
Next step
Go to Specify the Data Destination.
Specify the Data Destination
On the Destination page, you enter the destination for the data. You may specify an existing Actian Vector database or create a new one. You also may specify an existing table or create a new one.
To specify the data destination
1. Specify the effective user for the current session. The default is the currently logged-on user.
2. Specify the database to load the data into:
Existing database
Specifies the database to load the data into. If you started the Import Wizard from the context menu of an existing database node, this database is the default. If only one database exists, it is selected by default.
New database
Creates a new database (see Create a Database). Afterward, the Database drop-down defaults to the newly created database and the “Create a new table to import data into” option is selected.
3. Specify the table to load the data into:
New table: Enter the name of a table to create, and set the options for it.
a. Select a default data type recognition algorithm:
Off – No attempt is made to recognize data types. A default data type of varchar(255) is used.
Normal – (Default) The data types chosen will be nullable, and rounding will occur.
Aggressive – The smallest possible data types will be used. The Import Wizard will attempt to determine if the columns appear to be nullable or not.
Note:  These options differ in convenience and accuracy. Normal is more likely to load data without requiring user readjustment; aggressive is more likely to get an appropriate data type.
b. Specify whether journaling should be enabled for the new table.
c. Specify whether duplicates will be allowed in the new table.
d. Select the page size of the new table.
e. Select whether you want to specify locations. If so, select the location from the drop-down (table level or column level).
f. Select whether you want to partition the table. If so, you will specify the partition locations on the Data Mappings page (see Configure the Data Mappings).
Existing table: Select a user table from the drop-down as a target for the imported data.
If no tables exist in the database, a warning message is displayed. In this case, you must create a table in the selected database.
Next step
Go to Configure the Data Mappings.
Configure the Data Mappings
On the Data Mappings page, you configure the mappings from the input data file to the destination table. The contents of this page differs depending on whether you chose to import data to a new table or an existing table; follow the appropriate procedure below.
There is a data mapping row for each column of the data file by default. The default data types and nullability information are determined using the data recognition algorithm chosen on the Destination page (see Specify the Data Destination).
The Import Wizard attempts to validate the data grid as you make changes. However, because it cannot properly validate the table until runtime, be aware that potentially you could configure a table that causes the import operation to fail.
To configure data mappings for an existing table
When you import data to an existing table, the Data Mappings page displays a preview of the target table.
1. Specify a global date format, if desired. (If not relevant for the table, this control does not appear.) Select a date format from the list.
If the target installation supports custom date formats, then you may select from the list.
If you do not specify a global date format, you must specify a format for each date column (see the next step).
2. Configure the data mappings in the data grid. The grid fields are defined as follows:
Source
Indicates the source column in the data file
Destination
Specifies the destination column in the existing table into which the data will be imported. Select a destination column for each row from the drop-down. If a column does not need to be imported, set it as “Ignored.”
Each source column must have a target destination assigned or must be ignored. Each destination column may be used once at most.
Data Type
Specifies the data type of the destination column. You may select alternate data types by clicking the secondary mouse button or entering the types manually.
Nullable
Specifies whether the destination column allows nulls
Default
Specifies the default mode for the destination column
Default Value
Specifies the default value for the destination column
Unique?
Specifies whether the destination column must have a unique value
Date Format
If one of the existing table columns has a date data type, indicates the date/time format. If there are no date columns in the target table, this column is not shown.
Valid values for format are: US, MULTINATIONAL, MULTINATIONAL4, ISO, ISO4, IS09075, SWEDEN, FINLAND, GERMAN, YMD, DMY, MDY, as described in
II_DATE_FORMAT in the Ingres SQL Reference Guide.
Default: US
Note:  For custom date formats, do not use the “+” sign in the Import Wizard. For more information about date formats, see “vwload Command—Load Data into an Actian Vector Table” in the Actian Vector User Guide.
3. Click the Preview Data button to preview the data to be imported according to the options you have selected.
The Preview Data dialog displays a read-only preview of up to 100 rows of each data file per tab, showing approximately how data will look after it has been imported.
If the preview shown is not what you want the target table to look like, return to the Data Mappings page to reconfigure.
4. Click the Preview SQL button to preview how the SQL CREATE TABLE command will look.
When you click Next, the wizard ensures that data mappings are consistent and appear to be valid.
To configure data mappings for a new table
When you import data to a new table, the Data Mappings page appears like a new table document from which you can configure the target table.
Each data mapping row is color-coded, depending on the type of mapping:
Color
Column Type
Description
Black
Data file
An existing column in the data file
Gray
Ignored
A column set to be ignored in the import
Green
New
A column not in the data file that was created in the wizard
The data mappings grid implements a context menu to help you configure the new table. You access the context menu with the secondary mouse button. The context menu contains the following options:
Set Primary Key
Sets the current data mapping row (table column) as a the primary key for the table
Insert Column
Inserts a new data mapping row (table column) that has no corresponding column in the data file
Delete Column
Deletes a data mapping row (new table column). This option is available only for new (green) columns that are not in the data file.
Set Ignored
Flags a data mapping row (table column) to be ignored for import. This option is available only if the column is from the data file.
Destination
Lists some possible names for the destination table column
Data Type
Lists some possible data types for the column. This lets you view potential data types, including data types generated using different data type recognition algorithms.
To configure data mappings for a new table
1. Specify a global date format, if desired. Select a date format from the list.
If the target installation supports custom date formats, then you may select “Custom...” from the list and define the custom date format:
a. Click the down arrow to show custom date format samples.
b. Click a sample to copy the corresponding data into the date format field.
If you do not specify a global date format, you must specify a format for each date column (see the next step).
2. If you specified that the new table be partitioned on the Data Destination page (see Specify the Data Destination), specify at least one partition location in the Table Partitioning section.
3. If you specified table- or column-level locations on the Data Destination page (see Specify the Data Destination), specify the location in the Locations section.
Type in the DATA location for the table or columns, or click Select (pencil image) to open the Select Locations dialog.
If you specified column-level partitioning, a Location column is added to the data mappings grid.
4. Configure the data mappings in the data grid. The grid fields are defined as follows:
Source
Indicates the source column in the data file
Destination
Specifies the destination column in the table where the data will be imported. The Import Wizard chooses an appropriate default, but any valid destination column name can be specified. You may select alternate data types by clicking the secondary mouse button or entering the types manually.
Data Type
Specifies the data type of the destination column. You may select alternate data types by clicking the secondary mouse button or entering the types manually.
Nullable
Specifies whether the destination column allows nulls
Default
Specifies the default mode for the destination column
Default Value
Specifies the default value for the destination column
Unique?
Specifies whether the destination column must have a unique value
Date Format
Specifies the date time format, if applicable, of the data in the file to be imported. This field is enabled only if you have not specified a global date format and if the data type for the column is a date.
Valid values for format are: US, MULTINATIONAL, MULTINATIONAL4, ISO, ISO4, IS09075, SWEDEN, FINLAND, GERMAN, YMD, DMY, MDY, as described in
II_DATE_FORMAT in the Ingres SQL Reference Guide.
Default: US
Note:  For custom date formats, do not use the “+” sign in the Import Wizard. For more information about date formats, see “vwload Command—Load Data into an Actian Vector Table” in the Actian Vector User Guide.
Location
(For columns with a specified location) Specifies the default location of the partitioned column. This field is editable. Clicking the Select Locations button opens the Partition Column(s) DialogPartition Column(s) DialogPartition Column(s) Dialog, where you can select the columns to partition on.
Partition
(For partitioned tables) Specifies whether the column is partitioned.
5. Use the context menu to configure each of the rows (table columns) in the data mappings grid.
6. Click the Preview SQL button to preview the SQL statement that will be used to import the data.
The SQL is displayed and, by default, printed with color coding. The text cannot be directly edited; to change the CREATE TABLE statement, return to the Data Mappings page to reconfigure.
The Preview SQL dialog contains the following fields and controls:
Pretty Print the SQL
Specifies whether the SQL should be pretty printed. The default is to pretty print: format the code with capitalization, indenting, and color-coding.
SQL preview pane
A read-only text box containing the SQL that will be used to create the new table
7. Click the Preview Data button to preview the data to be imported according to the options you have selected.
The Preview Data dialog displays a read-only preview of up to 100 rows of each data file per tab, showing approximately how data will look after it is imported.
If the preview shown is not what you want the target table to look like, return to the Data Mappings page to reconfigure.
When you click Next, the wizard ensures that data mappings, locations, and partitions are consistent and appear to be valid.
Partition Column(s) Dialog
The Partition Column(s) dialog lets you select the columns on which to create partitions. It displays every column that has already been created in the new table tab. This dialog contains the following fields:
Columns
Lists each column as a checkbox. Select each column you want to create a partition on.
OK
Saves your selected settings and closes the dialog.
Cancel
Closes the dialog without saving any changes.
Next step
Go to Complete the Wizard.
Complete the Wizard
On the Complete the Wizard page, you can set the runtime options for the data load. The page also confirms the data load and presents a summary with details about all the options chosen throughout the wizard.
This page contains the following fields and controls:
Rollback on errors
Specifies whether a rollback should be performed if errors are encountered during the import.
Show verbose output
Specifies whether to include verbose output.
Frequency
Specifies the frequency for the verbose output. This option is shown only if the option is supported by the target installation and if the “Show verbose output” option is selected. You may enter a frequency value and a corresponding unit.
Enable timing information
Specifies whether timing information should be shown.
Terminate after a number of failed records
Specifies whether to terminate the import process when the specified number of failed records is reached.
Caution!  We recommend setting this option because, if not set and errors are encountered, it can cause very long-running data loads.
Failed records count
Specifies the number of failed records triggering termination. This option is enabled only if the “Terminate after” option is selected.
Log failed records to a file on server
Specifies whether failed records should be logged to a file on the server.
Log file path
A default failed records log file name is generated when the option is selected. You may select or specify a different file, if desired. This field contains the path of the log file on the server. This field is enabled only if the “Log failed records” option is selected.
Browse button
Launches a dialog from which you can choose a log file on the server. This option is enabled only if the “Log failed records” option is selected. Browsing is local to the target installation, not the Director client.
Advanced options
Opens the Advanced Options dialog, where you may specify additional runtime options. These options are not commonly used and are for compatibility with other RDBMSs.
Interpret MySQL/PostgreSQL data escape sequences
Interprets data escape sequences as generated by MySQL and PostgreSQL. Setting this option also allows setting \N as the NULL representation.
Use strict null value checking
Uses strict NULL value checking. This distinguishes between plain and quoted or escaped occurrences of a NULL representation. For example, NULL is a NULL value, whereas "NULL" is a 4-character string value. By default, vwload does not make this distinction. This option allows proper loading of some data generated by MySQL and PostgreSQL.
Summary
Read-only area that provides a summary of the options chosen throughout the wizard.
To complete the wizard
1. Set the desired runtime options.
2. Check the summary to confirm your previous selections.
If you need to change any settings, click the Back button to return to the page where you set them.
3. Click Finish.
The wizard displays the Execution page (see Monitor Import Execution) and begins the data import process.
Monitor Import Execution
The Execution page of the Import Wizard shows output from the import wizard and provides a progress indicator while data is being loaded.
After import has completed, the output from the vwload command is displayed. This page consists of these main tabs:
Messages
Contains any output from the Import Wizard. After the vwload operation completes, the success status and any other available information about the import is shown.
Output
Displays the raw output of the vwload command after vwload execution completes.
If the import fails, you can view failed records and optionally go back to previous pages to change options and then attempt re-importing the data.
Execution States
The possible states of the Execution page are as follows. After execution completes, the Output tab is enabled so that you may view the raw output from the vwload command.
Executing
Indicates that the vwload command is running and the data load is currently happening. A progress indicator shows that the command is still running.
The Execution was Successful
Indicates that the data load completed successfully.
After a successful data load, you cannot go back to previous pages of the wizard; you must close the dialog. To load more data into Actian Vector, you must invoke the Import Wizard again (Start the Import Wizard).
Execution Warning
Indicates that the data load completed successfully, but vwload reported errors.
You may go back to change options and then retry the data load. If you chose to log failed records to a file (see Complete the Wizard), you may view them in the log file you specified by clicking Failed Records (see Failed Import Records).
The Execution Failed
Indicates that the data load has failed.
You may go back to change options and then retry the data load. If you chose to log failed records to a file (see Complete the Wizard), you may view them in the log file you specified by clicking Failed Records (see Failed Import Records).
How the Import Wizard Handles Newly Created Tables
If a new table was created during import and the Import Wizard enters a “Warning” or “Failed” state, you are prompted to delete or retain the newly created table. Clicking Yes deletes the table; clicking No leaves the newly created table intact.
If you click the Back button, the Import Wizard page that is displayed depends on whether you deleted the newly created table. If the table was deleted, the Complete the Wizard page is displayed (see Complete the Wizard). If the table still exists, the Destination page is displayed (see Specify the Data Destination), and its options default to a data load into this new table.
Failed Import Records
If the Actian Vector data import process does not complete successfully, you may click the Failed Records button to open the Failed Records dialog to view the records that did not import. The Failed Records dialog is available only if you specified the option to log failed records to a file on the Complete the Wizard page (see Complete the Wizard).
To save the records locally to a file, click Save Locally As. This button launches a file browser from which you can select a path and provide a name for the file.
vwload Escape Sequences
To specify control characters in the vwload command, you must use an escape sequence. An escape sequence is initiated by a \ character. Valid escape sequences are:
Escape Sequence
Description
\a
Bell (alert)
\b
Back space
\f
Form feed
\n
Newline
\r
Carriage return
\t
Tab
\v
Vertical tab
\nnn
The character with octal code value nnn
\uxxxx
The 2-byte Unicode code point with hexadecimal value xxxx
\\
\ character
Note:  (Vector only) Certain special characters, such as \, ", ‘, and |, must be protected from interpretation by the command shell by using the appropriate quoting and escaping mechanisms provided by the shell. This does not apply to Actian Director, which automatically takes care of such formatting.