vwload Command--Load Data into a Table
The vwload command loads data into a Vector table.
By default, errors do not stop the loading process; however, when an error is encountered, the load is rolled back. Use command options to modify the default behavior.
Note: Using vwload to load data into external tables is not supported.
This command has the following format:
vwload --table tablename [vnode::]dbname [options] datafile ...
-t, --table tablename
Specifies the name of the table that data is to be loaded into.
[vnode::]dbname
Specifies the ID of the remote node (if connecting to a remote server) and name of the database.
datafile
Specifies the names of the files that contain the data being loaded. File names are specified as a space-separated list from one to many. An entry can be a file name, a file name with an asterisk as a wildcard, or a directory. If you specify a directory, all files in the directory are loaded. If an asterisk is used in a file name, enclose the entire file name in single quotes. Valid use of wildcards in file names is as follows: *, abc*, *abc, abc*def, *abc*.
options
Specify options, as follows:
-a, --attributes attr1,attr2...
Specifies list of attributes (columns) to load. An empty attr name indicates an input field that is to be ignored.
Default: All attributes are loaded if no ‑‑attributes argument is specified.
-B, --rollback on|off
Turns roll back on error on or off. If set to off, the load transaction will not be rolled back when errors are encountered, causing partial data to be loaded.
Default: on
-c, --cluster
Speeds loading by parallelizing some of the processing steps.
-C, --charset charset
Default: no conversion
-d, --dateformat format | attr=format,...
Sets date or timestamp format for the attribute (column). When no attribute is indicated, the value applies to all attributes that are not otherwise set.
Alternatively, a custom format string can be specified by starting the format with a '+' sign followed by the format string. The format string can consist of any of the format specifier characters used on the DATE_FORMAT function, described in the Vector SQL Language Guide.
To be able to use any character in the format string (including for example, "," and "="), you must quote the format string in the same way that you would quote a delimited identifier on the command line (for example: 'ColX="+%M,%Y,%d",ColY=US'). For more information on delimited identifiers, see Regular and Delimited Identifiers in the Vector SQL Language Guide.
Default: US
Examples:
vwload -d col1=+'%d-%b-%y' -t table1 mydb date.in
vwload -d 'Col1=US,Col2=+%M %Y %d' -t table1 mydb2 loadfile.txt
-e, --escape escapechar
Specifies the escape character to use. This allows escaping of single characters to allow a delimiter character to be part of a field (for example: \) or to allow a quote character to be part of a quoted string.
The argument must be a single ASCII character or an empty string. If the argument is an empty string, this functionality is disabled. To specify a control character, use an
escape sequence (see
vwload Escape Sequences).
Default: none
-E, --escapes
Interprets data escape sequences as generated by MySQL and PostgreSQL. This also allows setting \N as the NULL representation.
Note: The \0, \digits, and \xdigits escape sequences are not supported.
-f, --fdelim fielddelim
Specifies field delimiter to use. The delimiter must be a single character. To specify a control character, use an
escape sequence (see
vwload Escape Sequences).
Default: "|"
‑F, --frequency frequency
Sets the frequency of verbose progress reports, in number of records. Optionally append one of k K m M g G as magnitude multiplier. For example: 5M means print a report every 5 million records.
--generate_opt_file
Generates a template of an options file to stdout.
-h, --help
Displays syntax information.
-H,--header
Skips header line in files.
-i--ignfirst
Ignores the first field.
-I,--ignlast
Ignores the last field.
-l, --log path
Logs rejected rows and corresponding errors to the specified file (cluster mode off) or to multiple files in specified directory (cluster mode on). The file is created by vwload. When used with the ‑‑cluster option, specify a directory name instead of a file name. For details, see
vwload in Parallel Mode.
-L, --credential_path
Specifies the file containing credential information to load data from cloud file storage systems like Amazon Web Services (AWS), Google Cloud Storage (GCS) and Azure. This option takes an absolute path pointing to a text file in a local file system.
Listed below are examples of sample credential types to be used in the credential file for the different cloud file system providers:
AWS_ACCESS_KEY=my_access_key
AWS_SECRET_KEY=my_secret_key
AWS_SESSION_TOKEN=session_token
AWS_REGION=us-east-2
AWS_ENDPOINT=s3.us-east-2.amazonaws.com
AZURE_CLIENT_ENDPOINT=https://login.microsoftonline.com/tenant_id/oauth2/token
AZURE_CLIENT_ID=client_id
AZURE_CLIENT_SECRET=client_secret
GCS_EMAIL=name@mail.com
GCS_PRIVATE_KEY_ID=private_key_id
GCS_PRIVATE_KEY=-----BEGIN PRIVATE KEY-----\n<long_key>\n-----END PRIVATE KEY-----\n
Note: The credential file for AWS must contain a combination of the different credential types and the AWS_SESSION_TOKEN specifies the session token value required if temporary security credentials are used.
-m, --timing
Enables timing information. Timing statistics will be printed at the end. In verbose mode, they will be printed for every progress update.
-M, --money currency-symbol
Currency symbol checked while loading or parsing data into money data type columns. If a currency symbol other than the defined value is used, then data will not be loaded.
Default: "$"
-n, --nullvalue nullvalue
Defines the string that identifies NULL values.
Default: ""
By default, an empty string is treated as a null unless -n option is used. For example, suppose the following contents in data file a1.csv:
1|\n|
2|a|b
3||c
If we use vwload -t t1 -n '\n' a1.csv, then 3rd row, 2nd column will have an empty string and not null; 1st row, 2nd column \n will be loaded as null as defined in -n flag.
If we use vwload -t t1 a1.csv (without -n), then 3rd row, 2nd column will be loaded as null; 1st row, 2nd column will be loaded as '\n'.
-N, --strictnulls
Uses strict NULL value checking. This distinguishes between plain and quoted or escaped occurrences of a NULL representation. For example, NULL is a NULL value, whereas "NULL" is a 4-character string value. By default, vwload does not make this distinction. This option allows proper loading of some data generated by MySQL and PostgreSQL.
–nonchar_leading_spaces_null
When –nullvalue is used, data may contain spaces in front of the null identifiers. This option applies to non-character columns only. For example, if --nullvalue NADA is specified, –nonchar_leading_spaces_null will interpret a DATE column containing “NADA” or “ NADA” as NULL.
--notnull_empty
Keeps empty strings. Does not consider the input value NULL if it is empty (that is, contains two consecutive field delimiters). For VARCHAR, the value becomes an empty string; for CHAR the value becomes blanks. This option applies only to columns that are NOT NULL and of type CHAR, NCHAR, VARCHAR, or NVARCHAR.
-p, --profile file
Writes server side profiling data to file.
-P, --password password
Specifies the user password, not valid when using DBMS authentication..
Note: Specifying the password on the command line is insecure because the command line can be viewed by other users with programs such as ps. Vwload overwrites the password on the command line with zeros during program initialization to reduce the security risk, but a short time still exists where the password is visible to other users. It is more secure to let the program prompt for the password.
Default: Prompts on TTY when needed.
-q, --quote quotechar
Specifies the quote character(s) to use. This allows the input to contain quoted fields (for example "Doe, John"), which may contain field or record delimiter characters. To include a quote character inside a quoted field, enter it twice in the input--for example: "The ""BIG"" Boss". When using distinct open and close quote characters, enter only the close quote character twice--for example: [The [BIG]] Boss].
The argument must be one or two ASCII characters or an empty string.
If the argument is an empty string, this functionality is disabled. If the argument is two characters, the first character is used as the opening quote character, and the second as the closing quote character. For example, specify "[]" to allow [quoted string].
Default: none
-r, --rdelim recorddelim
Specifies record delimiter to use. The delimiter must be a single character. To specify a control character, use an
escape sequence (see
vwload Escape Sequences).
Default: "\n"
-R, --rowmode
Inserts data through the PDT (in-memory) updates, rather than directly appending the data in bulk to the table's data blocks on disk. This allows concurrent inserts, at the cost of decreased performance at high bulk volume, increased memory consumption. At the time of the load, the whole volume of loaded data will have to fit in memory, and only later it will be propagated to the table's blocks on disk through update propagation.
-s, --skip n
Specifies the number of records to skip.
Default: 0
-S, --substitute substitutechar
Substitutes the specified character for any invalid input character during character set conversion. When no substitute character is provided, an invalid input character is considered an error condition. Specifying a substitute character allows records that contain invalid characters to be loaded successfully.
The argument must be a single Unicode character or an empty string. If the argument is an empty string, the functionality is disabled.
To specify a control character or Unicode code point that cannot easily be typed on the keyboard, use an escape sequence.
-T, --textmode
Opens input files in text mode and does not perform newline conversion. On Linux, this may gain a 5 to 10 percent performance improvement. On Windows, there is no known advantage to using this option. (By default, vwload opens files in binary mode and performs automatic newline conversion, transparently handling text formats of various operating systems.)
Note: When using -T on Windows, any ^Z (ASCII SUB) character in the input is interpreted as End-Of-File by the Windows library, and causes vwload not to see any data following the first such character.
-u, --user username
Specifies the effective user for the session.
+user=authuser[,password]
Specifies the user name and password used for DBMS_authentication.
-v, --verbose
Prints progress report.
-x, --errcount n
Terminates after n errors.
When used with the ‑‑cluster option, terminates after the first n errors in all input files. For details, see
vwload in Parallel Mode.
Default: 0 (do not terminate)
-z, --stats
Creates statistics for the table. Builds histograms for all columns of the loaded table.