Limitations of Parallel vwload
In parallel mode, compression and writing data to disk work as in regular vwload; only the reading and parsing of input are parallelized. Parallel vwload also incurs less communication overhead because all processing is performed inside a single server process. As a result, vwload in parallel mode may be faster even when loading a single file.
In parallel mode, vwload does not output all parsing and conversion errors; it outputs only the first error encountered, if any. We therefore strongly recommend using the --log option so that you can see all rows rejected during the load and the corresponding errors. In addition, in parallel mode vwload reports only the number of loaded tuples. It does not report the number of errors (rejected rows) or the total number of processed rows.
The following options cannot be used in parallel mode:
--skip
--frequency
--verbose
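A wrapper script can catch these options before it invokes vwload. The following is a minimal Python sketch, not part of vwload: the function name `find_unsupported_options` is hypothetical, and the sketch assumes options are given in the long form listed above, either as a separate token or joined to a value with `=`.

```python
# Options that the list above says cannot be used in parallel mode.
PARALLEL_UNSUPPORTED = {"--skip", "--frequency", "--verbose"}


def find_unsupported_options(args):
    """Return the options in args that parallel mode does not accept.

    Hypothetical helper: only long option names are checked, either as
    a separate token ("--skip") or joined to a value ("--skip=1").
    """
    found = []
    for arg in args:
        # Strip an attached value, if any, to get the bare option name.
        name = arg.split("=", 1)[0]
        if name in PARALLEL_UNSUPPORTED and name not in found:
            found.append(name)
    return found
```

If the returned list is not empty, the script can either remove those options or fall back to regular mode.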
The following options behave differently in parallel mode:
--errcount n
In regular mode, the first n errors are ignored. In parallel mode, the first n errors in each input file are ignored. Consequently, with m input files, up to n*m errors may be ignored.
--log path
In regular mode, the specified path is a file, which vwload creates. The file contains the rejected rows and the corresponding errors.
In parallel mode, the specified path is a directory. The directory is created if it does not exist. If errors are encountered during the load, vwload creates two files for each input file. For example, if errors occur while loading file "input1", vwload creates:
path/input1_reject, containing the rows that were not loaded (rejected rows)
path/input1_errors, containing the errors that caused those rows to be rejected
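The two parallel-mode behaviors above can be sketched in Python for use in a post-load check. Both function names are hypothetical. The file naming follows only the "input1" example; using the base name of an input file that is given with a directory is an assumption, not documented behavior.

```python
import os


def parallel_log_files(log_dir, input_file):
    """Return the (reject, errors) paths expected for one input file.

    Assumption: names are built from the input file's base name,
    following the "input1" example above.
    """
    base = os.path.basename(input_file)
    reject = os.path.join(log_dir, base + "_reject")
    errors = os.path.join(log_dir, base + "_errors")
    return reject, errors


def max_ignored_errors(errcount, input_files):
    """Upper bound on ignored errors in parallel mode: n * m."""
    return errcount * len(input_files)


if __name__ == "__main__":
    inputs = ["input1", "input2", "input3"]
    print(max_ignored_errors(10, inputs))  # 10 errors * 3 files = 30
    for name in inputs:
        reject, errors = parallel_log_files("path", name)
        # The files exist only if errors occurred for that input file.
        print(reject, os.path.exists(reject))
```

Because the files are created only when errors occur, a missing `_reject` file for an input suggests that no rows from that file were rejected.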