20. Migrating from Vector to VectorH
 
Migrate a Database from Vector to VectorH
The following steps show example commands for migrating your data.
To migrate data from Vector to VectorH
1. Copy the data from Vector using the copydb command:
copydb olddb -c -with_csv -no_loc
The -c and -with_csv flags create vwload-compatible data files.
The -no_loc flag omits the WITH LOCATION clause from the generated CREATE TABLE statements. This prevents the tables from being created in the II_DATABASE location; in VectorH they must be created in the II_HDFSDATA location.
Two command files are created: copy.in and copy.out.
2. Execute the copy.out script:
sql olddb <copy.out
All tables and views owned by the user are copied into files in the specified directory.
3. Edit the copy.in script as follows:
a. Add the following clause to CREATE TABLE statements:
WITH PARTITION=(HASH ON key x PARTITIONS)
where key is the partition key and x is the number of partitions. For details on the WITH PARTITION clause, see the SQL Language Guide.
b. If you will use the vwload utility instead of COPY to load the data, remove the COPY statements so that the script only creates the tables and does not load the data.
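The edits in step 3 can also be scripted. The following Python sketch appends a WITH PARTITION clause to each CREATE TABLE statement in copy.in and optionally strips the COPY statements; it is an illustration only, and assumes each statement body ends with a closing parenthesis immediately followed by a \g terminator line (adjust the patterns if your copydb output includes other WITH clauses). The partition key and count in the example are placeholders.

```python
import re

def edit_copy_in(script: str, clause: str, drop_copy: bool = True) -> str:
    """Prepare a copydb-generated copy.in script for VectorH.

    Appends `clause` (a WITH PARTITION clause) to every CREATE TABLE
    statement, assuming each statement ends with ')' followed by a
    '\\g' terminator. Optionally removes COPY statements so the script
    only creates tables (the data is then loaded with vwload).
    """
    # Insert the partition clause between the closing paren and the \g line.
    script = re.sub(
        r"(create table .*?\))\s*\\g",
        lambda m: m.group(1) + "\n" + clause + "\n\\g",
        script,
        flags=re.IGNORECASE | re.DOTALL,
    )
    if drop_copy:
        # Drop each COPY TABLE statement up to and including its \g terminator.
        script = re.sub(r"copy table .*?\\g\n?", "", script,
                        flags=re.IGNORECASE | re.DOTALL)
    return script
```

For example, `edit_copy_in(script, "WITH PARTITION=(HASH ON a 8 PARTITIONS)")` leaves only CREATE TABLE statements, each carrying the partition clause.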
4. Create the new database on the VectorH master node:
createdb newdb
5. Run the copy.in script on the VectorH master node:
sql newdb <copy.in
The tables are created.
Data is loaded unless you removed the COPY statements in copy.in, in which case proceed to the next step to load the data using vwload.
6. Run vwload on the VectorH master node, assuming the data files have been copied to the target machine.
Note:  You can use Actian Director to load data remotely from a client machine. No Vector installation is needed on the client.
If the data file is on the local file system of the master node, use regular vwload (do not use the -c flag).
If the data is in multiple files in HDFS, use parallel (cluster) vwload (the -c or --cluster flag). Also use the --stats flag, which builds histograms for all columns of the loaded table. For example:
vwload --table tablename --stats --fdelim "," --cluster dbname
hdfs://namenode:8020/path/to/data/table1.txt
hdfs://namenode:8020/path/to/data/table2.txt
hdfs://namenode:8020/path/to/data/table3.txt
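If you drive the load from a script, the vwload invocation above can be assembled programmatically. This Python sketch builds the argv list for a parallel (cluster) load; the database, table, and file names are assumptions matching the example above, not fixed values.

```python
def build_vwload_cmd(dbname, table, hdfs_files, fdelim=","):
    """Assemble the argv list for a parallel (cluster) vwload run.

    `hdfs_files` is a list of hdfs:// URLs. --stats builds histograms
    for all columns of the loaded table, which makes a separate
    optimizedb pass unnecessary afterwards.
    """
    return ["vwload", "--table", table, "--stats",
            "--fdelim", fdelim, "--cluster", dbname, *hdfs_files]
```

The returned list can be passed directly to a process launcher (for example, Python's subprocess.run) on the master node.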
7. Optimize the database:
optimizedb -zns
Note:  This step is not necessary if you used the --stats flag on vwload in the previous step.
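Taken together, the steps above amount to an ordered command plan. The sketch below only assembles the commands as strings, using the example names from this page (olddb, newdb, tablename are placeholders); the manual edit of copy.in in step 3 happens between the export and the reload, and optimizedb is added only when vwload ran without --stats.

```python
def migration_plan(olddb, newdb, hdfs_files, with_stats=True):
    """Return the ordered shell commands for a Vector-to-VectorH migration.

    Step 3 (editing copy.in) is a manual step between export and reload.
    optimizedb is only needed when vwload ran without --stats.
    """
    stats = "--stats " if with_stats else ""
    plan = [
        f"copydb {olddb} -c -with_csv -no_loc",   # step 1: export schema/scripts
        f"sql {olddb} <copy.out",                 # step 2: unload the data
        # step 3: edit copy.in by hand (add WITH PARTITION, drop COPY)
        f"createdb {newdb}",                      # step 4: create target database
        f"sql {newdb} <copy.in",                  # step 5: create the tables
        f'vwload --table tablename {stats}--fdelim "," --cluster {newdb} '
        + " ".join(hdfs_files),                   # step 6: parallel load
    ]
    if not with_stats:
        plan.append("optimizedb -zns")            # step 7: build statistics
    return plan
```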