Evaluation Guide
1. Introduction
Actian Vector, Hadoop, and Actian VectorH
Using the Actian Sites
2. Evaluation Process
Evaluation Steps
Functional Test Criteria
Non-functional Test Criteria
Success Criteria
Implementation Design
Disaster Recovery
3. Stage 1--Define and Create the Test Environment
Where to Undertake the Evaluation?
Evaluation Using Actian Vector
Which Hadoop Distribution to Use?
Cluster Environment: Hardware and Software
Installation
Recommended Hadoop Settings
Pre-installation Tasks
Installing with root Access
Which DataNodes to Install VectorH On?
Installing with sudo Access
Installing Without Privileged Access
Installing Using a Response File
How to Install on a Kerberos-enabled Cluster
Automatic Management of Kerberos Details
Resetting the Installation
Verifying the Installation
Post Installation Tasks
Linux Configuration Settings
Configuring Firewall Ports
TCP/IP Routing for Edge Nodes
Configuring to Run as a Service
Configuring Database Resources
Giving the Database User a Password
Add Environment Setup to the Login Script
Using the Correct Ethernet Connection
Enabling Short Circuit Reads
Running Under YARN
Controlling Checkpoint Disk Space
Using VectorH
Stopping and Starting the Instance
Creating and Removing Databases
Stopping and Starting a Database
Users, Groups, Profiles, Permissions
4. Stage 2--Migrate the Schema, Data, and Queries
Migrating a Database Schema to Vector
Partitioning Tables
Loading Data
Generating Statistics
Routine Maintenance
5. Stage 3--Run a Single-user Test
6. Stage 4--Optimize the Database Schema
Performance Considerations
Table Partitioning
Query Statistics
Ordered Data
Indexing
Nulls
Using Foreign Key Specifications
Using Correct Data Types
Query Profiling
7. Stage 5--Run a Multi-user Concurrency Test
How Do I Run a Concurrency Test?
Tuning for Concurrency
Small Data Volumes
How Busy Is My Cluster During the Test Run?
8. Stage 6--Run the Non-functional Tests
Non-functional Success Criteria
Testing Master Node Failure
Testing Slave Node Failure
Using Active Directory Authentication
9. Tools and Troubleshooting
Tools
vwadmin Utility
Troubleshooting and Log Files
Transparent Huge Pages and Defragmentation
Hortonworks and Hadoop Client Installation
Failure to Start or Slow Response
Data Optimization
Database/Instance Will Not Stop Due to Connections
Maintaining Error Log Size (Rotating the Log)
Enabling Disk Use for Query Operations
Ensuring Correct Database Configuration
Optimizer Ran Out of Memory
10. Running the Performance Test Kit
A. References and Further Information
B. Manually Running the DBT-3 Sample Performance Tests
C. Create Table Statements
D. Create Ordered Table Statements
E. Test Queries
Query 1
Query 2
Query 3
Query 4
Query 5
Query 6
Query 7
Query 8
F. Test Scripts
RemoteExec.sh
G. Configuration Checklist