Troubleshooting and Log Files
This section lists some common issues that arise when setting up a VectorH installation. See also Configuration Checklist.
Depending on the nature of the issue, information on the problem is recorded in the VectorH log files. Additional information may also be recorded in the Hadoop and Linux log files. Unless overridden during installation or database creation, the location of the VectorH log files is the files subdirectory under the main installation directory ($II_SYSTEM/ingres/files). Of particular interest are:
errlog.log
The installation-wide general error and information log file
vectorwise.log
Log file for the execution engine (x100) process
x100errlog.log
Additional information related to execution engine communication
Note: This file was previously called mpierrlog.log.
Additional log files in this directory contain information about checkpoints (ckpdb.log), configuration changes (config.log), the archiver (iiacp.log) and recovery (iircp.log) processes, and a copy of the installation process output (install.log).
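For example, to watch the general error log while reproducing a problem (assuming the default log location and that II_SYSTEM is set in your environment), you could run:
tail -f $II_SYSTEM/ingres/files/errlog.log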
If using YARN, the following are also of interest:
vector-agent.log
Log file for the DbAgent utility, which starts the Application Master
vector-wset-appmaster.log
Log file for the Application Master, which manages the resource requirements for x100_server
Transparent Huge Pages and Defragmentation
Like that of other Java applications, Hadoop performance can suffer if the Linux Transparent Huge Pages feature is enabled (especially during the defragmentation process). Hadoop vendors generally recommend disabling this feature, and installers will report warnings if it is enabled.
We strongly advise following the Hadoop vendor's recommendations to disable both Transparent Huge Pages and Transparent Huge Pages Defragmentation. It is not sufficient to disable Transparent Huge Pages alone; both features must be disabled.
To check the current settings, issue commands similar to these on all nodes (the current setting is shown in brackets []):
cat /sys/kernel/mm/redhat_transparent_hugepage/defrag
always madvise [never]
cat /sys/kernel/mm/redhat_transparent_hugepage/enabled
always madvise [never]
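As an illustration only (the exact procedure varies by Linux distribution and Hadoop vendor, and this change does not persist across reboots), both settings can typically be disabled at runtime by writing never to the same files as root:
echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag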
For instructions specific to your installation, consult your Hadoop and Linux documentation.
Hortonworks and Hadoop Client Installation
If you are using Hortonworks, be aware that some versions of Ambari do not install all of the required JAR files on every DataNode. Correct operation requires the Hadoop Client and Hadoop DataNode JARs to be installed.
The following command should return details similar to these:
rpm -aq | grep hadoop | grep "client\|datanode"
hadoop_2_2_6_0_2800-client-2.6.0.2.2.6.0-2800.el6.x86_64
hadoop_2_2_6_0_2800-hdfs-datanode-2.6.0.2.2.6.0-2800.el6.x86_64
If these JARs are not installed by default, install them manually.
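For example, on a yum-based Hortonworks node the packages shown above might be installed with a command along these lines (a sketch only; the exact package names depend on your HDP version):
yum install hadoop_2_2_6_0_2800-client hadoop_2_2_6_0_2800-hdfs-datanode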
Failure to Start or Slow Response
An improper shutdown that leaves an orphaned process on one or more of the DataNodes can affect the restart and subsequent operation of the installation. Check that there are no orphaned copies of Actian software (x100_server, pmi_proxy) running on any of the slave nodes.
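For example, a quick check on each node could look like this:
ps -ef | grep -E 'x100_server|pmi_proxy' | grep -v grep
Any processes listed after the installation has been stopped should be investigated and terminated before restarting.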
Data Optimization
Make sure statistics are available for tables used in queries, especially for the columns used in restrictions or joins between tables. The timestamp of the last generated statistics is available in the iistats database view (select * from iistats).
When the distribution of data values in the tables and columns that the statistics describe has changed significantly since the statistics were last gathered, you may need to regenerate them. Such a change can occur after a large data load whose values are distributed differently from the data already in the table. This happens easily with timestamped data, for example, because timestamps on new data are by definition distributed differently from those on older data.
Histograms are automatically generated only for those columns that do not already have histograms stored in the catalog.
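Statistics are typically regenerated with the optimizedb utility. For example (a sketch only; pocdb, lineitem, and l_shipdate are illustrative names, and the available flags are described in Generating Statistics):
optimizedb pocdb -rlineitem -al_shipdate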
For more information, see Generating Statistics.
Database/Instance Will Not Stop Due to Connections
If, when stopping the instance or database, you receive messages about open connections preventing the shutdown and you want to force a shutdown, use the ingstop command with the -f and -k flags. Try -f first; if the instance still does not stop, use -f and -k together.
Forcing a shutdown in this way requires recovery, so the next startup will take longer.
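For example, try the first command and use the second only if the instance still does not stop:
ingstop -f
ingstop -f -k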
Maintaining Error Log Size (Rotating the Log)
The X100 log file, vectorwise.log, may consume significant disk space. Periodically use VWLOG_ROTATE (or the vwadmin utility) to manage the size of the error log, as described in the SQL Language Guide.
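For example, the log can be rotated from a SQL session with a call similar to the following (a sketch; see the SQL Language Guide for the exact syntax):
CALL VECTORWISE(VWLOG_ROTATE)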
Enabling Disk Use for Query Operations
By default, VectorH rejects queries that exhaust all available work memory. If you cannot increase the memory available to such queries, you can enable disk as temporary query space, at the cost of slower execution for the queries that need it. See the “spilling to disk” options described in the User Guide.
Ensuring Correct Database Configuration
The database configuration is synchronized between the nodes as part of startup. Ensure, particularly if upgrading from an earlier version, that any old configuration files have been removed. The vwinfo utility can be used with the -c flag to display the configuration of the nodes in the cluster and compare it to expected values (for example, vwinfo -c pocdb).
Optimizer Ran Out of Memory
For very complex SQL, the resources available to the optimizer can be insufficient for optimizing the query, especially if many such queries are submitted at once (which is common in a testing or benchmark environment).
If you encounter the error “E_OP0002 optimizer ran out of memory before generating execution plan,” use the Configuration By Forms (CBF) utility to increase opf_memory.
Note: opf_memory is by default a derived value. When setting it explicitly, ensure that it is protected so that it is not overridden on restart. For details on the CBF utility, see the System Administrator Guide.
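As an illustration, the current value can also be inspected and changed from the command line with the iigetres and iisetres utilities (a sketch only; <hostname> is the node name as recorded in config.dat, the value shown is illustrative, and the setting should still be protected as noted above):
iigetres "ii.<hostname>.dbms.*.opf_memory"
iisetres "ii.<hostname>.dbms.*.opf_memory" 400000000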
The formulas for how derived values are calculated are contained in the Configuration Rules files (for example, dbms.crs contains details of how opf_memory is calculated).
Other configuration parameters can also affect optimizer memory and related out-of-memory errors, but they are beyond the scope of this document. If you need assistance, contact Actian Support.