High Availability Support
Using PSQL in High Availability Environments
This chapter includes the following sections:
Overview of Technologies
PSQL is compatible with numerous solutions that maximize uptime in physical and virtual environments. Such solutions continually evolve but can be classified generally as high availability, fault tolerance, and disaster recovery.
High Availability
The definition of high availability can differ depending on the software vendor that provides high availability solutions. In general, it refers to a systems design approach for a predictable baseline level of uptime, despite hardware failure, software failure, or required maintenance.
A common approach to ensure high availability in a physical environment is failover clustering. A common approach in a virtual machine (VM) environment is migration.
Failover Clustering
PSQL is designed to function as a resource in a failover cluster environment in which only one server node at a time accesses the shared storage subsystem. If the primary node fails, a failover (or switch) to a secondary node occurs. Failover clustering allows a system to remain available while you perform software upgrades or hardware maintenance.
PSQL is compatible with Microsoft Failover Cluster Services and with Linux Heartbeat. Refer to the documentation from those vendors for the specific manner in which they define and implement failover clustering. PSQL Server and PSQL Vx Server are the recommended editions for failover clustering.
Migration
In general terms, migration allows a running VM or application to be moved between different physical machines without disconnecting the client or application. The memory, storage, and network connectivity of the VM are typically migrated to the destination.
PSQL is compatible with the migration capability offered by Microsoft Hyper-V, VMware vSphere, and Citrix XenServer. As long as host names remain the same after the VMs are moved, PSQL continues to operate normally. Refer to the documentation from those vendors for the specific manner in which they define and implement migration.
Fault Tolerance
While high availability aims for a predictable baseline level of uptime, fault tolerance is the uninterrupted operation of a system even after the failure of a component. Fault tolerance requires synchronized shared storage. In virtualized environments, the VM that fails must be on a different physical host from the VM that replaces it.
Fault tolerance can be achieved using just physical machines. However, virtual environments lend themselves so readily to maintaining virtual servers in lockstep with each other that exclusively physical environments are increasingly less common. PSQL Server is compatible with fault tolerance capabilities in an exclusively physical environment.
For virtual environments, PSQL is compatible with the fault tolerance capability offered by VMware vSphere and Citrix XenServer. Refer to the documentation from those vendors for the specific manner in which they define and implement fault tolerance.
Disaster Recovery
Disaster recovery involves duplicating computer operations after a catastrophe occurs and typically includes routine off-site data backup as well as a procedure for activating vital information systems in a new location.
PSQL is compatible with major hypervisors whose disaster recovery technology initializes backup physical or virtual machines. As long as all host names remain the same after the VMs are moved, PSQL continues to operate normally. This allows rapid server replacement and recovery.
Refer to the documentation from the hypervisor vendors for the specific manner in which they define and implement disaster recovery.
Hardware Requirements
For all of the technologies mentioned in this section, we recommend that you select servers, disk subsystems, and network components from the hardware compatibility list provided by the vendor. We follow this same practice when testing for compatibility with vendor products.
Failover Clustering
Failover clustering provides for multiple physical servers (nodes) to access a common, shared storage subsystem that contains one or more file shares or volumes. The failover services ensure that only one server controls the file shares or volumes at a time. Control of the shared storage subsystem is passed automatically from a failed server to the next active server in the cluster.
The PSQL engine must be licensed separately on each cluster node where you install it, whether the node is a physical or a virtual machine. See also License Models in PSQL User's Guide. The failover clustering discussed here refers to Microsoft Failover Clustering or Cluster Service and to Linux Heartbeat. It is distinct from other high availability solutions that may incorporate VM-based clusters.
This section contains the following topics:
Microsoft Failover Clustering for Windows Server
This topic discusses adding the PSQL services to Failover Clustering and assumes the following:
•You know how to install and configure Failover Clustering and need only the information required to add and manage the PSQL services.
•You are familiar with using PSQL and its primary utilities such as PSQL Control Center (PCC).
•You can set up DSNs using ODBC Administrator.
Preliminary Requirements
You must ensure that failover clustering functions correctly before you add the PSQL services. For example, verify that a failover completes and that all resources are available, then fail back to the original node and verify again. See the Microsoft documentation for how to set up Failover Clustering, verify that it is working correctly, and perform tasks through its interface.
Recommended Installation Process
The following table explains the recommended process to add PSQL to Cluster Services on Windows Server.
Table 27 Adding PSQL to Cluster Services on Windows Server
Action | Discussion |
Install PSQL on the cluster nodes. | Install PSQL Server on each cluster node and choose identical options for each installation. Do not install PSQL on the cluster shared storage where the PSQL data resides. After installation, the PSQL transactional service and the relational service are both set to start automatically when the operating system starts. Change the startup type of both services to manual. |
Add a role and select the PSQL Transactional service as a generic service. Add a file share for your data. Add a resource and select the PSQL Relational service as a generic service. | The transactional service of PSQL is always required as a cluster resource. The relational service is optional and is needed only if your application uses the Relational Engine. Note: Because of the dependencies, bring the PSQL resources online in the following order: first the PSQL Transactional Engine, then the PSQL Relational Engine (and stop them in the reverse order). Registry replication is not supported for the PSQL services. The PSQL database engine must be manually configured on all nodes, as explained in Configure database engine properties with PCC. Set the File Server as a dependency for the PSQL Transactional service, and set the PSQL Transactional service as a dependency for the PSQL Relational service. In each instance, select the option Use network name for computer name. |
Ensure the shared storage has the necessary files and directories. | The PSQL transactional and relational services typically run under the LocalSystem account. Ensure that the LocalSystem account has permissions to read and write to the shared disk. Copy the DBNAMES.CFG file from the ProgramData directory on the active node where you installed PSQL to a directory of your choice on the shared storage. Copy the following directories from ProgramData on the same node to the shared storage. You can place them in the same directory as DBNAMES.CFG, but that is optional. (A scripted sketch of this copy step follows the table.) •defaultdb •Demodata •tempdb •Transaction Logs |
Configure database engine properties with PCC. | You configure the database engine with PSQL Control Center (PCC) to add certain configuration settings to the registry. Configure the engine on the currently active node in your cluster, then move the cluster group to each of the other nodes in turn and make the same configuration changes there. In PCC, set the following engine properties for Directories. When PCC prompts you to restart the services, select No. •For Transaction Log Directory, specify the location on the shared disk where you copied the Transaction Logs directory. •For DBNames Configuration Location, specify the location on the shared disk where you copied the DBNAMES.CFG file. In the Failover Cluster Management console, take the PSQL resources offline and back online to apply the changes. In PCC, under the Databases node for your server, set the following database engine properties: •In the DEFAULTDB properties under Directories, set Dictionary Location to the location on the shared disk where you copied the DEFAULTDB directory. Under Data Directories, replace the default entry with this same location. •In the DEMODATA properties under Directories, set Dictionary Location to the location on the shared disk where you copied the DEMODATA directory. Under Data Directories, replace the default entry with this same location. •In the TEMPDB properties under Directories, set Dictionary Location to the location on the shared disk where you copied the TEMPDB directory. Under Data Directories, replace the default entry with this same location. |
Your PSQL failover cluster is now configured.
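The copy step in the table above (Ensure the shared storage has the necessary files and directories) can also be scripted. The following Python sketch shows one way to do it; the ProgramData path and the shared-disk destination are assumptions for illustration, so substitute the locations used in your installation.

```python
# Sketch of the shared-storage copy step, using illustrative paths.
# Adjust SOURCE to the ProgramData location of your PSQL installation
# and SHARED to the directory you chose on the cluster's shared disk.
import shutil
from pathlib import Path

SOURCE = Path(r"C:\ProgramData\Actian\PSQL")   # assumed ProgramData location
SHARED = Path(r"S:\PSQLData")                  # assumed shared-disk destination

DATA_DIRECTORIES = ["defaultdb", "Demodata", "tempdb", "Transaction Logs"]

def copy_to_shared_storage() -> None:
    """Copy DBNAMES.CFG and the data directories to the shared disk."""
    SHARED.mkdir(parents=True, exist_ok=True)

    # DBNAMES.CFG goes to a directory of your choice on the shared storage.
    shutil.copy2(SOURCE / "DBNAMES.CFG", SHARED / "DBNAMES.CFG")

    # The data directories may use the same directory as DBNAMES.CFG,
    # but that is optional.
    for name in DATA_DIRECTORIES:
        shutil.copytree(SOURCE / name, SHARED / name, dirs_exist_ok=True)

if __name__ == "__main__":
    copy_to_shared_storage()
```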
Note: If you need to apply a patch to PSQL servers in a cluster environment, see the Actian Technical Support knowledge base article on this topic.
Linux Heartbeat
The Heartbeat program is one of the core components of the Linux-HA (High-Availability Linux) project. Heartbeat runs on all Linux platforms and performs death-of-node detection, communications and cluster management in one process.
This topic discusses adding the PSQL services to Linux Heartbeat and assumes the following:
•You know how to install and configure the Heartbeat program and need only the information required to add PSQL to a Cluster Service group.
•You are familiar with using PSQL and its primary utilities such as PSQL Control Center (PCC).
Preliminary Requirements
It is essential that Linux Heartbeat be functioning correctly before you add PSQL to the cluster. See the documentation from the High Availability Linux Project (www.linux-ha.org) for how to install Heartbeat, verify that it is working correctly, and perform tasks with it.
Just as you would for any application, set up the essential clustering components before you add PSQL.
Recommended Installation Process
The following table explains the recommended process to add PSQL to Linux Heartbeat.
Table 28 Adding PSQL to Linux Heartbeat
Action | Discussion |
Install PSQL on the Cluster Nodes | Install PSQL Server on each cluster node and choose identical options for each installation. Do not install PSQL on the cluster shared storage, where the PSQL database(s) resides. After installation, the database engine is set to start automatically when the operating system starts. With clustering, however, Linux Heartbeat controls starting and stopping the database engine: the controlling node in the cluster starts the engine, and the other nodes do not. After you install PSQL Server, ensure that the Group IDs for “pvsw” and “pvsw-adm” and the UID for “psql” match on all nodes. If required, change the IDs to ensure they are the same. (A sketch that reports these IDs for comparison follows the table.) |
Configure the Shared Storage | The shared storage is where the PSQL database resides. Shared storage for Heartbeat can be implemented in many different ways; the multitude of possible implementations is beyond the scope of this document. This section assumes that an NFS mount is being used. Create (or at least identify) a location on shared storage where you want the database to reside. The location is your choice. Ensure that user psql has read, write, and execute authority for the location. Create two groups and a user on the shared storage to ensure that each cluster node can access the database files. •Groups pvsw and pvsw-adm must match the pvsw and pvsw-adm Group IDs, respectively, on the cluster nodes. •User psql must match the psql UID on the cluster nodes. |
Create the Directory for the Shared Storage Mount | On each cluster node, log in as user psql then create a directory that will be mounted to the shared storage. (User psql has no password and can only be accessed through the root account with the su command.) The name of the directory is your choice. |
Configure Heartbeat Server | Configure the Heartbeat server on each of the nodes that will control the PSQL database engine. Configure the following: •Nodes. Add all nodes that you want in the cluster. •Authentication. Specify the type of authentication to use for the network communication between the nodes. •Media. Specify the method Heartbeat uses for internal communication between nodes. •Startup. Specify the setting for when the Heartbeat Server starts. Set this to on, which means that the “server starts now and when booting.” |
Assign Password for Heartbeat User | Linux Heartbeat provides a default user named hacluster for logging in to the Heartbeat Management Client. Assign a password to user hacluster on each of the nodes from which you want to run Heartbeat Management Client. |
Add a Resource Group for PSQL | Log in as root and start the Heartbeat Management Client on one of the cluster nodes. Log in as user hacluster and add a new group. For ID, specify a name for the PSQL group. Set Ordered and Collocated to true. |
Add the Resources to the Group | Add three resources to the PSQL group: •IPaddr •Filesystem •Psql (OCF resource agent) IPaddr In the Heartbeat Management Client, add a new native item. For Belong to group, select the group you added for PSQL. For Type, select IPaddr. On the resource you just added, specify the IP address of the cluster for the IP Value. Use the IP address assigned to the cluster (not the node) when Linux Heartbeat was installed and configured. Filesystem Add another new native item. For Belong to group, select the group you added for PSQL. For Type, select Filesystem and delete the parameter fstype, which is not required. Add a new parameter and select “device” for Name. For Value, specify the device name of the shared storage, a colon, and the share mount location. Add another new parameter and select “directory” for Name. For Value, specify the directory to use with the NFS mount. Psql (OCF resource agent) Add another new native item. For Belong to group, select the group you added for PSQL. For Type, click “psql” with a Description of “PSQL OCF Resource Agent.” No additional settings are required. |
Create the Subdirectories on the Mounted Shared Storage | Now that you have added the Filesystem resource, the mount exists between the cluster server and the shared storage. On one of the cluster nodes, log in as user psql. Under the shared storage mount, create a directory named “log” and another named “etc.” For example, if the mount directory is /usr/local/psql/shared, you would add directories /usr/local/psql/shared/log and /usr/local/psql/shared/etc. |
Configure the Cluster Server in PCC | On each of the cluster nodes, you need to configure the cluster server with PSQL Control Center (PCC). Place all cluster nodes into standby mode except for the one from which you will run PCC. As user psql, start PCC on the one active node or from a client that can access the active node. In PSQL Explorer, add a new server and specify the name (or IP address) of the cluster. Access the properties for the server you just added. If prompted to log in, log in as user admin and leave the password blank. Access the Directories properties. For Transaction Log Directory, specify the directory that you created for the “log” location. For DBNames Configuration Location, specify the directory that you created for the “etc” location. See Create the Subdirectories on the Mounted Shared Storage. Repeat these steps from each of the other cluster nodes: use PCC to add the new server and set its properties, placing all nodes into standby mode except for the one from which you run PCC. |
Create the Database on the Shared Storage | From the operating system on one of the cluster nodes, log on as user psql and create the directory under the file system share where you want the database to reside. (If you create the directory as user root, ensure that user psql has read, write, and execute authority on the directory.) Place all cluster nodes into standby mode except for the one from which you will run PCC. As user psql, start PCC on the one active node or from a client that can access the active node. Create a new database for the server you added in Configure the Cluster Server in PCC. For Location, specify the directory you created where you want the database to reside. Specify the other database options as desired. For the new database, create tables as desired. |
Verify Access to the Database from each Node | Each cluster node must be able to access the PSQL database on the shared storage. Place the cluster node from which you created the database into standby mode. This is the node running the “psql” resource (the database engine). Fail over to the next node in the cluster and verify that it receives control of the “psql” resource. Repeat the standby, failover, and verification process for each node in the cluster until you return to the node from which you began. |
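Because the Group IDs and UID must match on every node (see the first row of the table above), it can help to print them on each node and compare the output. The following Python sketch does that using only the standard library; the group and user names come from the table.

```python
# Report the PSQL-related group and user IDs on this node so they can
# be compared across cluster nodes. Run the script on each node; the
# GIDs and UID printed must match everywhere.
import grp
import pwd
import socket

GROUPS = ("pvsw", "pvsw-adm")
USERS = ("psql",)

def report_ids() -> None:
    host = socket.gethostname()
    for name in GROUPS:
        try:
            print(f"{host}: group {name} GID={grp.getgrnam(name).gr_gid}")
        except KeyError:
            print(f"{host}: group {name} not found")
    for name in USERS:
        try:
            print(f"{host}: user {name} UID={pwd.getpwnam(name).pw_uid}")
        except KeyError:
            print(f"{host}: user {name} not found")

if __name__ == "__main__":
    report_ids()
```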
Managing PSQL in a Cluster Environment
After you install PSQL in a failover cluster environment, you can manage it as a resource. The following items discuss common management topics:
PSQL Licensing and Node Maintenance
The normal PSQL licensing and machine maintenance procedures also apply to the nodes in a failover cluster environment. Deauthorize the PSQL key before you modify the configuration of the physical or virtual machine where the database engine is installed. Reauthorize the key after changes are complete.
PSQL Failure Behavior
If a cluster node fails, a PSQL client does not automatically reconnect to the PSQL engine on the surviving node. Your application must reconnect the client to the PSQL database or you must restart the application. This applies even if Enable Auto Reconnect is turned on for the database engine.
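The reconnection therefore belongs in the application. The sketch below shows a minimal retry wrapper, assuming a pyodbc client and a DSN named PSQLCLUSTER that points at the cluster's network name; both the library choice and the DSN name are assumptions for illustration.

```python
# Minimal reconnect sketch for a PSQL client after a cluster failover.
# Assumes pyodbc and a DSN named "PSQLCLUSTER" that points at the
# cluster network name; adjust both to your environment.
import time
import pyodbc

DSN = "PSQLCLUSTER"          # hypothetical DSN for the cluster
RETRY_INTERVAL_SECONDS = 5
MAX_ATTEMPTS = 12

def connect_with_retry() -> pyodbc.Connection:
    """Keep trying to connect until the surviving node accepts the session."""
    last_error = None
    for attempt in range(1, MAX_ATTEMPTS + 1):
        try:
            return pyodbc.connect(f"DSN={DSN}")
        except pyodbc.Error as err:
            last_error = err
            print(f"Connect attempt {attempt} failed; retrying in {RETRY_INTERVAL_SECONDS}s")
            time.sleep(RETRY_INTERVAL_SECONDS)
    raise ConnectionError(f"Could not connect after {MAX_ATTEMPTS} attempts") from last_error

if __name__ == "__main__":
    conn = connect_with_retry()
    cursor = conn.cursor()
    cursor.execute("SELECT COUNT(*) FROM Class")   # Class is a Demodata sample table
    print(cursor.fetchone())
    conn.close()
```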
If transaction durability is turned off and a failure occurs before a transaction completes, the transaction is automatically rolled back, returning the data to its state before the transaction began (that is, to the last completed checkpoint). The rollback occurs when the active server requests access to the data file.
If transaction durability is turned on, completed changes that occurred between the last checkpoint and the cluster node failure can be recovered. Transaction durability must be configured the same way on all nodes, and the transaction log must be located on the shared storage. Transactions that had not completed at the time of the cluster failure, however, are lost even if transaction durability was in effect.
Stopping or Restarting the PSQL Transactional Service
A cluster failover occurs from the active node if you manually stop the PSQL transactional service through the operating system. If you are performing node maintenance and want to avoid such a failover, stop the PSQL transactional service through the cluster utilities instead.
PSQL Configuration Changes
Some configuration changes require that you restart the database engine. See Configuration Reference.
►To stop and restart PSQL services to apply configuration changes
Use the following steps in the Windows Cluster Administrator. You must do them in the order listed.
1 Right-click Actian PSQL Relational Engine and select Bring this resource offline.
2 Right-click Actian PSQL Transactional Engine and select Bring this resource offline.
3 Right-click Actian PSQL Transactional Engine and select Bring this resource online.
4 Right-click Actian PSQL Relational Engine and select Bring this resource online.
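If you prefer to script this sequence rather than use the console, the same order can be driven with the FailoverClusters PowerShell cmdlets. The sketch below calls them from Python; it assumes the resource names shown in the steps above and that the FailoverClusters module is installed on the node where it runs.

```python
# Scripted version of the four steps above. Assumes the cluster
# resources are named as shown and that the FailoverClusters
# PowerShell module is available on the node where this runs.
import subprocess

TRANSACTIONAL = "Actian PSQL Transactional Engine"
RELATIONAL = "Actian PSQL Relational Engine"

def run_ps(command: str) -> None:
    """Run a single PowerShell command and fail loudly if it errors."""
    subprocess.run(["powershell", "-NoProfile", "-Command", command], check=True)

def restart_psql_resources() -> None:
    # Take the resources offline: relational first, then transactional.
    run_ps(f"Stop-ClusterResource -Name '{RELATIONAL}'")
    run_ps(f"Stop-ClusterResource -Name '{TRANSACTIONAL}'")
    # Bring them back online in the reverse order.
    run_ps(f"Start-ClusterResource -Name '{TRANSACTIONAL}'")
    run_ps(f"Start-ClusterResource -Name '{RELATIONAL}'")

if __name__ == "__main__":
    restart_psql_resources()
```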
Software Patches
At some point, you may need to patch PSQL or the failover cluster software. To help you do so, Actian Technical Support provides a knowledge base article on this topic.
Migration
Migration moves a VM running PSQL from one physical host to another. The memory, storage, and network connectivity of the VM are typically migrated to the destination. Depending on the hypervisor, migration is sometimes referred to as “live” migration or “hot” migration.
With a “live” or “hot” migration, client connections to PSQL remain intact, which allows hardware changes or resource balancing without disrupting applications. With a “cold” migration, network connectivity is interrupted because the VM must boot, and client connections to PSQL must be reestablished.
A migration environment has only one instance of PSQL running, which makes the environment somewhat vulnerable if the host machine crashes or must be quickly taken offline. Also, if the shared storage fails, the database engine cannot process reads from or writes to physical storage. Some hypervisors offer a migration solution that does not use shared storage.
As long as host names remain the same after the VM migrates, PSQL continues to operate normally. The product key remains in the active state.
No special steps are required to install or configure PSQL in a migration environment. Refer to the hypervisor documentation.
Fault Tolerance
A fault tolerant environment is similar to a migration environment but includes additional features to ensure uninterrupted operation even after the failure of a component. A fault tolerant environment ensures network connections, continuous service, and data access through synchronized shared storage. If a component switch occurs, client machines and applications continue to function normally with no database engine interruption.
No special steps are required to install or configure PSQL in a fault tolerant environment. Refer to the hypervisor documentation.
Disaster Recovery
Disaster recovery includes data recovery and site recovery. Data recovery is how you protect and restore your data. Site recovery is how you protect and restore your entire site, including your data.
Data recovery is facilitated with the hypervisor shared storage and PSQL transaction logging and transaction durability. See Transaction Logging and Durability. You can use transaction logging and transaction durability with PSQL Server and Vx Server.
Site recovery can be accomplished with both physical machines and virtual machines. PSQL operates normally provided that host names remain the same in the recovered site. This is typically the case for virtual machines. If you are recovering physical machines and the host name at the recovery site is different, the PSQL product keys will change to the failed validation state when PSQL starts. PSQL will continue to operate normally in the failed validation state for several days, during which you can either repair the key or move back to the original site.
No special steps are required to install or configure PSQL in a disaster recovery environment. Refer to the hypervisor documentation.