 
Vector in Secure Hadoop
Running VectorH on a Kerberos-enabled Cluster
Installing and running VectorH on a Hadoop cluster configured to use Kerberos for authentication and authorization requires generating and managing Kerberos ticket-granting tickets (TGTs).
For installation, a TGT is required on the master node for both the Hadoop administrative user (typically hdfs) and the VectorH administrative user (typically actian) so that HDFS data locations can be created as part of the installation.
For runtime, a dedicated principal and keytab are required to create and renew TGTs for the VectorH administrative user to access HDFS and other Hadoop services (for example, YARN). After the principal and keytab are defined, VectorH renews the TGTs across all nodes without the need for further user interaction.
Kerberos TGT Requirements for Installation
Before installing VectorH, the installer queries the default Kerberos credential cache for existing tickets for the default principal of both the actian and hdfs users. If the cache for either account is empty or contains only an expired ticket, the installation stops with an error.
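The same condition can be checked by hand before starting the installer. The following is a minimal sketch, not the installer's own logic: the helper name check_tgt is hypothetical, and it assumes a Kerberos client whose klist supports the -s (silent) option, which reports through its exit status whether the default credential cache is readable and not expired.

```shell
# Hypothetical helper: report whether the current user's default
# credential cache holds a readable, non-expired ticket.
check_tgt() {
    # klist -s prints nothing; exit status 0 means the cache is usable.
    if klist -s; then
        echo "TGT present and not expired"
    else
        echo "no valid TGT; run kinit"
        return 1
    fi
}
```

Run check_tgt once as the actian user and once as the hdfs user, because each account has its own default cache.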
A specific Kerberos principal and keytab file for the actian user can be passed in a response file using the parameters KRB5_PRINCIPAL and KRB5_KEYTAB_FILE. If passed, the installer uses these values to generate a TGT using the Kerberos kinit command. If the kinit command fails, an appropriate error is raised.
During upgrade, the configured ticket cache or principal and keytab file are used as required.
The installer prompts you for a dedicated principal and keytab file during HDFS setup. If KRB5_PRINCIPAL and KRB5_KEYTAB_FILE are set in a response file, setup uses those values instead.
Note: If install.sh is run with the -express flag, a response file that defines KRB5_PRINCIPAL and KRB5_KEYTAB_FILE is required. Launching an express install without a response file containing these parameters results in an error.
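As an illustration, the two parameters could be added to a response file with a small helper like the one below. This is a sketch under assumptions: the function name is hypothetical, and the NAME=value line syntax is assumed rather than taken from this guide, so check it against the response file format of your VectorH release.

```shell
# Hypothetical helper: append the Kerberos parameters to a response file.
#   $1 = response file path, $2 = principal, $3 = keytab path
add_krb5_to_response_file() {
    rsp_file=$1
    principal=$2
    keytab=$3
    # Refuse to record a keytab that cannot be read by the current user.
    if [ ! -r "$keytab" ]; then
        echo "keytab not readable: $keytab" >&2
        return 1
    fi
    # Assumed syntax: one NAME=value pair per line.
    printf 'KRB5_PRINCIPAL=%s\nKRB5_KEYTAB_FILE=%s\n' \
        "$principal" "$keytab" >> "$rsp_file"
}
```

For example, add_krb5_to_response_file install.rsp actian@EXAMPLE.COM /tmp/actian.keytab appends both lines to install.rsp, where install.rsp is a placeholder file name.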
Kerberos Dialog During Pre-installation
No credentials in default caches:
Kerberos authentication enabled for Hadoop, checking access...
No valid TGT for default principal for the 'actian' user. Run:
    kinit
to generate a TGT for this user (See man kinit for more info)
No valid TGT for default principal for the 'hdfs' user. Run:
    kinit
to generate a TGT for this user (See man kinit for more info)
ERROR: Kerberos authentication failed. The installation cannot proceeed
until this issue has been resolved
 
WARNING: hdfs_root_setup did not complete successfully.
See:
    /opt/Actian/VectorVH/ingres/files/install.log 
for more details.
With credentials in default caches:
Kerberos authentication enabled for Hadoop, checking access...
Default principal for 'actian' user:
    actian@EXAMPLE.COM
Default principal for 'hdfs' user:
    hdfs@EXAMPLE.COM
In order to use HDFS
    Actian Vector in Hadoop x.x.x
requires an HDFS storage location owned by 'actian'
Enter the HDFS location to be used by Vector in Hadoop [/Actian]: 
Kerberos Dialog during HDFS Setup
Setting up Vector in Hadoop HDFS Support...
This Hadoop cluster has been configured with Kerberos authentication
and authorization. In order to continue installation of:
    Vector in Hadoop
a Kerberos principal is required to allow access to Hadoop. e.g.
    actian@EXAMPLE.COM
A headless keytab (not password protected) for the principal is also
required so that tickets can be maintained without user interaction.
Either a single keytab file can be shared between all nodes or individual
keytabs can be created on all nodes. 
If separate keytabs are used for each node, they must be in the
same location
Do you wish to continue? (y/n) [y] 
Enter the Kerberos principal to be used to access Hadoop
[]: badprincipal@NODOMAIN
 
Enter the full path of the keytab file
[]: /tmp/actian.keytab
Kerberos principal:
    badprincipal@NODOMAIN
is not contained in the keytab file:
    /tmp/actian.keytab
Enter the Kerberos principal to be used to access Hadoop
[]: actian@EXAMPLE.COM
Enter the full path of the keytab file
[]:  /tmp/actian.keytab
Sync keytab to all nodes? (y/n) [y] y
Do you want to continue this setup procedure? (y/n) [y] 
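The "is not contained in the keytab file" error in the dialog above can be avoided by inspecting the keytab before running setup. The sketch below is not how the installer performs its check; it assumes an MIT Kerberos client, where klist -k lists keytab entries with the principal in the second column, and the helper name is hypothetical.

```shell
# Hypothetical helper: succeed only if the keytab has an entry for the principal.
#   $1 = keytab path, $2 = principal (for example actian@EXAMPLE.COM)
keytab_has_principal() {
    keytab=$1
    principal=$2
    # klist -k prints "KVNO Principal" rows; compare the second field exactly.
    klist -k "$keytab" 2>/dev/null |
        awk -v p="$principal" '$2 == p { found = 1 } END { exit !found }'
}

# Example use (kinit -k -t obtains a TGT from the keytab without a password):
#   keytab_has_principal /tmp/actian.keytab actian@EXAMPLE.COM &&
#       kinit -k -t /tmp/actian.keytab actian@EXAMPLE.COM
```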
Kerberos-related Configuration Parameters
During installation, the following parameters are set in config.dat:
ii.masternode.x100.krb5.root – Root location for VectorH Kerberos files (ticket caches, keytabs etc.). Defaults to $II_SYSTEM/ingres/files/krb5
ii.masternode.x100.krb5.principal – Principal to be used by VectorH to access all Hadoop services
ii.masternode.x100.krb5.keytab – Keytab file to be used to generate/renew TGTs for VectorH principal
ii.masternode.x100.krb5.key_life – Lifetime of tickets generated from keytab
ii.masternode.x100.krb5.key_renewal – Interval after which TGT is renewed. This must be less than ticket lifetime to maintain uninterrupted service.
ii.masternode.x100.krb5.enable – VectorH Kerberos authentication status (true/false). Must be set.
ii.masternode.x100.hdfs.authentication – Authentication being used by the Hadoop cluster
ii.masternode.x100.krb5.krb5conf – Kerberos client configuration file to use (defaults to /etc/krb5.conf)
ii.masternode.x100.krb5.server – Executable used to manage Kerberos tickets (krenew|k5start)
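The relationship between key_life and key_renewal can be checked before the values are changed. This guide does not state the unit or format of either value, so the sketch below simply assumes both are whole numbers in the same unit; the function name is hypothetical.

```shell
# Hypothetical check: the renewal interval must be positive and shorter
# than the ticket lifetime, otherwise tickets expire before renewal.
#   $1 = key_life, $2 = key_renewal (assumed: integers, same unit)
renewal_interval_ok() {
    key_life=$1
    key_renewal=$2
    [ "$key_renewal" -gt 0 ] && [ "$key_renewal" -lt "$key_life" ]
}

# Example: renewal_interval_ok 600 60 succeeds; renewal_interval_ok 60 600 fails.
# The current values could be read from config.dat first, for example with
# iigetres, if that utility is available in your installation.
```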
Kerberos keytab File Synchronization
During the HDFS setup portion of the install, you are prompted to “Sync keytab to all nodes?”. By default, the keytab file is copied to $II_SYSTEM/ingres/files/krb5/keytabs and synchronized to all nodes.
You also have the option to use the keytab in its current location and not synchronize the file to all nodes. In this case, a keytab file of the same name must be present on all nodes in the same location. To control this behavior from a response file, use the response file parameter KRB5_KEYTAB_PER_HOST. Setting the parameter to true causes the specified keytab file to be left in place and not synchronized to other nodes. If not set, the default setting of false is used.
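The default rule described above can be expressed as a short sketch. It only illustrates the documented behavior and is not the installer's code; the helper name is hypothetical, and the response file is assumed to use NAME=value lines with a lowercase true or false.

```shell
# Hypothetical helper: report how the keytab would be handled for a
# given response file ($1), based on KRB5_KEYTAB_PER_HOST.
keytab_sync_mode() {
    rsp_file=$1
    # Take the last KRB5_KEYTAB_PER_HOST value in the file, if any.
    per_host=$(sed -n 's/^KRB5_KEYTAB_PER_HOST=//p' "$rsp_file" | tail -n 1)
    # An unset parameter falls back to the documented default of false.
    case "${per_host:-false}" in
        true) echo "per-host" ;;      # keytab left in place on each node
        *)    echo "synchronized" ;;  # keytab copied and synced to all nodes
    esac
}
```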
Reconfigure Kerberos Settings on All Nodes
The Kerberos setup can be invoked directly.
To reconfigure Kerberos settings
Run the following from the command line:
iisuhdfs krb5
Managing Kerberos TGT at Runtime
After installation, Kerberos access and ticket management are transparent to the user. When an X100 server starts, VectorH creates the TGTs and renews them for the life of the server.
The following Kerberos utilities, located in $II_SYSTEM/ingres/bin, are used for managing Kerberos and TGTs.
krenew renews the Kerberos ticket-granting ticket cache for the actian principal.
runkstart controls the Kerberos krenew daemon on all nodes through startauth.
startauth is a wrapper script that launches the krenew server.
runkstart.sh--Control the Kerberos Daemon
The runkstart.sh command starts and stops the Kerberos daemon on all nodes. It has the following format:
runkstart -dbname dbname|-alldb [-kill|-list]
dbname
Specifies the name of the database on which to start or stop the daemon
-alldb
Applies the operation to the processes for all existing databases. Valid only with -kill and -list.
-kill
Stops running processes on all nodes for a specified database
-list
Lists the running processes and existing tickets on all nodes for a specified database
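The options above can be combined as in the following sketch. The wrapper names and the database name mydb are hypothetical, and the assumption that running the command with neither -kill nor -list starts the daemon is inferred from the description rather than stated explicitly.

```shell
# Hypothetical wrapper: list daemon processes and tickets on all nodes
# for one database ($1).
krb5_daemon_list() {
    runkstart -dbname "$1" -list
}

# Hypothetical wrapper: stop the daemon for one database, then start it
# again (assumes the form without -kill or -list starts the daemon).
krb5_daemon_restart() {
    runkstart -dbname "$1" -kill && runkstart -dbname "$1"
}

# Example: krb5_daemon_list mydb
# For every database at once: runkstart -alldb -list
```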
Disable Automatic Kerberos TGT Management
If you want Kerberos TGTs to be managed manually in VectorH, set the following config.dat parameters using the iisetres command before or after installing VectorH:
ii.masternode.x100.krb5.enable: false
ii.masternode.x100.hdfs.authentication: manual (or anything other than krb5)
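Both settings can be applied in one step, as in this sketch. The function name is hypothetical, and it assumes that masternode in the parameter names stands for the master node's host name as recorded in config.dat and that iisetres takes the parameter name followed by the value.

```shell
# Hypothetical helper: switch VectorH to manually managed Kerberos TGTs.
#   $1 = master node host name as it appears in config.dat
disable_krb5_management() {
    host=$1
    # Stop on the first failure so the two settings stay consistent.
    iisetres "ii.${host}.x100.krb5.enable" false &&
        iisetres "ii.${host}.x100.hdfs.authentication" manual
}
```

With automatic management disabled, valid TGTs must be obtained and renewed by other means, for example with kinit, on every node.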