YARN Required Settings
The following YARN settings are required.
yarn-site.xml
<property>
<name>yarn.resourcemanager.system-metrics-publisher.enabled</name>
<description>
This property indicates to the ResourceManager, as well as to clients,
whether or not the Generic History Service (GHS) is enabled.
If the GHS is enabled, the ResourceManager begins recording
historical data that the GHS can consume, and clients can
redirect to the GHS when applications finish running.
</description>
<value>false</value>
</property>
capacity-scheduler.xml
For Hadoop version 2.4 (and older), the CapacityScheduler uses the DefaultResourceCalculator out of the box. This calculator considers only memory when computing the number of available containers; the number of virtual cores is ignored. We recommend using DominantResourceCalculator instead.
<property>
<name>yarn.scheduler.capacity.resource-calculator</name>
<value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
<description>
The ResourceCalculator implementation to be used to compare resources in the
scheduler. The default (i.e., DefaultResourceCalculator) only uses memory
while the DominantResourceCalculator uses dominant-resource to compare
multi-dimensional resources such as memory, CPU etc.
</description>
</property>
Actian Service ID
The installer creates actian as a service ID (system account) to reduce the chances of conflict on one or more of the other nodes in the cluster. Service accounts have UIDs below 1000 (below 500 on some systems), and YARN refuses to run containers for such accounts unless they are explicitly allowed.
To make YARN accept the service ID, add the following line to container-executor.cfg under HADOOP_CONF_DIR:
allowed.system.users=actian
Additional YARN Settings when Using Kerberos
To work around YARN-2892, an "actian" (short-name) proxy user must be created. The following configuration snippet allows the user "actian" (which can be the Kerberos principal actian@domain) to impersonate the local user "actian" by its short name.
core-site.xml
<property>
<name>hadoop.proxyuser.actian.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.actian.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.actian.users</name>
<value>actian</value>
</property>
Also, make sure that YARN is configured to support long-running applications. By default, a delegation token has a maximum lifetime of seven days. The following properties allow the YARN ResourceManager to request new tokens when the existing ones pass their maximum lifetime, so that YARN can continue performing localization and log aggregation on behalf of the hdfs user.
yarn-site.xml
<property>
<name>yarn.resourcemanager.proxy-user-privileges.enabled</name>
<value>true</value>
</property>
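On a Kerberos-enabled cluster, both yarn-site.xml properties named in this section end up in the same file. The following is a minimal sketch of how they might sit together. It assumes the standard Hadoop configuration root element and leaves out every other property that the file already contains on a real cluster.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Sketch only: other existing yarn-site.xml properties are omitted. -->

  <!-- Required setting from this section. -->
  <property>
    <name>yarn.resourcemanager.system-metrics-publisher.enabled</name>
    <value>false</value>
  </property>

  <!-- Kerberos only: lets the ResourceManager request new delegation
       tokens once the existing ones pass their maximum lifetime. -->
  <property>
    <name>yarn.resourcemanager.proxy-user-privileges.enabled</name>
    <value>true</value>
  </property>
</configuration>
```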
core-site.xml
<property>
<name>hadoop.proxyuser.yarn.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.yarn.groups</name>
<value>*</value>
</property>
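Both groups of proxy user properties belong in core-site.xml, so on a Kerberos-enabled cluster they can be kept side by side. The following is a minimal sketch of the combined fragment. It assumes the standard Hadoop configuration root element and omits all other properties that core-site.xml normally carries.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Sketch only: other existing core-site.xml properties are omitted. -->

  <!-- Proxy user "actian" (workaround for YARN-2892): may impersonate
       only the local user "actian", from any host. -->
  <property>
    <name>hadoop.proxyuser.actian.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.actian.groups</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.actian.users</name>
    <value>actian</value>
  </property>

  <!-- Proxy user "yarn": needed so the ResourceManager can request
       new delegation tokens for long-running applications. -->
  <property>
    <name>hadoop.proxyuser.yarn.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.yarn.groups</name>
    <value>*</value>
  </property>
</configuration>
```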