YARN Configuration Settings
The following YARN configuration settings are relevant for VectorH.
Memory related settings in YARN are as follows:
yarn.nodemanager.resource.memory-mb
Amount of physical memory per NodeManager, in MB, that can be allocated for containers.
yarn.scheduler.minimum-allocation-mb
The minimum allocation for every container request at the ResourceManager, in MB. A memory request lower than the specified value does not take effect as requested; the minimum is allocated instead.
yarn.scheduler.maximum-allocation-mb
The maximum allocation for every container request at the ResourceManager, in MB. A memory request higher than the specified value does not take effect as requested; depending on the Hadoop version, it is either capped at this value or rejected.
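As a rough illustration of how the three memory properties appear in yarn-site.xml, the fragment below is a minimal sketch. All values are assumptions chosen for the example (a node offering 32 GB to containers), not recommendations, and should be adapted to the actual hardware.

```xml
<!-- yarn-site.xml fragment: memory settings (all values are illustrative assumptions) -->
<property>
  <!-- Memory each NodeManager offers to containers: 32 GB -->
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>32768</value>
</property>
<property>
  <!-- Smallest container the ResourceManager hands out: 1 GB -->
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>1024</value>
</property>
<property>
  <!-- Largest container the ResourceManager hands out: 8 GB -->
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>8192</value>
</property>
```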
CPU related settings in YARN are as follows:
yarn.nodemanager.resource.cpu-vcores
Number of virtual CPU cores (vcores) per NodeManager that can be allocated for containers.
yarn.scheduler.minimum-allocation-vcores
The minimum allocation for every container request at the ResourceManager, in terms of virtual CPU cores. A request lower than the specified value does not take effect as requested; the minimum is allocated instead.
yarn.scheduler.maximum-allocation-vcores
The maximum allocation for every container request at the ResourceManager, in terms of virtual CPU cores. A request higher than the specified value does not take effect as requested; depending on the Hadoop version, it is either capped at this value or rejected.
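The CPU properties follow the same pattern in yarn-site.xml. The fragment below is a sketch only; the vcore counts are assumed values for a hypothetical 16-core node.

```xml
<!-- yarn-site.xml fragment: CPU settings (all values are illustrative assumptions) -->
<property>
  <!-- vcores each NodeManager offers to containers -->
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>16</value>
</property>
<property>
  <!-- Smallest vcore count per container request -->
  <name>yarn.scheduler.minimum-allocation-vcores</name>
  <value>1</value>
</property>
<property>
  <!-- Largest vcore count per container request -->
  <name>yarn.scheduler.maximum-allocation-vcores</name>
  <value>8</value>
</property>
```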
We recommend keeping yarn.scheduler.minimum-allocation-mb constant (for example, 1024 MB) and setting yarn.scheduler.maximum-allocation-mb to the same value as yarn.nodemanager.resource.memory-mb. With these settings, an application can acquire any amount of memory between the minimum and the maximum, so a YARN application that wants one large container per node, as Vector does, can get it.
If multiple applications (or YARN-enabled data frameworks) run in the same cluster, yarn.scheduler.maximum-allocation-mb can be lowered (for example, to 40% of yarn.nodemanager.resource.memory-mb). In that case, however, the amount given by (max_memory_size + bufferpool_size) minus yarn.scheduler.minimum-allocation-mb is out-of-band from YARN, which can increase resource contention in the cluster. The same applies to the YARN vcores settings and the Vector num_cores configuration option.
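To make the recommendation concrete, the sketch below assumes a node that offers 61440 MB (60 GB) to containers; that figure is an assumption for illustration only. The active values show the single-application case, where the maximum equals the per-node amount. The comment shows the shared-cluster alternative, where 40% of 61440 MB is 24576 MB.

```xml
<!-- yarn-site.xml fragment: recommended memory layout (node size is an assumed example) -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>61440</value>
</property>
<property>
  <!-- Kept constant -->
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>1024</value>
</property>
<property>
  <!-- Single application: same value as yarn.nodemanager.resource.memory-mb.
       Shared cluster alternative: 24576 (40% of 61440). -->
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>61440</value>
</property>
```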
Resource Manager Schedulers
yarn.resourcemanager.scheduler.class
The class that the ResourceManager uses as its scheduler. Possible values include:
1. org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
2. org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler
yarn.scheduler.fair.assignmultiple
Whether to allow multiple container assignments in one heartbeat. The default is false; we recommend setting it to true.
yarn.scheduler.fair.user-as-default-queue
Whether to use the username associated with the allocation as the default queue name, in the event that a queue name is not specified. If this is set to false or unset, all jobs have a shared default queue named "default". The default is "true". If a queue placement policy is given in the allocations file, this property is ignored.
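The scheduler settings above could be combined in yarn-site.xml roughly as follows. This sketch assumes the Fair Scheduler is the chosen scheduler; the two yarn.scheduler.fair.* properties have no effect under the Capacity Scheduler.

```xml
<!-- yarn-site.xml fragment: scheduler selection (assumes the Fair Scheduler is used) -->
<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
<property>
  <!-- Allow several containers to be assigned in one heartbeat -->
  <name>yarn.scheduler.fair.assignmultiple</name>
  <value>true</value>
</property>
<property>
  <!-- Use the submitting user's name as the queue when none is specified -->
  <name>yarn.scheduler.fair.user-as-default-queue</name>
  <value>true</value>
</property>
```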