Hadoop Cluster Installation
You must install DataFlow on a Hadoop cluster when working with MapR. For installation instructions, see Installing and Configuring DataFlow on a YARN-enabled Hadoop Cluster.
Potential YARN issue
An issue can occur if MapR is installed to use the /tmp directory for the YARN Node Manager user cache and log directories. A MapR clean-up process may delete directories that YARN requires, and if these directories are missing, YARN jobs fail. If this occurs, create the following directory on each worker node of the cluster:
/tmp/hadoop-mapr/nm-local-dir/usercache
Create the usercache directory with user ownership set to the mapr user.
YARN creates a per-user directory in the usercache directory when executing jobs. For example, if the user "actian" executes a job, then the directory /tmp/hadoop-mapr/nm-local-dir/usercache/actian is created.
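
The directory creation described above could be scripted along the following lines. This is a minimal sketch, not part of the DataFlow or MapR tooling: the function name ensure_usercache and its parameters are hypothetical, and it assumes the script runs on each worker node with enough privileges (typically root) to change ownership to the mapr user.

```python
import os
import shutil

# Path taken from the instructions above.
USERCACHE_DIR = "/tmp/hadoop-mapr/nm-local-dir/usercache"


def ensure_usercache(path=USERCACHE_DIR, owner=None):
    """Create `path` (and any missing parents), optionally setting its user owner.

    Hypothetical helper for illustration only.
    Returns True if the directory had to be created, False if it already existed.
    """
    existed = os.path.isdir(path)
    os.makedirs(path, exist_ok=True)
    if owner is not None:
        # Changing ownership usually requires root, and the named user
        # must exist on the node; otherwise this call raises an error.
        shutil.chown(path, user=owner)
    return not existed
```

On a worker node, the helper might be called as ensure_usercache(owner="mapr"). The call is safe to repeat, so it could be rerun whenever the clean-up process has removed the directory.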