Cluster Manager
Every DataFlow cluster has a central registry of information about the cluster, which is used to discover member nodes. A DataFlow daemon called Cluster Manager provides this service. It is contacted during graph compilation to determine which nodes are available, but it is not involved in executing the graph afterwards.
A cluster is uniquely identified by its Cluster Manager host name and port. As long as clients have these settings, usually provided by a properties file, they can access the cluster. For more information about how DataFlow determines cluster configuration, see Setting Up Clusters.
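As a rough illustration of the idea that a host name and port pair identifies a cluster, the following Python sketch reads the two settings from properties-style text and builds a small identity value. This is not DataFlow code: the keys `cluster.host` and `cluster.port`, the `ClusterId` class, the `parse_cluster_id` function, and the host and port values are all hypothetical, chosen only for the example. See Setting Up Clusters for the actual setting names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterId:
    """Identity of a cluster: the host name and port of its Cluster Manager."""
    host: str
    port: int


def parse_cluster_id(properties_text: str) -> ClusterId:
    """Read a cluster identity from 'key=value' properties-style text.

    The keys 'cluster.host' and 'cluster.port' are hypothetical placeholders,
    not documented DataFlow setting names.
    """
    settings = {}
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return ClusterId(settings["cluster.host"], int(settings["cluster.port"]))


# Hypothetical client settings; the host and port are arbitrary example values.
example = """
# client-side cluster settings (illustrative only)
cluster.host = manager.example.com
cluster.port = 47100
"""

cluster = parse_cluster_id(example)
```

Because `ClusterId` is a frozen dataclass, two values compare equal exactly when both host and port match, which mirrors the statement that these two settings together identify a cluster.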
Note:  For YARN-enabled Hadoop clusters, see Installing and Configuring DataFlow on a YARN-enabled Hadoop Cluster.