Installing and Configuring DataFlow
System Requirements and Licensing
System Requirements
Operating Systems and Hardware
Hardware Performance
Java Version
Hadoop Versions and Distributions
Hive Version
Permissions and Privileges
Additional Tools (Optional)
Applying Licenses
Downloading DataFlow
Installing DataFlow as a Plugin
Installing and Configuring DataFlow on a YARN-enabled Hadoop Cluster
Before You Begin
Clusters on Rackspace or AWS
Clusters on AWS or Rackspace with Vector on Hadoop
Installing DataFlow for Hadoop
Configuring Hadoop for DataFlow
Managing DataFlow Cluster Manager
Starting the DataFlow Cluster Manager
Stopping the DataFlow Cluster Manager
Configuring DataFlow Settings within Cluster Manager
Cluster Administration
Container Settings Overview
Configuring DataFlow Client for Hadoop
Configuring the Hadoop Version
Configuring HDFS Access
Configuring Kerberos for Hadoop
Configuring Kerberos for Hadoop on Windows
Creating Cluster-enabled Execution Profiles
Accessing HDFS from High Availability NameNode
Configuring Hadoop to Read or Write ORC File Format
Integrating DataFlow with Various Hadoop Distributions
Integrating DataFlow with Cloudera
Integrating DataFlow with Hortonworks HDP Platform
Integrating DataFlow with Apache Hive
Hadoop Cluster Installation
Installing DataFlow on KNIME
Downloading and Installing KNIME
Installing DataFlow Extensions in KNIME
Verifying the Installation
Set Preferences for Your Environment
Configuring DataFlow in KNIME
Protecting Passwords
Managing Execution Profiles
Configuring JDBC Database Drivers
Enabling Execution of KNIME Nodes in a Cluster
Enabling Access to Avalanche Database Instances
Enabling Access to Vector Database Instances
Setting Preferences in the Environment
DataFlow Preferences in KNIME
Actian Profiles
Profiles
Details
Managing Actian Profiles
Actian Vector Connections
Actian PSQL Connections
Managing Actian PSQL Database Connections
Cluster Plugin Preferences
HBase Clusters
HBase Cluster Properties
Managing JDBC Connections
Module Preferences
Remote File Systems
KNIME Preferences for DataFlow
Database Drivers
Master Key
Installing and Configuring the DataFlow Plug-in for Eclipse IDE
Overview of Eclipse and Eclipse Plug-in
Supported Versions
Installing the DataFlow Plug-in
Installing DataFlow for Use with Java
Before You Begin
Installing the DataFlow Distribution
Installing from a .zip File
Installing from a Linux Repository
Adding the License Key
Configuring DataFlow
Setting DataFlow Environment Variables
Integrating DataFlow with Hadoop
Configuring Java for DataFlow
JVM Location
JVM Options
Configuring Java Logging
Verifying the DataFlow Installation
Configuring Third-party Modules
Installing DataFlow on KNIME SDK
Overview of the KNIME SDK Installation Process
Downloading the KNIME SDK
Installing the KNIME SDK
Installing the DataFlow Extensions
Starting Eclipse
Creating a DataFlow Update Site and Installing Extensions
Restarting Eclipse
Upgrading DataFlow and Interfaces
Upgrading DataFlow
Upgrading KNIME
Uninstalling DataFlow and Interfaces