Streaming Mode
Streaming mode allows faster execution of workflows. Data is streamed from one node to the next without being staged to disk, and workflows can run in parallel on a single machine or on a Hadoop cluster.
This mode is supported only when the DataFlow extension for KNIME is installed and enabled for the workflow. When streaming mode is enabled for a workflow, the workflow executes on the DataFlow platform.
For more information about enabling streaming mode, see Executing in Streaming Mode.
In streaming mode:
Workflows can be executed on a client machine. When workflows are executed locally on the client machine, the DataFlow execution platform uses multiple cores on the client system to reduce the run time.
Workflows can be executed on a Hadoop cluster. This supports running workflows on large amounts of data. Running on a Hadoop cluster allows the DataFlow execution platform to use all the resources of the cluster.
Streaming workflows can also be executed on KNIME Server. This reduces the run time by using all the cores available on the server.
Data is streamed between nodes without staging to disk. However, data may be staged at certain points of the workflow.
Workflows cannot be partially executed. However, nodes that support view modes may still execute partially. After executing the workflow completely, the views are populated with data and operate normally.
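The idea of streaming data between nodes without staging can be illustrated with a small conceptual sketch. It is written in plain Python and does not use the DataFlow or KNIME APIs. The node functions (reader_node, filter_node, derive_node), the row layout, and the trace list are all hypothetical names chosen for illustration. Each node pulls one row at a time from the node before it, so a row can reach the last node before the next row has been read.

```python
from typing import Dict, Iterable, Iterator, List

Row = Dict[str, int]

# Records the order in which the hypothetical nodes touch each row.
trace: List[str] = []


def reader_node(n: int) -> Iterator[Row]:
    """Hypothetical source node: emits n rows one at a time."""
    for i in range(n):
        trace.append(f"read {i}")
        yield {"id": i, "value": i * 10}


def filter_node(rows: Iterable[Row]) -> Iterator[Row]:
    """Hypothetical filter node: passes on rows with an even id."""
    for row in rows:
        trace.append(f"filter {row['id']}")
        if row["id"] % 2 == 0:
            yield row


def derive_node(rows: Iterable[Row]) -> Iterator[Row]:
    """Hypothetical derive node: adds a field computed from 'value'."""
    for row in rows:
        trace.append(f"derive {row['id']}")
        yield {**row, "doubled": row["value"] * 2}


def run_streaming(n: int) -> List[Row]:
    """Chain the nodes so rows flow through without intermediate tables."""
    trace.clear()
    return list(derive_node(filter_node(reader_node(n))))


if __name__ == "__main__":
    for row in run_streaming(4):
        print(row)
    print(trace)
```

In this sketch the trace shows row 0 passing through all three nodes before row 1 is read, which is the behavior that distinguishes streaming from materializing each node's full output first. It also suggests why a streamed workflow runs as a whole: a downstream node has nothing to consume unless the upstream nodes are running at the same time.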
Executing in Streaming Mode
Streaming mode execution is supported only by the DataFlow Executor in the Actian DataFlow extension for KNIME. There are two ways to enable this executor for a workflow:
Create the workflow as an Actian DataFlow workflow
Enable the DataFlow Executor on an existing workflow