Applications in DataFlow

As noted in Introduction to DataFlow, applications written in DataFlow are expressed as dataflow graphs. Constructing an application as a graph is called composing and results in a logical graph. The nodes of this graph are logical operators and the edges are connections between logical ports of the nodes.
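
The idea of composing can be sketched with a small toy model. This is not the DataFlow API: the names LogicalPort, LogicalGraph, add, and connect are hypothetical and chosen only for illustration. The sketch shows operators as nodes and port-to-port connections as edges.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LogicalPort:
    """A named input or output of a logical operator (hypothetical model)."""
    operator: str
    name: str


@dataclass
class LogicalGraph:
    """Toy logical graph: nodes are operators, edges connect ports."""
    operators: set = field(default_factory=set)
    connections: list = field(default_factory=list)

    def add(self, operator: str) -> str:
        # Register an operator as a node of the graph.
        self.operators.add(operator)
        return operator

    def connect(self, source: LogicalPort, target: LogicalPort) -> None:
        # Both endpoints must belong to operators already in the graph.
        for port in (source, target):
            if port.operator not in self.operators:
                raise ValueError(f"unknown operator: {port.operator}")
        self.connections.append((source, target))


# Compose a three-step application: reader -> filter -> writer.
graph = LogicalGraph()
reader = graph.add("reader")
filt = graph.add("filter")
writer = graph.add("writer")
graph.connect(LogicalPort(reader, "output"), LogicalPort(filt, "input"))
graph.connect(LogicalPort(filt, "output"), LogicalPort(writer, "input"))
```

Nothing in this model says how or where the operators run; it only records what is connected to what.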

A logical graph describes the processing that should be done, but it does not strictly determine how that processing must be accomplished. This is analogous to SQL, where a statement describes what needs to be computed but the database is free to arrive at the result in any way it chooses.

When a logical graph is executed, DataFlow uses one or more dataflow graphs, called physical graphs, to produce the requested result.

Note the distinction here: logical entities describe what to do, whereas physical entities perform the actual work. This distinction appears throughout the graph model and gives the DataFlow engine flexibility in determining the best way to perform the requested task. It also insulates users from the details of parallelism, yielding an abstraction that is easier to use. Users who only construct graphs deal only with the simple logical entities: logical graphs, logical operators, and logical ports. More advanced usage requires working with the more complex physical entities as well.
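
The split between logical description and physical execution can be illustrated with a minimal sketch. It is a toy, not how the DataFlow engine actually plans or runs graphs: the functions plan_physical and execute, the round-robin partitioning, and the sorted merge are all assumptions made for the example. The logical description is a list of per-record steps; the "engine" decides how many physical pipelines to create.

```python
from typing import Callable, List

# A logical step says what to do to one record; it says nothing about parallelism.
LogicalStep = Callable[[int], int]


def plan_physical(steps: List[LogicalStep], partitions: int):
    """Build one physical pipeline per partition (hypothetical planner)."""
    def make_pipeline():
        def run(records: List[int]) -> List[int]:
            out = []
            for record in records:
                # Apply every logical step, in order, to each record.
                for step in steps:
                    record = step(record)
                out.append(record)
            return out
        return run
    return [make_pipeline() for _ in range(partitions)]


def execute(steps: List[LogicalStep], data: List[int], partitions: int) -> List[int]:
    """Run the logical steps using the given number of physical pipelines."""
    pipelines = plan_physical(steps, partitions)
    # Round-robin split of the input across the pipelines.
    slices = [data[i::partitions] for i in range(partitions)]
    results = [run(part) for run, part in zip(pipelines, slices)]
    # Merge and sort so the output does not depend on the partitioning.
    return sorted(value for part in results for value in part)


steps = [lambda x: x + 1, lambda x: x * 2]
data = list(range(10))
serial = execute(steps, data, partitions=1)
parallel = execute(steps, data, partitions=4)
```

Here serial and parallel hold the same values: the caller described the work once, and the number of physical pipelines was a separate decision that did not change the result.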