Overriding the Operator Parallelism
Each operator uses the metadata model to determine whether it supports parallelism. In the previous example, both the GenerateRandom and LogRows operators support parallelism, and this is visible in the output from the application.
The output contains eight sets of LogRows output, each showing the input data type and the number of visible rows. Each set was written by one parallel instance of the LogRows operator.
In a few cases, you can override the fundamental parallelism of an operator to make it non-parallel. To do this, call the disableParallelism() method, which is available on every operator instance. To disable parallelism on the LogRows operator in the example above, use the following code:
logger.disableParallelism();
When parallelism is disabled on the LogRows operator, the sample application produces the following output. The operator writes only a single set of log entries, which indicates that it ran with parallelism disabled.
INFO com.pervasive.datarush.graphs.internal.LogicalGraphInstanceImpl execute Executing phase 0 graph: {[generateRandom, logRows]}
INFO SimpleApp.logRows execute Input type is {"type":"record","representation":"DENSE_BASE_NULL","fields":[{"dblField":{"type":"double"}},{"stringField":{"type":"string"}}]}
INFO SimpleApp.logRows execute Counted 1000 rows
INFO com.pervasive.datarush.script.javascript.DataflowFactory execute script execution time: 2.55 secs
Warning! Disabling the parallelism of an operator can cause data fan-in or fan-out, which affects the performance of the application. This is particularly noticeable when DataFlow runs in a distributed environment.
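The fan-in mentioned in the warning can be pictured with a small plain-Java sketch. It does not use the DataFlow API. The class FanInSketch and its methods are hypothetical names introduced only for illustration, and the even split of the 1000 rows across eight upstream partitions is an assumption, not something the engine guarantees. The sketch counts how many rows each downstream instance would receive when eight upstream partitions feed either eight parallel consumers or a single non-parallel one.

```java
import java.util.Arrays;

// Illustration only: a toy model of fan-in, not DataFlow code.
class FanInSketch {

    // Split rowCount rows as evenly as possible across the given partitions
    // (assumed distribution; a real engine may partition differently).
    static int[] partitionSizes(int rowCount, int partitions) {
        int[] sizes = new int[partitions];
        for (int i = 0; i < partitions; i++) {
            sizes[i] = rowCount / partitions + (i < rowCount % partitions ? 1 : 0);
        }
        return sizes;
    }

    // Assign upstream partitions to downstream instances round-robin and
    // return the number of rows each downstream instance ends up reading.
    static int[] rowsPerConsumer(int[] upstreamSizes, int consumers) {
        int[] counts = new int[consumers];
        for (int p = 0; p < upstreamSizes.length; p++) {
            counts[p % consumers] += upstreamSizes[p];
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] upstream = partitionSizes(1000, 8);
        // Eight consumers: each reads only its own partition.
        System.out.println("parallel logger: "
                + Arrays.toString(rowsPerConsumer(upstream, 8)));
        // One consumer: all eight partitions funnel into it (fan-in).
        System.out.println("single logger:   "
                + Arrays.toString(rowsPerConsumer(upstream, 1)));
    }
}
```

In this model the single consumer receives all 1000 rows, which matches the "Counted 1000 rows" line in the log above. In a distributed run, funnelling every partition into one instance also means moving data between nodes, which is where the performance cost arises.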