Examples
The following example provides a fully functional DataFlow application in Java. The application uses a random data generator as a data source. The random data is streamed to a LogRows operator that logs the input data.
Note: This example is used throughout the steps of composing a DataFlow application.
Simple DataFlow application in Java
package com.pervasive.datarush.training;

import com.pervasive.datarush.graphs.LogicalGraph;
import com.pervasive.datarush.graphs.LogicalGraphFactory;
import com.pervasive.datarush.operators.sink.LogRows;
import com.pervasive.datarush.operators.source.GenerateRandom;
import com.pervasive.datarush.types.RecordTokenType;
import com.pervasive.datarush.types.TokenTypeConstant;

/**
 * Create a simple application graph consisting of a random row generator and
 * a row logger. Execute the graph with the default engine properties.
 */
public class SimpleApp {

    public static void main(String[] args) {

        // Create an empty logical graph
        LogicalGraph graph = LogicalGraphFactory.newLogicalGraph("SimpleApp");

        // Create a random row generator, setting the row count and type wanted
        GenerateRandom generator = graph.add(new GenerateRandom(), "generator");
        generator.setRowCount(1000);
        RecordTokenType type = TokenTypeConstant.record(
                TokenTypeConstant.DOUBLE("dblField"),
                TokenTypeConstant.STRING("stringField"));
        generator.setOutputType(type);

        // Create a row logger
        LogRows logger = graph.add(new LogRows(0), "logger");

        // Connect the output of the generator to the input of the logger
        graph.connect(generator.getOutput(), logger.getInput());

        // Compile and run the graph
        graph.run();
    }
}
The following is the same example written in JavaScript. For detailed conceptual information about building a DataFlow application, see Building DataFlow Applications Using RushScript.
Simple DataFlow application in JavaScript
// Define the schema wanted from the data generator
var schema = dr.schema().DOUBLE('dblField').STRING('stringField');

// Create a dataset of randomly generated data
var data = dr.generateRandom({rowCount:1000, outputType:schema});

// Log the dataset
dr.logRows(data, {logFrequency:0});

// Execute the created application
dr.execute();
The following is the output of the application. The application was run on an 8-core system; the LogRows operator logs the type of its input data and the final row count.
INFO com.pervasive.datarush.graphs.internal.LogicalGraphInstanceImpl execute Executing phase 0 graph: {[generateRandom, logRows]}
INFO SimpleApp.logRows execute Input type is {"type":"record","representation":"DENSE_BASE_NULL","fields":[{"dblField":{"type":"double"}},{"stringField":{"type":"string"}}]}
INFO SimpleApp.logRows execute Input type is {"type":"record","representation":"DENSE_BASE_NULL","fields":[{"dblField":{"type":"double"}},{"stringField":{"type":"string"}}]}
INFO SimpleApp.logRows execute Input type is {"type":"record","representation":"DENSE_BASE_NULL","fields":[{"dblField":{"type":"double"}},{"stringField":{"type":"string"}}]}
INFO SimpleApp.logRows execute Input type is {"type":"record","representation":"DENSE_BASE_NULL","fields":[{"dblField":{"type":"double"}},{"stringField":{"type":"string"}}]}
INFO SimpleApp.logRows execute Input type is {"type":"record","representation":"DENSE_BASE_NULL","fields":[{"dblField":{"type":"double"}},{"stringField":{"type":"string"}}]}
INFO SimpleApp.logRows execute Input type is {"type":"record","representation":"DENSE_BASE_NULL","fields":[{"dblField":{"type":"double"}},{"stringField":{"type":"string"}}]}
INFO SimpleApp.logRows execute Input type is {"type":"record","representation":"DENSE_BASE_NULL","fields":[{"dblField":{"type":"double"}},{"stringField":{"type":"string"}}]}
INFO SimpleApp.logRows execute Input type is {"type":"record","representation":"DENSE_BASE_NULL","fields":[{"dblField":{"type":"double"}},{"stringField":{"type":"string"}}]}
INFO SimpleApp.logRows execute Counted 125 rows
INFO SimpleApp.logRows execute Counted 125 rows
INFO SimpleApp.logRows execute Counted 125 rows
INFO SimpleApp.logRows execute Counted 125 rows
INFO SimpleApp.logRows execute Counted 125 rows
INFO SimpleApp.logRows execute Counted 125 rows
INFO SimpleApp.logRows execute Counted 125 rows
INFO SimpleApp.logRows execute Counted 125 rows
INFO com.pervasive.datarush.script.javascript.DataflowFactory execute script execution time: 2.83 secs
Because the log frequency property of the LogRows operator is set to zero, no row data is written to the log. Eight pairs of log messages are written, which is a result of how the application is parallelized. The two operators have no data dependencies and are fully capable of parallel operation, so the DataFlow compiler replicated the operators eight times, one replica per available core; this is the default behavior. The level of parallelization applied is an engine setting and can be modified.
For information about modifying the level of parallelization, see Engine Configuration Settings.
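As an illustration, the parallelism level can be supplied programmatically through an engine configuration when the graph is run. The following is a minimal sketch, not part of the original example: it assumes that EngineConfig.engine() returns the default engine settings, that parallelism(int) overrides the replication count, and that LogicalGraph.run accepts an EngineConfig, as described in Engine Configuration Settings; confirm the exact names against that reference.
Setting the engine parallelism (sketch)
package com.pervasive.datarush.training;

import com.pervasive.datarush.graphs.EngineConfig;
import com.pervasive.datarush.graphs.LogicalGraph;
import com.pervasive.datarush.graphs.LogicalGraphFactory;

public class SimpleAppWithConfig {

    public static void main(String[] args) {

        // Build the graph exactly as in SimpleApp above
        LogicalGraph graph = LogicalGraphFactory.newLogicalGraph("SimpleAppWithConfig");
        // ... add the generator and logger operators and connect them ...

        // Assumption: EngineConfig.engine() provides the default engine settings and
        // parallelism(int) limits operator replication to the given number of streams
        EngineConfig config = EngineConfig.engine().parallelism(4);

        // Run the graph with the customized engine configuration instead of the defaults
        graph.run(config);
    }
}
With the parallelism limited to 4 in this way, the example above would be expected to write four pairs of log messages instead of eight.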