RushScript Example
A simple RushScript example is shown below. The details of composition and execution are covered in another topic. As an overview, the script takes the following steps:
1. Creates a schema that defines the structure of the data to be read.
2. Creates a file reader and sets the properties on the reader.
3. Creates a filter operator for the data that was read. Defines the filter condition as a predicate expression.
4. Handles the data that passes the filter condition by writing it to a local file.
5. Handles the data that fails the filter condition by writing it to another local file.
6. Explicitly executes the composed DataFlow application, providing a name that will be useful for debugging and profiling.
Sample RushScript
// Define ratings schema
var ratingschema = dr.schema()
    .nullable(true)
    .trimmed(true)
    .INT("userID")
    .INT("movieID")
    .DOUBLE("rating")
    .INT("timestamp");
// Read the ratings
var ratings = dr.readDelimitedText({source:'data/ratings.txt', schema:ratingschema, fieldSeparator:"::", header:true});
// Filter based on the given predicate
var results = dr.filterRows(ratings, {predicate:'rating >= 2 and rating <= 4'});
// Write filtered results (rows that passed the filter condition)
dr.writeDelimitedText(results.output, {target:'results/ratings-filter-output.txt', mode:WriteMode.OVERWRITE, header:true, fieldDelimiter:"", writeSingleSink:true});
// Write rejects (rows that failed the filter condition)
dr.writeDelimitedText(results.rejects, {target:'results/ratings-filter-rejects.txt', mode:WriteMode.OVERWRITE, header:true, fieldDelimiter:"", writeSingleSink:true});
// Explicitly execute the application
dr.execute("filter-ratings");
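To make the pass/reject split in the script above more concrete, the following is a minimal sketch in plain JavaScript. It does not use the RushScript dr API and is not how DataFlow implements the filter. It only mimics, under stated assumptions, what reading "::"-separated records and applying the predicate 'rating >= 2 and rating <= 4' would produce. The sample lines and the helper names parseRating, passesFilter, and splitRows are hypothetical and exist only for illustration.

```javascript
// Plain JavaScript sketch; it does NOT use the RushScript `dr` API.
// The sample lines below are made up for illustration only.
const sampleLines = [
  "userID::movieID::rating::timestamp", // header row (the script sets header:true)
  "1::1193::5::978300760",
  "1::661::3::978302109",
  "2::1357::1.5::978298709",
  "2::3068::4::978299000",
];

// Parse one "::"-separated line into a record shaped like the ratings schema.
// Fields are trimmed first, loosely mirroring trimmed(true) in the schema.
function parseRating(line) {
  const [userID, movieID, rating, timestamp] = line
    .split("::")
    .map((field) => field.trim());
  return {
    userID: parseInt(userID, 10),
    movieID: parseInt(movieID, 10),
    rating: parseFloat(rating),
    timestamp: parseInt(timestamp, 10),
  };
}

// Same condition as the predicate 'rating >= 2 and rating <= 4'.
function passesFilter(row) {
  return row.rating >= 2 && row.rating <= 4;
}

// Route each row to `output` (passed) or `rejects` (failed),
// analogous to results.output and results.rejects in the script.
function splitRows(rows, predicate) {
  const output = [];
  const rejects = [];
  for (const row of rows) {
    (predicate(row) ? output : rejects).push(row);
  }
  return { output, rejects };
}

const rows = sampleLines.slice(1).map(parseRating); // skip the header row
const results = splitRows(rows, passesFilter);

console.log("passed:", results.output.map((r) => r.movieID));
console.log("rejected:", results.rejects.map((r) => r.movieID));
```

With these made-up rows, the movies rated 3 and 4 (movieID 661 and 3068) pass the filter, and the movies rated 5 and 1.5 (movieID 1193 and 1357) are rejected. Both bounds are inclusive, so a rating of exactly 2 or exactly 4 passes.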
Last modified date: 06/14/2024