DataFlow Operator Library
The DataFlow operator library consists of prebuilt operators that ship with DataFlow and can be used from the DataFlow API or RushScript. The library is organized into the categories listed below, and each category groups its related operators. Each operator page includes usage examples, properties, and port information.
Input Output Operations - The DataFlow operator library includes several pre-built Input/Output operators. For more information, refer to the following topics:
SQL-Like Operations - The DataFlow operator library offers a set of operators for performing SQL-like, relational operations. Using DataFlow, data can be retrieved from various sources and then joined, filtered, aggregated, or reordered as needed. This flexibility enables working with data using common relational concepts and capabilities, without requiring all of the data to reside in a single location.
The operators that implement relational functionality are composed and configured in the same way as other operators within DataFlow. Some allow for a more SQL-like syntax in their configuration, providing a smoother transition for users familiar with SQL. For more information, refer to the following topics:
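As a rough illustration of the relational concepts described above (join, filter, aggregate, reorder), the following sketch uses plain Python rather than the DataFlow API or RushScript. The record layouts, field names, and helper functions (`inner_join`, `total_by`) are hypothetical and chosen only for this example; they are not DataFlow operators.

```python
from collections import defaultdict

# Hypothetical sample records standing in for data read from two separate sources.
orders = [
    {"order_id": 1, "customer_id": 10, "amount": 25.0},
    {"order_id": 2, "customer_id": 11, "amount": 30.0},
    {"order_id": 3, "customer_id": 10, "amount": 15.0},
    {"order_id": 4, "customer_id": 12, "amount": 5.0},
]
customers = [
    {"customer_id": 10, "region": "east"},
    {"customer_id": 11, "region": "west"},
]


def inner_join(left, right, key):
    """Hash join: index the right side by key, then probe with each left row."""
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


def total_by(rows, group_field, value_field):
    """Aggregate: sum value_field for each distinct value of group_field."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_field]] += row[value_field]
    return dict(totals)


# Join: orders without a matching customer are dropped.
joined = inner_join(orders, customers, "customer_id")
# Filter: keep only the larger orders.
large = [row for row in joined if row["amount"] >= 20.0]
# Aggregate, then reorder by total, largest first.
totals = total_by(joined, "region", "amount")
ordered = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
```

In DataFlow itself, each of these steps would be handled by a configured operator connected to the others; the sketch only shows the relational logic involved.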
Data Manipulation Operations - The DataFlow operator library includes several operators for manipulating the fields within record ports. For more information, refer to the following topics:
Data Analytics Operations - The DataFlow operator library includes several pre-built analytics operators. For more information, refer to the following topics:
Statistical Operations - The DataFlow operator library includes several pre-built operators for statistics and data summarization. For more information, refer to the following topics:
Support Vector Machine Operations - The DataFlow operator library includes operators for working with support vector machines (SVMs), which are supervised learning models used to analyze data and recognize patterns, often for data classification. For more information, refer to the following topics:
Text Processing Operations - The DataFlow operator library includes several operators for text processing, which involves extracting information from unstructured text. This is achieved by first structuring the text in a format suitable for analysis, followed by applying various transformations and statistical techniques. For more information, refer to the following topics:
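The two stages described above, structuring the text and then applying a statistical technique, can be sketched in plain Python. This is a minimal illustration only: the `tokenize` and `term_frequencies` helpers and the sample sentence are hypothetical and do not correspond to DataFlow operators.

```python
import re
from collections import Counter


def tokenize(text):
    """Structure raw text as a list of lowercase word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def term_frequencies(text, stop_words=frozenset()):
    """Count how often each token occurs, skipping any stop words."""
    return Counter(t for t in tokenize(text) if t not in stop_words)


sample = "The operator reads text. The operator counts words."
freqs = term_frequencies(sample, stop_words={"the"})
```

Term counts like these are a common starting point for further analysis, such as ranking terms or comparing documents.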
Data Cleansing Operations - The DataFlow operator library includes operators designed for data cleansing. For more information, refer to the following topics:
Data Matching Operations - The DataFlow operator library includes several pre-built fuzzy matching operators, which can be used to identify duplicates or establish links between records. For more information, refer to the following topics:
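To give a sense of what fuzzy matching involves, the sketch below scores pairs of names with a normalized Levenshtein edit distance and reports pairs above a threshold as likely duplicates. This is one common technique and an assumption for illustration; the source does not state which similarity measures the DataFlow matching operators use. The function names and the 0.8 threshold are hypothetical.

```python
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Scale the distance to a score in [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def likely_duplicates(names, threshold=0.8):
    """Return every pair whose case-insensitive similarity meets the threshold."""
    return [(a, b) for a, b in combinations(names, 2)
            if similarity(a.lower(), b.lower()) >= threshold]


names = ["Jon Smith", "John Smith", "Jane Doe"]
pairs = likely_duplicates(names)
```

Comparing every pair is quadratic in the number of records, so practical matching usually restricts comparisons to candidate groups before scoring.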
Assertion Operations - The DataFlow operator library includes several operators that can be used for asserting various data conditions. While these are primarily used for testing, they can also be useful for other purposes. For more information, refer to the following topics:
Data Capturing Operations - The DataFlow operator library includes several operators that allow the executing application to access data within the application directly. For more information, refer to the following topics:
Generating Additional Data Operations - The DataFlow operator library includes a set of operators for generating data tokens to be used within a DataFlow application. For more information, refer to the following topics:
Partitioning Data Operations - The DataFlow framework allows operators to specify their data needs through metadata. For more information, refer to the following topics:
Script Processing Operations - The DataFlow operator library includes a set of operators that can be used to process rows using user-defined scripts. For more information, refer to the following topics:
Last modified date: 03/10/2025