Match Analysis Operators
DataFlow includes several prebuilt fuzzy matching operators, which can be used to analyze data through approximate matching. For more information, refer to the following topics:
AnalyzeDuplicateKeys Operator
The AnalyzeDuplicateKeys operator analyzes the quality of a set of blocking keys over data to be deduplicated. During deduplication, each record in a block must be compared to every other record in that block, so a block of n records requires n(n-1)/2 comparisons. The smaller the blocks, the better the performance.
The AnalyzeDuplicateKeys operator writes the results of its key analysis to the console.
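To illustrate why block size matters, the following standalone sketch counts the pairwise comparisons implied by a set of blocking key values. It does not use DataFlow and is not the operator's implementation; the class name DedupBlockCost, its methods, and the sample key values are hypothetical and chosen only for this illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of DataFlow: estimates deduplication cost per key choice.
class DedupBlockCost {

    /** Counts how many records share each blocking key value. */
    static Map<String, Long> blockSizes(List<String> keys) {
        Map<String, Long> sizes = new HashMap<>();
        for (String key : keys) {
            sizes.merge(key, 1L, Long::sum);
        }
        return sizes;
    }

    /** Sums n * (n - 1) / 2 over all blocks, where n is the block size. */
    static long comparisons(List<String> keys) {
        long total = 0;
        for (long n : blockSizes(keys).values()) {
            total += n * (n - 1) / 2;
        }
        return total;
    }

    public static void main(String[] args) {
        // Six records keyed on a one-letter prefix: blocks of 4 and 2 records.
        List<String> coarse = Arrays.asList("S", "S", "S", "S", "J", "J");
        // The same six records keyed on a two-letter prefix: three blocks of 2.
        List<String> fine = Arrays.asList("SM", "SM", "ST", "ST", "JO", "JO");

        System.out.println("coarse key: " + comparisons(coarse) + " comparisons"); // 7
        System.out.println("fine key: " + comparisons(fine) + " comparisons");     // 3
    }
}
```

In this made-up data, the finer key cuts the work from 7 comparisons to 3 for the same six records. A finer key can also separate true duplicates into different blocks, so fewer comparisons alone does not make a key better.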
Code Examples
The following code fragments demonstrate how to initialize the AnalyzeDuplicateKeys operator.
Using the AnalyzeDuplicateKeys operator in Java
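import java.util.Arrays;
// Assumes "graph" and "reader" were created earlier in the application.
AnalyzeDuplicateKeys analyzer = graph.add(new AnalyzeDuplicateKeys());
analyzer.setBlockingKeys(Arrays.asList("keyField1", "keyField2"));
// Feed the reader's output into the analyzer's single input port.
graph.connect(reader.getOutput(), analyzer.getInput());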
Using the AnalyzeDuplicateKeys operator in RushScript
var results = dr.analyzeDuplicateKeys(data, {blockingKeys:["keyField1", "keyField2"]});
Properties
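The AnalyzeDuplicateKeys operator provides one property: blockingKeys, the list of fields used as blocking keys. It is set with setBlockingKeys in Java or with the blockingKeys option in RushScript, as in the examples above.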
Ports
The AnalyzeDuplicateKeys operator provides a single input port, which receives the data to be analyzed (getInput() in the Java example above).
AnalyzeLinkKeys Operator
The AnalyzeLinkKeys operator analyzes the quality of a set of blocking keys over two data sets to be linked. During linking, each record in a block on the left must be compared to every record in the same block on the right, so a block with m left records and n right records requires m × n comparisons. The smaller the blocks, the better the performance.
The AnalyzeLinkKeys operator writes the results of its key analysis to the console.
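The cost of linking can be estimated the same way. The following standalone sketch multiplies the left and right record counts for each blocking key value. It does not use DataFlow and is not the operator's implementation; the class name LinkBlockCost, its methods, and the sample key values are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of DataFlow: estimates linking cost for a key choice.
class LinkBlockCost {

    /** Counts how many records share each blocking key value. */
    static Map<String, Long> blockSizes(List<String> keys) {
        Map<String, Long> sizes = new HashMap<>();
        for (String key : keys) {
            sizes.merge(key, 1L, Long::sum);
        }
        return sizes;
    }

    /** Sums leftCount * rightCount over every key value present on the left. */
    static long comparisons(List<String> leftKeys, List<String> rightKeys) {
        Map<String, Long> left = blockSizes(leftKeys);
        Map<String, Long> right = blockSizes(rightKeys);
        long total = 0;
        for (Map.Entry<String, Long> block : left.entrySet()) {
            // A key value with no records on the right contributes no comparisons.
            long rightCount = right.getOrDefault(block.getKey(), 0L);
            total += block.getValue() * rightCount;
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> leftKeys = Arrays.asList("A", "A", "A", "B");
        List<String> rightKeys = Arrays.asList("A", "A", "B", "B", "C");

        // Block A: 3 x 2 = 6, block B: 1 x 2 = 2, block C has no left records.
        System.out.println(comparisons(leftKeys, rightKeys) + " comparisons"); // 8
    }
}
```

Key values that appear on only one side, such as C here, add no comparisons, but the records in them can never be linked.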
Code Examples
The following code fragments demonstrate how to initialize the AnalyzeLinkKeys operator.
Using the AnalyzeLinkKeys operator in Java
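import java.util.Arrays;
// Assumes "graph", "leftReader", and "rightReader" were created earlier in the application.
AnalyzeLinkKeys analyzer = graph.add(new AnalyzeLinkKeys());
// Name the blocking key fields in each data set.
analyzer.setLeftBlockingKeys(Arrays.asList("keyField1"));
analyzer.setRightBlockingKeys(Arrays.asList("keyField1"));
// Connect each data set to its own input port.
graph.connect(leftReader.getOutput(), analyzer.getLeft());
graph.connect(rightReader.getOutput(), analyzer.getRight());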
Using the AnalyzeLinkKeys operator in RushScript
var results = dr.analyzeLinkKeys(data1, data2, {leftBlockingKeys:["keyField1"], rightBlockingKeys:["keyField1"]});
Properties
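The AnalyzeLinkKeys operator provides properties for its blocking keys. The examples above set leftBlockingKeys and rightBlockingKeys, which name the blocking key fields in the left and right data sets.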
Ports
The AnalyzeLinkKeys operator provides two input ports, left and right (getLeft() and getRight() in the Java example above), one for each data set to be linked.
Last modified date: 03/10/2025