Match Analysis Operators
DataFlow includes several prebuilt fuzzy matching operators that analyze data through approximate matching. The operators covered here are AnalyzeDuplicateKeys and AnalyzeLinkKeys.
AnalyzeDuplicateKeys Operator
The AnalyzeDuplicateKeys operator analyzes the quality of a set of blocking keys over data to be deduplicated. During deduplication, each record in a block must be compared to every other record in that block, so the number of comparisons grows quadratically with block size. The smaller the blocks, the better the performance.
The AnalyzeDuplicateKeys operator writes the results of its key analysis to the console.
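To make the effect of block size concrete, the following is a minimal plain-Java sketch that counts the pairwise comparisons implied by a choice of blocking keys. It does not use the DataFlow API, and it is not how AnalyzeDuplicateKeys is implemented. The class name `DuplicateBlockCost`, its methods, and the sample records are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: not part of the DataFlow API.
final class DuplicateBlockCost {

    // Groups records by the values in the chosen key columns and counts the records per block.
    static Map<List<String>, Integer> blockSizes(List<String[]> records, int... keyColumns) {
        Map<List<String>, Integer> sizes = new HashMap<>();
        for (String[] record : records) {
            List<String> key = new ArrayList<>();
            for (int column : keyColumns) {
                key.add(record[column]);
            }
            sizes.merge(key, 1, Integer::sum);
        }
        return sizes;
    }

    // Each record in a block is compared with every other record in it: n * (n - 1) / 2 pairs.
    static long pairwiseComparisons(Map<List<String>, Integer> sizes) {
        long total = 0;
        for (int n : sizes.values()) {
            total += (long) n * (n - 1) / 2;
        }
        return total;
    }

    public static void main(String[] args) {
        // Sample records: {last name, postal code}.
        List<String[]> records = Arrays.asList(
            new String[] {"Smith", "78701"},
            new String[] {"Smith", "78701"},
            new String[] {"Smith", "10001"},
            new String[] {"Smyth", "78701"},
            new String[] {"Jones", "10001"},
            new String[] {"Jones", "10001"});

        // No blocking keys: one block of 6 records.
        System.out.println(pairwiseComparisons(blockSizes(records)));       // prints 15
        // Blocking on last name: blocks of 3, 1, and 2 records.
        System.out.println(pairwiseComparisons(blockSizes(records, 0)));    // prints 4
        // Blocking on last name and postal code: blocks of 2, 1, 1, and 2 records.
        System.out.println(pairwiseComparisons(blockSizes(records, 0, 1))); // prints 2
    }
}
```

In this sample, adding a second key field halves the comparison count again. A key set that is too fine, however, can place true duplicates in different blocks (here, "Smith" and "Smyth"), which is the trade-off a key analysis helps to evaluate.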
Code Examples
The following code fragments demonstrate how to initialize the AnalyzeDuplicateKeys operator.
Using the AnalyzeDuplicateKeys operator in Java
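// Add the operator to an existing graph and set the fields used for blocking.
AnalyzeDuplicateKeys analyzer = graph.add(new AnalyzeDuplicateKeys());
analyzer.setBlockingKeys(Arrays.asList("keyField1", "keyField2"));

// Connect the source of the records to analyze (here, a reader defined elsewhere).
graph.connect(reader.getOutput(), analyzer.getInput());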
Using the AnalyzeDuplicateKeys operator in RushScript
var results = dr.analyzeDuplicateKeys(data, {blockingKeys:["keyField1", "keyField2"]});
Properties
The AnalyzeDuplicateKeys operator provides one property.
| Name | Type | Description |
|---|---|---|
| keys | List<String> | The fields to use for key blocking. |
Ports
The AnalyzeDuplicateKeys operator provides a single input port.
| Name | Type | Get Method | Description |
|---|---|---|---|
| input | | getInput() | The input data to analyze. |
AnalyzeLinkKeys Operator
The AnalyzeLinkKeys operator analyzes the quality of a set of blocking keys over two data sets to be linked. During linking, each record in a block on the left must be compared to every record in the same block on the right, so the number of comparisons for a block is the product of its left and right sizes. The smaller the blocks, the better the performance.
The AnalyzeLinkKeys operator writes the results of its key analysis to the console.
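The linking case can be sketched the same way: for each blocking key value present on both sides, the comparison count is the left block size multiplied by the right block size. The plain-Java example below is an illustration under that assumption only. It does not use the DataFlow API or reflect the internals of AnalyzeLinkKeys, and the class name `LinkBlockCost`, its methods, and the sample key values are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: not part of the DataFlow API.
final class LinkBlockCost {

    // Counts how many records share each blocking key value.
    static Map<String, Integer> blockSizes(List<String> keyValues) {
        Map<String, Integer> sizes = new HashMap<>();
        for (String key : keyValues) {
            sizes.merge(key, 1, Integer::sum);
        }
        return sizes;
    }

    // For every key present on both sides, each left record meets each right record.
    static long crossComparisons(List<String> leftKeys, List<String> rightKeys) {
        Map<String, Integer> left = blockSizes(leftKeys);
        Map<String, Integer> right = blockSizes(rightKeys);
        long total = 0;
        for (Map.Entry<String, Integer> block : left.entrySet()) {
            Integer rightSize = right.get(block.getKey());
            if (rightSize != null) {
                total += (long) block.getValue() * rightSize;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Coarse key (state only): TX is 3 x 2, NY is 1 x 1, CA has no left records.
        System.out.println(crossComparisons(
            Arrays.asList("TX", "TX", "TX", "NY"),
            Arrays.asList("TX", "TX", "NY", "CA")));                 // prints 7

        // Finer key (state plus postal prefix): 2 x 1, 1 x 1, and 1 x 1.
        System.out.println(crossComparisons(
            Arrays.asList("TX-787", "TX-787", "TX-750", "NY-100"),
            Arrays.asList("TX-787", "TX-750", "NY-100", "CA-900"))); // prints 4
    }
}
```

Blocks that exist on only one side contribute no comparisons, so a skewed key hurts most when the same value is frequent in both data sets.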
Code Examples
The following code fragments demonstrate how to initialize the AnalyzeLinkKeys operator.
Using the AnalyzeLinkKeys operator in Java
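// Add the operator to an existing graph and set the blocking fields for each side.
AnalyzeLinkKeys analyzer = graph.add(new AnalyzeLinkKeys());
analyzer.setLeftBlockingKeys(Arrays.asList("keyField1"));
analyzer.setRightBlockingKeys(Arrays.asList("keyField1"));

// Connect the two data sets to be linked (here, readers defined elsewhere).
graph.connect(leftReader.getOutput(), analyzer.getLeft());
graph.connect(rightReader.getOutput(), analyzer.getRight());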
Using the AnalyzeLinkKeys operator in RushScript
var results = dr.analyzeLinkKeys(data1, data2, {leftBlockingKeys:["keyField1"], rightBlockingKeys:["keyField1"]});
Properties
The AnalyzeLinkKeys operator provides the following properties.
| Name | Type | Description |
|---|---|---|
| leftKeys | List<String> | The fields to use for key blocking for data on the left. |
| rightKeys | List<String> | The fields to use for key blocking for data on the right. |
Ports
The AnalyzeLinkKeys operator provides the following input ports.
| Name | Type | Get Method | Description |
|---|---|---|---|
| left | | getLeft() | The input data to analyze for the left side. |
| right | | getRight() | The input data to analyze for the right side. |
Last modified date: 03/10/2025