1. Output data ordering: Implementations that can guarantee their output ordering may declare it by calling RecordPort.setOutputDataOrdering().
2. Output data distribution (only applies to parallelizable operators): Implementations that can guarantee their output distribution may declare it by calling RecordPort.setOutputDataDistribution().
Note that both of these properties are optional; if unspecified, performance may suffer since the framework may unnecessarily re-sort or re-distribute the data. See Managing the Port Metadata for additional information about handling output guarantees.
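The effect of declaring an output ordering can be illustrated with a minimal sketch. It does not use the framework's classes: SketchRecordPort, OutputGuaranteeSketch, and needsResort() are hypothetical stand-ins, and the argument types passed to the setters are assumptions, since the real signatures are not shown here. The sketch assumes that a consumer can skip a re-sort only when the declared ordering begins with the keys it requires.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for a record port; NOT the framework's RecordPort class.
final class SketchRecordPort {
    private Optional<List<String>> ordering = Optional.empty();
    private Optional<String> distribution = Optional.empty();

    // Assumed signature: the ordering is modeled as a list of sort-key names.
    void setOutputDataOrdering(List<String> sortKeys) {
        ordering = Optional.of(List.copyOf(sortKeys));
    }

    // Assumed signature: the distribution is modeled as a plain description.
    void setOutputDataDistribution(String description) {
        distribution = Optional.of(description);
    }

    Optional<List<String>> ordering() { return ordering; }
    Optional<String> distribution() { return distribution; }
}

final class OutputGuaranteeSketch {
    // Returns true when no ordering was declared, or when the declared
    // ordering does not start with the keys the consumer requires.
    static boolean needsResort(SketchRecordPort port, List<String> requiredKeys) {
        return port.ordering()
            .map(declared -> declared.size() < requiredKeys.size()
                || !declared.subList(0, requiredKeys.size()).equals(requiredKeys))
            .orElse(true);
    }

    public static void main(String[] args) {
        SketchRecordPort output = new SketchRecordPort();
        // Nothing declared yet: the consumer must assume unsorted data.
        System.out.println(needsResort(output, List.of("id")));

        output.setOutputDataOrdering(List.of("id", "timestamp"));
        output.setOutputDataDistribution("hash(id)");
        // Declared ordering covers the required key: the re-sort can be skipped.
        System.out.println(needsResort(output, List.of("id")));
    }
}
```

In this sketch, leaving the ordering undeclared is always safe but forces the consumer to assume unsorted data, which mirrors the performance note above.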
Input Model Ports
In general, there is nothing special to declare for input model ports. Models are implicitly duplicated to all partitions when going from non-parallel to parallel operators. A model going from a parallel to a non-parallel node is the special case, which is handled by a "model reducer" operator. In a model reducer, the downstream operator must declare the following:
1. Merge handler: Model reducers must declare a merge handler by calling AbstractModelPort.setMergeHandler().
Note that MergeModel is a convenient, reusable model reducer, parameterized with a merge handler.
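The role of a merge handler can be sketched as follows. The interface SketchMergeHandler and the class CountMergeHandler are hypothetical; the framework's actual merge-handler type and its method signature are not shown here, so this only illustrates the idea of combining one partial model per partition into a single model. The model is assumed, for illustration, to be a map of per-key counts.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a merge handler; the real interface may differ.
interface SketchMergeHandler<M> {
    M merge(List<M> partialModels);
}

// Merges per-partition count models by summing the counts for each key.
final class CountMergeHandler implements SketchMergeHandler<Map<String, Long>> {
    @Override
    public Map<String, Long> merge(List<Map<String, Long>> partialModels) {
        Map<String, Long> merged = new TreeMap<>();
        for (Map<String, Long> partial : partialModels) {
            partial.forEach((key, count) -> merged.merge(key, count, Long::sum));
        }
        return merged;
    }
}

final class ModelReducerSketch {
    public static void main(String[] args) {
        // One partial model per upstream partition.
        List<Map<String, Long>> partials = List.of(
            Map.of("a", 1L, "b", 2L),
            Map.of("a", 2L, "b", 3L));

        Map<String, Long> model = new CountMergeHandler().merge(partials);
        System.out.println(model); // prints {a=3, b=5}
    }
}
```

Summation is associative and commutative, so the result does not depend on the order in which partitions deliver their partial models. A merge handler for a different kind of model would need an equivalent property.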
Output Model Ports
SimpleModelPorts have no associated metadata, so there is never any output metadata to declare. PMMLPorts, on the other hand, do have associated metadata. For all PMMLPorts, implementations must declare the following:
1. pmmlModelSpec: Implementations must declare the PMML model spec by calling PMMLPort.setPMMLModelSpec() and passing a PMMLModelSpec.
Execution Method
The "work" of an executable operator is performed within the execute() method. The execute() method is provided an ExecutionContext. The primary purpose of the ExecutionContext is to provide bindings from logical ports to physical ports. For more information, see the Ports section in Application Model.
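The binding from logical to physical ports can be pictured with a minimal sketch. All types below (SketchLogicalPort, SketchPhysicalPort, SketchExecutionContext, SketchOperator) are hypothetical stand-ins, not the framework's classes, and the method names are assumptions. The sketch also assumes, for illustration only, that each partition receives its own context, so that the same logical port resolves to different physical ports.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical logical port: declared once on the operator.
final class SketchLogicalPort {
    final String name;
    SketchLogicalPort(String name) { this.name = name; }
}

// Hypothetical physical port: carries the data for one execution.
final class SketchPhysicalPort {
    private final List<String> rows;
    SketchPhysicalPort(List<String> rows) { this.rows = List.copyOf(rows); }
    List<String> rows() { return rows; }
}

// Hypothetical context: maps each logical port to its physical port.
final class SketchExecutionContext {
    private final Map<SketchLogicalPort, SketchPhysicalPort> bindings = new HashMap<>();

    void bind(SketchLogicalPort logical, SketchPhysicalPort physical) {
        bindings.put(logical, physical);
    }

    SketchPhysicalPort physicalFor(SketchLogicalPort logical) {
        SketchPhysicalPort physical = bindings.get(logical);
        if (physical == null) {
            throw new IllegalStateException("No binding for port " + logical.name);
        }
        return physical;
    }
}

final class SketchOperator {
    final SketchLogicalPort input = new SketchLogicalPort("input");

    // Resolves the logical port through the context, then does the work
    // (here, simply counting the rows on the physical port).
    int execute(SketchExecutionContext ctx) {
        SketchPhysicalPort in = ctx.physicalFor(input);
        return in.rows().size();
    }

    public static void main(String[] args) {
        SketchOperator op = new SketchOperator();

        SketchExecutionContext partition0 = new SketchExecutionContext();
        partition0.bind(op.input, new SketchPhysicalPort(List.of("r1", "r2")));

        SketchExecutionContext partition1 = new SketchExecutionContext();
        partition1.bind(op.input, new SketchPhysicalPort(List.of("r3")));

        System.out.println(op.execute(partition0)); // prints 2
        System.out.println(op.execute(partition1)); // prints 1
    }
}
```

The operator code refers only to its logical port; which physical port it reads from is decided entirely by the context it is handed.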
In general, the execute() method consists of the following steps:
1. Obtain physical ports for each logical port.
Last modified date: 12/09/2024