General Requirements and Best Practices
Structure of an Operator
All operators are expected to adhere to the "Java Bean" standard. All operators must declare their input and output ports as private final fields and must provide public "getters" to access those fields.
In addition, operators will typically define several configuration properties. Each of those properties should be defined as non-final fields with public "getters" and "setters."
Finally, operators should define a public no-arg constructor.
For example:
package org.foo;

public class MyOperator extends ExecutableOperator {

    //create an input record port--should be declared as final
    private final RecordPort input = newRecordInput("input");

    //create an output record port--should be declared as final
    private final RecordPort output = newRecordOutput("output");
       
    //declare a config property with a default value, must be non-final
    private int myConfigProperty = 5;
        
    //required: default constructor; initializes every property to its default
    public MyOperator() {
    }
        
    //optional: one or more convenience constructors to specify values for configuration properties
    public MyOperator(int myConfigProperty) {
        setMyConfigProperty(myConfigProperty);
    }
        
    //required: public getter for each input port
    public RecordPort getInput() {
        return input;
    }
        
    //required: public getter for each output port
    public RecordPort getOutput() {
        return output;
    }
        
    //required: public getter for each configuration property
    public int getMyConfigProperty() {
        return myConfigProperty;
    }
        
    //required: public setter for each configuration property
    public void setMyConfigProperty(int myConfigProperty) {
        this.myConfigProperty = myConfigProperty;
    }
}
Serialization
All logical operators must be JSON serializable. JSON is used as the serialization mechanism anytime we must serialize an operator. This includes serialization to the cluster for distribution and export from the UI. In addition, there are times when we must clone an ExecutableOperator; here we use JSON as the default implementation of clone.
When an operator follows the "Java Bean" standard described in the previous section, JSON serialization and deserialization are mostly automatic:
Serialization proceeds by invoking each of the public getters in turn.
Deserialization proceeds by first creating a new object through the default constructor and then invoking each of the public setters in turn.
Without any additional annotations, the example above is serializable. Its serial form will be as follows:
{
  "@type" : "org.foo.MyOperator",
  "myConfigProperty" : 5
}
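The getter and setter mechanism described above can be illustrated with the standard java.beans introspection API. This is only a sketch of the general idea, not DataFlow's actual implementation: the names BeanRoundTrip, SampleBean, toMap, and fromMap are hypothetical, and a plain Map stands in for the JSON document.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.Map;
import java.util.TreeMap;

class BeanRoundTrip {

    // Hypothetical stand-in for an operator: one read/write property, public no-arg constructor.
    public static class SampleBean {
        private int myConfigProperty = 5;

        public SampleBean() {
        }

        public int getMyConfigProperty() {
            return myConfigProperty;
        }

        public void setMyConfigProperty(int myConfigProperty) {
            this.myConfigProperty = myConfigProperty;
        }
    }

    // "Serialize": invoke each public getter that has a matching setter and record the value by property name.
    static Map<String, Object> toMap(Object bean) {
        try {
            Map<String, Object> values = new TreeMap<>();
            BeanInfo info = Introspector.getBeanInfo(bean.getClass(), Object.class);
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                if (pd.getReadMethod() != null && pd.getWriteMethod() != null) {
                    values.put(pd.getName(), pd.getReadMethod().invoke(bean));
                }
            }
            return values;
        } catch (ReflectiveOperationException | IntrospectionException e) {
            throw new IllegalStateException(e);
        }
    }

    // "Deserialize": create the object through its default constructor, then invoke each public setter.
    static <T> T fromMap(Class<T> type, Map<String, Object> values) {
        try {
            T bean = type.getDeclaredConstructor().newInstance();
            BeanInfo info = Introspector.getBeanInfo(type, Object.class);
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                if (pd.getWriteMethod() != null && values.containsKey(pd.getName())) {
                    pd.getWriteMethod().invoke(bean, values.get(pd.getName()));
                }
            }
            return bean;
        } catch (ReflectiveOperationException | IntrospectionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SampleBean original = new SampleBean();
        original.setMyConfigProperty(42);
        Map<String, Object> serial = toMap(original);
        SampleBean copy = fromMap(SampleBean.class, serial);
        System.out.println(serial + " -> " + copy.getMyConfigProperty());
    }
}
```

The sketch shows why the no-arg constructor and the matching getter/setter pairs matter: without them, a reflective mechanism of this kind has no way to create the object or restore its state.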
To control JSON serialization, various "@Json..." annotations may be used. See Customizing Operator Serialization for more information.
Customizing Operator Serialization
To control JSON serialization, various "@Json..." annotations may be used. The following sections list various use cases for customizing serialization.
Customizing Property Names
By default, each property takes the Java Bean property name of its getter or setter. To change the name, apply the "@JsonProperty" annotation as follows:
package org.foo;

public class MyOperator extends ExecutableOperator {
    ...
    //This property will be written as "myConfProp" rather than as the default "myConfigProperty"
    @JsonProperty("myConfProp")
    public int getMyConfigProperty() {
        return myConfigProperty;
    }
 
    //This property will be read as "myConfProp" rather than as the default "myConfigProperty"
    @JsonProperty("myConfProp")
    public void setMyConfigProperty(int myConfigProperty) {
        this.myConfigProperty = myConfigProperty;
    }
}
The above operator will serialize as follows:
{
  "@type" : "org.foo.MyOperator",
  "myConfProp" : 5
}
Customizing Operator Type Names
By default, operators are identified by their fully qualified Java class names. This has the advantage of not requiring a registry. However, it is a best practice for top-level, public operators to define a shortcut type name by declaring the "@JsonTypeName" annotation on the operator. Operators declared with shortcut type names must also be registered through the TypeResolutionProvider service.
For example, to give MyOperator a custom type name "myOp", you would annotate as follows:
package org.foo;

//give MyOperator a custom JSON type name rather than the default "org.foo.MyOperator"
@JsonTypeName("myOp")
public class MyOperator extends ExecutableOperator {
   ...
}
Then register your operator class in a TypeResolutionProvider as follows:
package org.foo;

import com.pervasive.dataflow.json.SimpleTypeResolutionProvider;

public class MyTypeResolutionProvider extends SimpleTypeResolutionProvider {
   public MyTypeResolutionProvider() {
         ...
      register(org.foo.MyOperator.class);
   }
}
Then, register your TypeResolutionProvider. This step is typically done once per jar (multiple operators can be declared within the same TypeResolutionProvider). To register your TypeResolutionProvider, create a resource "META-INF/services/com.pervasive.dataflow.json.TypeResolutionProvider" containing the following:
org.foo.MyTypeResolutionProvider
The above operator will serialize as follows:
{
  "@type" : "myOp",
  ...
}
Ignoring Properties
By default, all public "getters" are serialized. To statically ignore certain properties, use the "@JsonIgnore" annotation. This is commonly used for a derived property, that is, a property computed by combining other properties that should not be reflected in the serialized state. For example:
//derived property: do not serialize
@JsonIgnore
public int getMyDerivedProperty() {
    return getProp1() + getProp2();
}
In addition, you may use the "@JsonSerialize" annotation to dynamically exclude certain properties based on their values. For example:
//property "value" will only be serialized if non-null
@JsonSerialize(include=Inclusion.NON_NULL)
public String getValue() {
    return this.value;
}
Overloading Property Setters
When defining getters and setters, it is simplest to avoid overloading. The Java language will prevent you from overloading getters; however, there are cases where we overload setters to provide convenience signatures. If you do overload setters, you must explicitly annotate those signatures that must be used and those that must be ignored for JSON serialization.
In the following example, "setFieldNames" is overloaded to provide one signature that accepts a list and one that accepts a varargs String array. Notice the annotations in use:
package org.foo;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

import org.codehaus.jackson.annotate.JsonIgnore;
import org.codehaus.jackson.annotate.JsonProperty;

public class MyOperator extends ExecutableOperator {

   private List<String> fieldNames = Collections.emptyList();

   //explicitly tell Jackson to serialize this property (normally it
   //would be serialized anyway, but because one setter carries a JsonIgnore,
   //all accessors of the same name are ignored unless annotated)
   @JsonProperty
   public List<String> getFieldNames() {
      return Collections.unmodifiableList(fieldNames);
   }

   //explicitly tell Jackson to deserialize via this setter (normally it
   //would be used anyway, but because one setter carries a JsonIgnore,
   //all accessors of the same name are ignored unless annotated)
   @JsonProperty
   public void setFieldNames(List<String> fieldNames) {
      this.fieldNames = new ArrayList<String>(fieldNames);
   }
 
   //tell jackson to ignore this property
   @JsonIgnore
   public void setFieldNames(String... fieldNames) {
      setFieldNames(Arrays.asList(fieldNames));
   }
 
   ...

}
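To see why the annotations are needed, it can help to look at what a reflection-based serializer finds when it searches for a setter by name. The following is a minimal sketch using only java.lang.reflect; OverloadedSetterDemo, Config, and setterParameterTypes are hypothetical names, and the lookup shown here is an illustration rather than the logic Jackson actually runs.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class OverloadedSetterDemo {

    // Hypothetical bean with the same overloaded setter pair as the example above.
    public static class Config {
        private List<String> fieldNames = Collections.emptyList();

        public List<String> getFieldNames() {
            return Collections.unmodifiableList(fieldNames);
        }

        public void setFieldNames(List<String> fieldNames) {
            this.fieldNames = new ArrayList<String>(fieldNames);
        }

        public void setFieldNames(String... fieldNames) {
            setFieldNames(Arrays.asList(fieldNames));
        }
    }

    // Collect the parameter type of every public one-argument method with the given name.
    static List<Class<?>> setterParameterTypes(Class<?> type, String setterName) {
        List<Class<?>> types = new ArrayList<>();
        for (Method m : type.getMethods()) {
            if (m.getName().equals(setterName) && m.getParameterCount() == 1) {
                types.add(m.getParameterTypes()[0]);
            }
        }
        // Sort by class name so the result does not depend on reflection order.
        types.sort((a, b) -> a.getName().compareTo(b.getName()));
        return types;
    }

    public static void main(String[] args) {
        // Two candidates are found: one taking List, one taking String[].
        System.out.println(setterParameterTypes(Config.class, "setFieldNames"));
    }
}
```

Both overloads look like valid setters for the "fieldNames" property, so the name alone does not identify which one to call. Annotating one signature with "@JsonProperty" and the other with "@JsonIgnore" removes that ambiguity.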
Constructing Using Non-Default Constructor
While operators themselves are expected to define a default constructor and have getters and setters for all properties, the properties themselves may be of complex, immutable types that do not define a default constructor. In this case, the "@JsonCreator" annotation can be used to mark the constructor to call during deserialization:
import org.codehaus.jackson.annotate.JsonCreator;
import org.codehaus.jackson.annotate.JsonProperty;

public final class Pair {

    private final String first;
    private final String second;
 
    @JsonCreator
    public Pair(@JsonProperty("first") String first, @JsonProperty("second") String second) {
        this.first = first;
        this.second = second;
    }
 
    public String getFirst() {
        return this.first;
    }
 
    public String getSecond() {
        return this.second;
    }
}
The previous class can now serialize/deserialize as the following JSON:
{
    "first" : "<first-value>",
    "second" : "<second-value>"
}