Support Vector Machine Operators
A support vector machine (SVM) is a supervised learning model that analyzes data and recognizes patterns. It is often used for data classification.
A basic SVM is a nonprobabilistic binary linear classifier, which means it takes a set of input data and predicts, for each given input, which of two possible classes forms the output. In addition, support vector machines can efficiently perform nonlinear classification using what is called the kernel trick, which implicitly maps their inputs into high-dimensional feature space.
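The kernel trick described above can be illustrated with a small standalone sketch. It is not DataFlow or LIBSVM code; the class and method names are hypothetical, and it assumes a homogeneous degree-2 polynomial kernel on two-dimensional inputs. The kernel is computed directly in the input space, and the same value is obtained as a dot product after explicitly mapping both inputs into a three-dimensional feature space.

```java
/** Minimal sketch of the kernel trick; all names here are illustrative only. */
class KernelTrickSketch {

    /** Plain dot product of two equal-length vectors. */
    static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    /** Homogeneous degree-2 polynomial kernel: K(x, z) = (x . z)^2. */
    static double polyKernel(double[] x, double[] z) {
        double d = dot(x, z);
        return d * d;
    }

    /** Explicit feature map for 2-D input: phi(x) = (x1^2, sqrt(2)*x1*x2, x2^2). */
    static double[] featureMap(double[] x) {
        return new double[] {
            x[0] * x[0],
            Math.sqrt(2.0) * x[0] * x[1],
            x[1] * x[1]
        };
    }

    public static void main(String[] args) {
        double[] x = {1.0, 2.0};
        double[] z = {3.0, 4.0};
        // Kernel evaluated in the original 2-D input space
        System.out.println(polyKernel(x, z));
        // Dot product in the explicit 3-D feature space; agrees up to rounding
        System.out.println(dot(featureMap(x), featureMap(z)));
    }
}
```

The kernel never builds the higher-dimensional vectors, which is why nonlinear classification stays inexpensive even when the implied feature space is very large.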
DataFlow provides operators to produce and utilize SVM models. The learner is used to determine the classification rules for a particular data set, while the predictor can apply these rules to a data set. For more information, refer to the following topics:
SVMLearner Operator
The SVMLearner operator is responsible for building a PMML Support Vector Machine model from input data. It is implemented as a wrapper for the LIBSVM library found at http://www.csie.ntu.edu.tw/%7Ecjlin/libsvm/.
Code Example
This example uses the SVMLearner operator to train a predictive model based on the Iris data set. It uses the "class" field within the iris data as the target column.
Using the SVMLearner operator in Java
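// Assumption: DOMAIN is a static factory on TokenTypeConstant, like DOUBLE and STRING
import static com.pervasive.datarush.types.TokenTypeConstant.DOMAIN;
import static com.pervasive.datarush.types.TokenTypeConstant.DOUBLE;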
import static com.pervasive.datarush.types.TokenTypeConstant.STRING;
import static com.pervasive.datarush.types.TokenTypeConstant.record;
import java.util.Arrays;
import com.pervasive.datarush.analytics.pmml.WritePMML;
import com.pervasive.datarush.analytics.svm.PolynomialKernelType;
import com.pervasive.datarush.analytics.svm.learner.SVMLearner;
import com.pervasive.datarush.analytics.svm.learner.SVMTypeCSvc;
import com.pervasive.datarush.graphs.LogicalGraph;
import com.pervasive.datarush.graphs.LogicalGraphFactory;
import com.pervasive.datarush.operators.io.textfile.ReadDelimitedText;
import com.pervasive.datarush.schema.TextRecord;
import com.pervasive.datarush.types.RecordTokenType;
/**
 * Use the SVM learner to train a predictive model based on the iris data set.
 */
public class SVMIris {
    public static void main(String[] args) {
        // Create an empty logical graph
        LogicalGraph graph = LogicalGraphFactory.newLogicalGraph("SVMIris");

        // Create a delimited text reader for the Iris training and query data
        ReadDelimitedText reader = graph.add(new ReadDelimitedText("data/iris.txt"));
        reader.setFieldSeparator(" ");
        reader.setHeader(true);

        // Declare the record schema; "class" is restricted to the three iris species
        String[] classTypes = {"Iris-setosa", "Iris-versicolor", "Iris-virginica"};
        RecordTokenType irisType = record(
            DOUBLE("sepal length"),
            DOUBLE("sepal width"),
            DOUBLE("petal length"),
            DOUBLE("petal width"),
            DOMAIN(STRING("class"), classTypes));
        reader.setSchema(TextRecord.convert(irisType));

        // Create an SVMLearner operator with a polynomial kernel and "class" as the target
        SVMLearner svm = graph.add(new SVMLearner());
        svm.setKernelType(new PolynomialKernelType().setGamma(3));
        svm.setType(new SVMTypeCSvc("class", 1));

        // Connect the reader and the SVM
        graph.connect(reader.getOutput(), svm.getInput());

        // Write the PMML generated by SVM
        WritePMML pmmlWriter = graph.add(new WritePMML("results/polynomial-SVM.pmml"));
        graph.connect(svm.getModel(), pmmlWriter.getModel());

        // Compile and run the graph
        graph.run();
    }
}
Using the SVMLearner operator in RushScript
var learningColumns = ["sepal length", "sepal width", "petal length", "petal width"];
var kernel = new PolynomialKernelType().setGamma(3);
var type = new SVMTypeCSvc("class", 1);
var learner = dr.svmLearner(data, {includedColumns:learningColumns, kernelType:kernel, type:type});
Properties
The SVMLearner operator provides the following properties.
Ports
The SVMLearner operator provides a single input port.
The SVMLearner operator provides a single output port.
SVMPredictor Operator
The SVMPredictor operator applies a previously built Support Vector Machine model to the input data. It supports either CSVC SVMs or one-class SVMs.
The two cases are distinguished by the number of target columns returned by PMMLModelSpec.getTargetCols(). If there are zero target columns, the model is assumed to be a one-class SVM. Otherwise, there must be exactly one target column of type TokenTypeConstant.STRING, in which case the model is a CSVC SVM.
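The rule for telling the two model kinds apart can be sketched as follows. This is an illustration only and does not use the DataFlow API: the class, enum, and method names are hypothetical, and target column types are represented as plain strings rather than token types.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Sketch of the one-class versus CSVC decision; not the operator's actual code. */
class SvmModeSketch {

    enum Mode { ONE_CLASS, C_SVC }

    /**
     * Decides the SVM kind from the types of the model's target columns.
     * Hypothetical stand-in: each list entry names the type of one target column.
     */
    static Mode detectMode(List<String> targetColumnTypes) {
        if (targetColumnTypes.isEmpty()) {
            // Zero target columns: treated as a one-class SVM
            return Mode.ONE_CLASS;
        }
        if (targetColumnTypes.size() == 1 && "STRING".equals(targetColumnTypes.get(0))) {
            // Exactly one string target column: treated as a CSVC SVM
            return Mode.C_SVC;
        }
        // Anything else does not match either supported case
        throw new IllegalArgumentException(
            "Expected zero target columns or exactly one STRING target column");
    }

    public static void main(String[] args) {
        System.out.println(detectMode(Collections.<String>emptyList())); // ONE_CLASS
        System.out.println(detectMode(Arrays.asList("STRING")));         // C_SVC
    }
}
```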
For CSVC SVMs, the PMML is expected to contain Support Vector Machines with SupportVectorMachine.getTargetCategory() and SupportVectorMachine.getAlternateTargetCategory() populated. Each SVM is evaluated and adds one vote to either its target category or its alternate target category. The predicted value is the category that receives the most votes.
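The voting scheme can be sketched in plain Java. The sketch below is not the operator's implementation, and every name in it is hypothetical. It makes two assumptions that the text above does not state: a positive decision value counts as a vote for the target category (otherwise the alternate target category), and ties go to the category that received its first vote earliest.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch of majority voting across pairwise binary SVMs; illustrative only. */
class PairwiseVoteSketch {

    /** Result of evaluating one binary SVM, with the two categories it separates. */
    static final class BinaryResult {
        final String targetCategory;
        final String alternateTargetCategory;
        final double decisionValue;

        BinaryResult(String targetCategory, String alternateTargetCategory, double decisionValue) {
            this.targetCategory = targetCategory;
            this.alternateTargetCategory = alternateTargetCategory;
            this.decisionValue = decisionValue;
        }
    }

    /** Returns the category with the most votes; the first one seen wins a tie. */
    static String predict(List<BinaryResult> results) {
        // LinkedHashMap keeps insertion order so tie-breaking is deterministic
        Map<String, Integer> votes = new LinkedHashMap<>();
        for (BinaryResult r : results) {
            // Assumption: positive value votes for target, otherwise for alternate
            String winner = r.decisionValue > 0 ? r.targetCategory : r.alternateTargetCategory;
            votes.merge(winner, 1, Integer::sum);
        }
        String best = null;
        int bestVotes = -1;
        for (Map.Entry<String, Integer> e : votes.entrySet()) {
            if (e.getValue() > bestVotes) {
                best = e.getKey();
                bestVotes = e.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up decision values for one input row and three pairwise SVMs
        List<BinaryResult> results = Arrays.asList(
            new BinaryResult("Iris-setosa", "Iris-versicolor", 1.2),
            new BinaryResult("Iris-setosa", "Iris-virginica", 0.4),
            new BinaryResult("Iris-versicolor", "Iris-virginica", -0.7));
        System.out.println(predict(results)); // Iris-setosa (2 votes to 1)
    }
}
```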
For one-class SVMs, the target category and alternate target category are ignored. The result is -1 if the SVM evaluates to a number less than zero, or 1 if it evaluates to a number greater than zero.
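As a minimal sketch, the one-class mapping from decision value to result looks like this. The class and method names are hypothetical. The text above does not say what happens when the value is exactly zero, so returning -1 in that case is an arbitrary choice made for this sketch, not documented operator behavior.

```java
/** Sketch of the one-class result mapping; illustrative only. */
class OneClassLabelSketch {

    /** Maps an SVM decision value to 1 (greater than zero) or -1 (less than zero). */
    static int label(double decisionValue) {
        // Assumption: exactly zero is grouped with the negative side
        return decisionValue > 0 ? 1 : -1;
    }

    public static void main(String[] args) {
        System.out.println(label(0.35));  // 1
        System.out.println(label(-2.1));  // -1
    }
}
```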
Note: This operator is non-parallel.
Code Examples
Example Usage of the SVMPredictor Operator in Java
// Create the SVM predictor operator and add it to a graph
SVMPredictor predictor = graph.add(new SVMPredictor());
// Connect the predictor to an input port and a model source
graph.connect(dataSource.getOutput(), predictor.getInput());
graph.connect(modelSource.getOutput(), predictor.getModel());
// The output of the predictor is available for downstream operators to use
Using the SVMPredictor operator in RushScript
var results = dr.svmPredictor(learner, data);
Properties
The SVMPredictor operator has no properties.
Ports
The SVMPredictor operator provides the following input ports.
The SVMPredictor operator provides a single output port.
Last modified date: 03/10/2025