Property Name | Type | Description |
type | String | The type of schema to create. By default, a variable length schema is created (useful for delimited text). To create a fixed-width text schema that can be used with the read/write fixed text operators, set to FIXED. |
nullable | boolean | Indicates whether empty field values are treated as null values. |
nullIndicator | String | Sets the text that represents a null value within a field. This value is used for all fields by default. Each field definition can override the null indicator for the field. |
padChar | char | The character used to pad-fill field values as needed. This is only applicable for fixed-width schemas. The default value is a space character. |
alignment | String | The alignment to use when formatting fixed text. This is only applicable for fixed-width schemas. The valid values are CENTER, LEFT, and RIGHT. Default: LEFT. |
Function Name | Input Parameters | Description |
alignment | • String | The alignment to use when formatting fixed-width text. This is only applicable for fixed-width schemas. The valid values are CENTER, LEFT, and RIGHT. Default: LEFT. |
BOOLEAN | • Field name • Properties (optional) | Creates a new field of type BOOLEAN. |
DATE | • Field name • Properties (optional) | Creates a new field of type DATE. |
DOUBLE | • Field name • Properties (optional) | Creates a new field of type DOUBLE. |
ENUM (deprecated; use STRING) | • Field name • Enumerated values | Creates a new field of type ENUM. The provided values are used to build the enumerated type. |
ENUM (deprecated; use STRING) | • Field name • Properties • Enumerated values | Creates a new field of type ENUM with the given properties. The provided values are used to build the enumerated type. |
FLOAT | • Field name • Properties (optional) | Creates a new field of type FLOAT. |
INT | • Field name • Properties (optional) | Creates a new field of type INT. |
load | Local pathname | Loads a predefined schema from a local file. |
LONG | • Field name • Properties (optional) | Creates a new field of type LONG. |
nullable | boolean | Indicates whether or not empty field values are treated as null values. |
nullIndicator | String value | Sets the text that represents a null value within a field. This value is used for all fields by default. Each field definition can override the null indicator for the field. |
NUMERIC | • Field name • Properties (optional) | Creates a new field of type NUMERIC. |
padChar | • char | The character used to pad fill field values as needed. This is only applicable for fixed-width schemas. Default: space character. |
store | Local pathname | Stores the current schema definition into a local file. The schema file can be used later within scripting or within the RushAnalyzer environment. |
STRING | • Field name • Properties (optional) • Allowed values (optional) | Creates a new field of type STRING. If allowed values are provided, the values are added to the domain of the field. |
TIMEOFDAY | • Field name • Properties (optional) | Creates a new field of type TIME. |
TIMESTAMP | • Field name • Properties (optional) | Creates a new field of type TIMESTAMP. |
trimmed | boolean | Enables or disables trimming of white space from input field values. |
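
The tables above do not spell out how `trimmed`, `nullable`, and `nullIndicator` combine when a raw text field is read. The following is a minimal Java sketch of one plausible reading, not the DataFlow implementation: the class `FieldTextSketch` and the method `interpret` are hypothetical names, and the order of operations (trim first, then compare against the null indicator) is an assumption.

```java
/** Hypothetical illustration only; not part of the DataFlow API. */
final class FieldTextSketch {

    /**
     * Interprets one raw text field.
     * Assumption: trimming happens before the null-indicator comparison.
     * Assumption: a non-nullable field keeps its text even if it matches the indicator.
     */
    static String interpret(String raw, boolean trimmed, boolean nullable, String nullIndicator) {
        String value = trimmed ? raw.trim() : raw;
        if (nullable && value.equals(nullIndicator)) {
            return null; // the text stands for a null value
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(interpret("  NA ", true, true, "NA"));        // null
        System.out.println("[" + interpret("  NA ", false, true, "NA") + "]"); // [  NA ]
        System.out.println(interpret("", false, true, ""));              // null
    }
}
```

With the default empty-string null indicator, this reading makes an empty field come back as null, which matches the description of `nullable` above.
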
Property Name | Type | Supported on Types | Description |
alignment | String | all types (for fixed-width text) | The alignment to use when formatting fixed-width text. This is only applicable for fixed-width schemas. The valid values are CENTER, LEFT, and RIGHT. Default: LEFT. |
falseValue | String | boolean | The text that represents boolean false. |
nullIndicator | String | all types | Text value that represents a null value within a field. Common settings include NA, ?, and <null>. Default: empty string. |
nullable | boolean | all types | A boolean value that indicates if the field may contain null values. Default: true. |
padChar | char | all types (for fixed-width text) | Specifies the character to use for padding when formatting fields for fixed-width text. Default: space character. |
pattern | String | date, double, float, int, long, numeric, time of day, timestamp | Specifies how to parse and format field values. The pattern is type specific; for example, a pattern for a date field differs from a pattern for a floating-point value. See the documentation immediately following this table for more information about patterns. |
size | int | all types (for fixed-width text) | The fixed size of a field within a schema. This property is only supported for fixed-width schemas. There is no default value. When building fixed-width schemas, the size parameter must be provided on all fields. |
trimmed | boolean | string | A boolean value that indicates whether to trim white space from field values. Default: false. |
trueValue | String | boolean | The text that represents boolean true. |
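
To make the fixed-width properties concrete, here is a small Java sketch of how `size`, `padChar`, and `alignment` could determine the text written for one field. It is an illustration under stated assumptions, not the product's code: `FixedWidthSketch` and `format` are hypothetical names, oversized values are rejected rather than truncated, and for CENTER any odd leftover pad character goes on the right.

```java
import java.util.Objects;

/** Hypothetical illustration only; not part of the DataFlow API. */
final class FixedWidthSketch {

    /** Mirrors the alignment values listed in the table above. */
    enum Alignment { LEFT, RIGHT, CENTER }

    static String format(String value, int size, char padChar, Alignment alignment) {
        Objects.requireNonNull(value, "value");
        if (value.length() > size) {
            // Assumption: values wider than the field are an error, not truncated.
            throw new IllegalArgumentException("value longer than field size " + size);
        }
        int pad = size - value.length();
        int left;
        switch (alignment) {
            case LEFT:
                left = 0;       // all padding after the value
                break;
            case RIGHT:
                left = pad;     // all padding before the value
                break;
            default:
                left = pad / 2; // CENTER: odd leftover goes right (assumption)
                break;
        }
        StringBuilder out = new StringBuilder(size);
        for (int i = 0; i < left; i++) {
            out.append(padChar);
        }
        out.append(value);
        for (int i = 0; i < pad - left; i++) {
            out.append(padChar);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println("[" + format("abc", 6, ' ', Alignment.LEFT) + "]");  // [abc   ]
        System.out.println("[" + format("42", 5, '0', Alignment.RIGHT) + "]");  // [00042]
        System.out.println("[" + format("ab", 5, '*', Alignment.CENTER) + "]"); // [*ab**]
    }
}
```

Every formatted value has exactly `size` characters, which is why a fixed-width schema needs a size on each field.
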
Symbol | Meaning |
0 | Digit, zero-filled |
# | Digit, zero values are absent |
. | Decimal separator |
, | Group separator |
E | In scientific notation, separates the mantissa and the exponent. Does not need to be quoted. |
; | Separates positive and negative subpatterns |
% | Multiplies by 100 and shows as a percentage |
' | Used to quote special characters |
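
The symbols in this table match those of `java.text.DecimalFormat`. Assuming numeric patterns behave as they do in that class (an assumption, since the table does not name the underlying implementation), the following Java sketch shows what a few patterns produce. `NumberPatternSketch` is a hypothetical helper, and the locale is pinned to `Locale.US` so that `.` and `,` are the separators.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

/** Hypothetical illustration using the standard java.text.DecimalFormat class. */
final class NumberPatternSketch {

    static String format(String pattern, double value) {
        // Pin the locale so the output does not depend on machine settings.
        DecimalFormat df = new DecimalFormat(pattern, DecimalFormatSymbols.getInstance(Locale.US));
        return df.format(value);
    }

    public static void main(String[] args) {
        System.out.println(format("#,##0.00", 1234.5));              // 1,234.50
        System.out.println(format("0000", 42));                      // 0042
        System.out.println(format("#.##", 2.0));                     // 2
        System.out.println(format("0.###E0", 1234));                 // 1.234E3
        System.out.println(format("0%", 0.25));                      // 25%
        System.out.println(format("#,##0.00;(#,##0.00)", -1234.5));  // (1,234.50)
        System.out.println(format("'#'#", 123));                     // #123
    }
}
```

The `0` symbol forces a digit position to appear, while `#` leaves it out when there is nothing to show, which is why `0000` pads 42 to four digits and `#.##` drops the empty fraction of 2.0.
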
Symbol | Meaning |
G | Era designator |
y | Year |
M | Month in year |
w | Week in year |
D | Day in year |
d | Day in month |
e | Day of week (as number) |
E | Day in week (as text) |
a | am/pm marker |
H | Hour in day (0-23) |
k | Hour in day (1-24) |
K | Hour in am/pm (0-11) |
h | Hour in am/pm (1-12) |
m | Minute in hour |
s | Second in minute |
S | Millisecond |
z | Time zone (General) |
Z | Time zone (RFC 822) |
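
Most of these symbols are shared with `java.text.SimpleDateFormat`. Assuming date and time patterns behave as they do in that class for the shared symbols (an assumption; the `e` symbol is not part of `SimpleDateFormat` and is left out here), this Java sketch parses one timestamp and reformats it with several patterns. `DatePatternSketch` and `reformat` are hypothetical names, and the locale and time zone are pinned to `Locale.US` and UTC for repeatable output.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

/** Hypothetical illustration using the standard java.text.SimpleDateFormat class. */
final class DatePatternSketch {

    private static SimpleDateFormat formatter(String pattern) {
        SimpleDateFormat f = new SimpleDateFormat(pattern, Locale.US);
        f.setTimeZone(TimeZone.getTimeZone("UTC")); // fixed zone for repeatable output
        f.setLenient(false);                        // reject out-of-range field values
        return f;
    }

    /** Parses text with one pattern and formats the result with another. */
    static String reformat(String text, String inPattern, String outPattern) {
        try {
            Date parsed = formatter(inPattern).parse(text);
            return formatter(outPattern).format(parsed);
        } catch (ParseException e) {
            throw new IllegalArgumentException("cannot parse '" + text + "' with " + inPattern, e);
        }
    }

    public static void main(String[] args) {
        String in = "yyyy-MM-dd HH:mm:ss.SSS";
        String ts = "2014-03-09 17:05:09.123";
        System.out.println(reformat(ts, in, "EEE, d MMM yyyy h:mm a")); // Sun, 9 Mar 2014 5:05 PM
        System.out.println(reformat(ts, in, "yyyy-DDD"));               // 2014-068
        System.out.println(reformat(ts, in, "k/K"));                    // 17/5
        System.out.println(reformat(ts, in, "HH:mm:ss.SSS Z"));         // 17:05:09.123 +0000
    }
}
```

Repeating a symbol changes its width or form: `MM` gives a two-digit month, `MMM` an abbreviated month name, and `DDD` a zero-padded day of year.
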
Enumerated Type | Values | Description |
ARFFMode | • SPARSE • DENSE | Used with the WriteARFF operator (see "Using the WriteARFF Operator to Write Sparse Data") to specify the output format to use when writing data. |
DatasetStorageFormat | • COMPACT_ROW • COLUMNAR | Used with the WriteStagingDataset operator (see "Using the WriteStagingDataset Operator to Write Staging Data Sets") to determine the format used to store data. |
DetailLevel | • SINGLE_PASS_ONLY • MULTI_PASS | Used with the SummaryStatistics operator (see "Using the SummaryStatistics Operator to Calculate Data Statistics") to define the detail level desired. |
DistanceMeasure | • EUCLIDEAN • COSINE_SIMILARITY | Used with the KMeans operator (see "Using the KMeans Operator to Compute K-Means") to specify the distance measure. |
JoinMode | • INNER • FULL_OUTER • LEFT_OUTER • RIGHT_OUTER | Used with the Join operator (see "Using the Join Operator to Do Standard Relational Joins") to specify the type of join to perform. |
NormalizationMethod | • none • logit | Used within PMML to define the normalization function applied to modeling applications. |
NormalizeMethod | • MINMAX • ZSCORE | Used with the NormalizeValues operator (see "Using the NormalizeValues Operator to Normalize Values") to specify the type of normalization to apply. |
OutputMode | • APPEND • OVERWRITEROWS • OVERWRITETABLE • UPDATE • DELETE | Used with the WriteToJDBC operator (see "Using the WriteToJDBC Operator to Write to Databases") to specify the mode of writing to the target table. |
ParseErrorAction | • ERROR • WARN_AND_DISCARD • WARN • DISCARD • IGNORE | Describes the possible actions for handling record parsing errors. Used by file reader operators such as ReadDelimitedText (see "Using the ReadDelimitedText Operator to Read Delimited Text"). |
RankMode | • STANDARD • DENSE • ORDINAL | Used with the Rank operator (see "Using the Rank Operator to Rank Data") to specify the type of ranking to use. |
SampleMode | • BY_PERCENT • BY_SIZE | Used with the SampleRandomRows operator (see "Using the SampleRandomRows Operator to Sample Data") to specify the type of sampling to apply. |
StringConversion | • RAW • NULLABLE_RAW • TRIMMED • NULLABLE_TRIMMED | Enumerates the possible conversions for string-valued text types. |
UnreadableSourceAction | • IGNORE • WARN • FAIL | Specifies the behavior for handling data sources which are unreadable. |
WriteMode | • CREATE_NEW • OVERWRITE • APPEND | Used by file writer operators such as WriteDelimitedText, WriteARFF, and others to specify how to handle creating the output files for writing (see "Using the WriteDelimitedText Operator to Write Delimited Text" and "Using the WriteARFF Operator to Write Sparse Data"). |
Class Name | Description |
Aggregation | Provides methods for creating the aggregations to perform with the Group operator (see "Using the Group Operator to Compute Aggregations"). |
Arithmetic | Provides methods for creating functions that perform arithmetic operations. |
Conditionals | Provides methods for creating conditional functions. |
ConstantReference | Provides methods for creating functions that provide constant values. |
Conversions | Provides methods for creating functions that provide data type conversions. |
DateTime | Provides methods for creating functions that process date and timestamp data types. |
DateTimeValue | Provides values that are needed by the DateTime functions. |
FieldDerivation | Provides methods for creating field derivation specifications used by the DeriveFields operator (see "Using the DeriveFields Operator to Compute New Fields"). |
FieldReference | Provides methods for creating functions that access fields. |
Formatting | Provides methods for creating functions that provide data formatting functionality. |
Predicates | Provides methods for creating functions that express predicate conditions. |
ReplaceSpecification | Provides methods for creating missing value handling specifications used by the ReplaceMissingValues operator (see "Using the ReplaceMissingValues Operator to Replace Missing Values"). |
ScalarType | Defines the scalar types available within the DataFlow framework. |
SortKey | Provides methods for creating SortKey objects used with the Sort operator (see "Using the Sort Operator to Sort Data Sets"). |
Statistics | Provides methods for creating functions that compute common statistics. |
Strings | Provides methods for creating functions that provide common String utilities. |
Tolerance | Provides methods for creating floating-point tolerance types used with various assertion operators such as AssertEqual (see "Using the AssertEqual Operator to Assert Data Equality"). |