 
Schemas
A schema describes the fields of a record, specifically their names and data types. A format always has a schema, but the converse is not true: a schema alone does not contain enough information to describe a format.
Schemas are used to map record values in files to record values in DataFlow. Once a format has identified the bytes or characters comprising a field, the schema is used to convert those into a value which can be processed by other operators.
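The division of labor between format and schema can be illustrated with a minimal sketch. This is plain Python, not the DataFlow API; `SCHEMA`, `split_fields`, and `apply_schema` are hypothetical names chosen for illustration, and a comma-delimited layout is assumed.

```python
from datetime import date

# Hypothetical schema: ordered (field name, text-to-value converter) pairs.
# Illustration only; this is not how DataFlow defines schemas.
SCHEMA = [
    ("id", int),
    ("name", str),
    ("joined", date.fromisoformat),
]

def split_fields(line, delimiter=","):
    """Stand-in for the format: identify the characters of each field."""
    return line.split(delimiter)

def apply_schema(raw_fields, schema=SCHEMA):
    """Stand-in for the schema: convert each field's text into a typed value."""
    if len(raw_fields) != len(schema):
        raise ValueError("field count does not match schema")
    return {name: convert(text)
            for (name, convert), text in zip(schema, raw_fields)}

record = apply_schema(split_fields("42,Ada,2020-01-31"))
```

Here the splitting step knows nothing about data types, and the conversion step knows nothing about delimiters, which mirrors the separation described above.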
DataFlow currently only supports text-based schemas. Text schemas can be fixed- or variable-width; the format determines which of the two is appropriate.
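As a rough illustration of the two kinds of text schema, the sketch below reads the same record from a delimited line and from a fixed-width line. It is plain Python under stated assumptions, not the DataFlow API: the schema structures, the `|` delimiter, the field widths, and the function names are all hypothetical.

```python
# Hypothetical variable-width schema: ordered (name, converter) pairs.
VARIABLE_SCHEMA = [("id", int), ("name", str), ("score", float)]

# Hypothetical fixed-width schema: ordered (name, width, converter) triples.
FIXED_SCHEMA = [("id", 3, int), ("name", 11, str), ("score", 4, float)]

def read_delimited(line, schema, delimiter="|"):
    """A delimited format pairs with a variable-width schema."""
    texts = line.split(delimiter)
    return [convert(text) for (_, convert), text in zip(schema, texts)]

def read_fixed(line, schema):
    """A fixed-width format pairs with a schema that carries field widths."""
    values, start = [], 0
    for _, width, convert in schema:
        # Slice out exactly `width` characters, then drop the padding.
        values.append(convert(line[start:start + width].strip()))
        start += width
    return values

delimited_line = "7|Grace|91.5"
fixed_line = "007" + "Grace".ljust(11) + "91.5"

from_delimited = read_delimited(delimited_line, VARIABLE_SCHEMA)
from_fixed = read_fixed(fixed_line, FIXED_SCHEMA)
```

Both calls yield the same typed values; only the way field boundaries are found differs, which is why the choice between the two depends on the format.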
Schemas are similar to the record token types described in Record Token Types. Both define an ordered mapping between field names and data types, with each mapping identifying a single field in the record. Just as a format has an associated schema, a schema has an associated record token type. The difference between the two is that the data types in a schema are external types, not token types.
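The relationship between a schema and its record token type can be sketched as follows. This is a hypothetical Python model, not the DataFlow API: `TextType`, the token type names (`"int"`, `"date"`, `"string"`), and `record_token_type` are placeholders for illustration.

```python
from collections import namedtuple

# Hypothetical external type: describes how a value appears in the file
# and which token type it corresponds to once read.
TextType = namedtuple("TextType", ["token_type", "representation"])

# Ordered schema: each entry maps one field name to an external type.
schema = [
    ("id", TextType("int", "decimal digits")),
    ("joined", TextType("date", "text in yyyy-MM-dd form")),
    ("name", TextType("string", "raw text")),
]

def record_token_type(schema):
    """Derive the ordered (field name, token type) pairs for a schema."""
    return [(name, text_type.token_type) for name, text_type in schema]

token_type = record_token_type(schema)
```

The derived list keeps the field names and their order but drops the file-level representation, which is the part that distinguishes an external type from a token type in this sketch.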