Name | Parameters | Supported Data Types | Remarks |
---|---|---|---|
Distinct Values Writes all the distinct (unique) values found for the filed into an output file | • Results File - Displays the distinct values output file. Default is within your project. However, you can change it. • Sort Order Select the required Sort Order from the following: – Field Data (Faster) – Frequency Count (Slower) Note: Frequency Count (Slower) sorts the results in descending order and hence the sorting is slower compared to Field Data sort option. | • String • Date • Timestamp • Boolean • Numeric • Double • Float • Long • Integer | The results file name must not contain the following characters: • / • \ • . • , • .. |
Duplicate Values Writes all duplicate values found in a data set into a file. | • Results File - Displays the duplicate values output file. Default is within your project. However, you can change it. • Sort by Select the Sort Order from the drop-down menu: – Field Data (Faster) – Frequency Count (Slower). Note: Frequency Count (Slower) sorts the results in descending order and hence the sorting is slower compared to Field Data sort option. • Min Count Type the required value. The value must not be lower than 2. The default Min Count is 2. | • String • Date • Timestamp • Boolean • Numeric • Double • Float • Long • Integer | The results file name must not contain the following characters: • / • \ • . • , • .. Also, this rule is included as a part of Single Statistics rule. Using Single statics rule includes Min, Max, Mean, Median, Mode, Standard Deviation, Sum, and Variance. |
Single Statistics | The following statistics are available if you select this rule: • Maximum - Calculates the maximum value of a field • Minimum - Calculates the minimum value of a field • Median - Calculates the median (middle of a sorted list) value of the field. • Mode - Calculates the mode (most frequently occurring value) of the field. • Standard Deviation - Estimates standard deviation, a measure of how widely the values are separated from the average value (the mean). • Sum - Calculate the sum of a field. • Variance - Indicates how possible values are spread around the expected value. Numeric datatypes include all the above rule. However, the string datatypes include only maximum, minimum, median, and mode. | • String • Date • Timestamp • Boolean • Numeric • Double • Float • Long • Integer | This rule allows you to include several summary rule at once instead of adding one rule at a time. |
Equal Range Binning Constructs a histogram and defines an equal range for each bin. | Top How Many: Calculates equal width ranges across the field values and counts the number of values in each range. RangeCount Specify the number of ranges to create. The default is 10. | All numeric types supported as input: • Numeric • Double • Float • Long • Integer | For more information about this rule, see Equal Range Binning. |
Maximum Calculates the maximum value of a field and supports String types as a separate rule. However, Numeric, Double, Float, Long, and Integer datatypes are included as part of the Statistics rule. | None | • String • Date • Timestamp • Boolean • Numeric • Double • Float • Long • Integer | |
Median Calculates the median (middle of a sorted list) value of the field. | None | • String • Date • Timestamp • Boolean • Numeric • Double • Float • Long • Integer | |
Minimum Calculates the minimum value of a field. | None | • String • Date • Timestamp • Boolean • Numeric • Double • Float • Long • Integer | |
Mode Calculates the mode (most frequently occurring value) of the field. | None | • String • Date • Timestamp • Boolean • Numeric • Double • Float • Long • Integer | |
Most Frequent Values Calculates the most frequent values of a field. | Type a value in the Top How Many text box for the number of frequent values. The default is 25, which returns the top 25 most frequent values. | • String • Date • Timestamp • Boolean • Numeric • Double • Float • Long • Integer | This rule replaces some of the functionality of the Distinct Values rule. |
Standard Deviation Estimates standard deviation, a measure of how widely values are dispersed from the average value (the mean). | None | • String • Date • Timestamp • Boolean • Numeric • Double • Float • Long • Integer | Standard deviation equals the square root of the variance. |
Sum Calculate the sum of a field. | None | All numeric types supported as input: • Numeric • Double • Float • Long • Integer | |
Variance Indicates how possible values are spread around the expected value. | None | All numeric types supported as input: • Numeric • Double • Float • Long • Integer | The square root of the variance equals the standard deviation. |
Rule Name | Parameter Option Name or Value | Supported Datatypes | Remarks |
---|---|---|---|
Compare to Constant After comparing a field value to a constant value, it returns true or false result. | Perform the following to select the rule parameters: 1. Select Invert Test check box if you want to invert a rule test or report the opposite results in your pass or fail file. 2. Select one of the following from the Operator drop-down list: • Equal • Lesser or Equal • Greater • Not Equal • Greater or Equal Lesser. 3. In Constant text box, type the constant value to compare. | • String • Date • Timestamp • Boolean • Numeric • Double • Float • Long • Integer | Following are the notes: • The constant must be of the same data type as the profiled field. • The equal operator is not recommended to use on double or float field comparisons. • Do not use quotation marks with string data. The following formats are supported for date and timestamp: • yyyy-mm-dd • yyyy-mm-dd hh:mm:ss (military time) This rule provides a method for inverting the results to pass and fail files. For more information about inverting rule, see Inverting Rule Test. |
Compare to Field After comparing a field value to another field, returns true or false result. | Perform the following to select the rule parameters: 1. Select Invert Test check box if you want to invert a rule test or report the opposite results in your pass or fail file. 2. Select one of the following from the Operator drop-down list: • Equal • Lesser or Equal • Greater • Not Equal • Greater or Equal • Lesser 3. Select the required field to compare from the Compare Field drop-drown list. | • String • Date • Timestamp • Boolean • Numeric • Double • Float • Long • Integer | The two fields must be of the same data type. The equal operator is not recommended for use on double or float field comparisons. This rule provides a method for inverting the results to pass and fail files. For more information, see Inverting Rule Test. |
Is Blank If a field is blank, then returns true. Else, it returns false. | Parameters are not available for this rule. | String | Counts null, white space, and empty strings as blank. |
Is Duplicate If a field is duplicated, then returns true. Else, returns false. | Parameters are not available for this rule. | • String • Date • Timestamp • Boolean • Numeric • Double • Float • Long • Integer | The results file name cannot contain the following characters: • / • \ • . • .. |
Is Not Blank If a field is not blank, then returns true. Else, it returns false. | Parameters are not available for this rule. | String | Counts null, white space, and empty strings as blank. |
Is Not Null If a field is not null, then returns true. Else, returns false. | Parameters are not available for this rule. | • String • Date • Timestamp • Boolean • Numeric • Double • Float • Long • Integer | |
Is Null If a field is null, then returns true. Else, returns false. | Parameters are not available for this rule. | • String • Date • Timestamp • Boolean • Numeric • Double • Float • Long • Integer | |
Matches Regex | To select the rule parameters: 1. Select Invert Test check box if you want to invert a rule test or report the opposite results in your pass or fail file. 2. Select from a list of prepopulated regular expressions or specify your own regular expression. The prepopulated list contains regular expressions for the following: • AlphaNumeric Strings • AnyString • ASCII characters only • Canada Postal Code • Credit Card Number • Date - mm/dd/yyyy format • Date - YYMMDD format • Days of Week • Digits Only • Email Address • ISBN_10 Number • Male or Female • Mr or Mrs • Printable Characters • Single Alphabetic character • Social Security Number • Time - hh:mm AMPM format • Time - hh:mm:ss format • UK Postal Code • "US Currency • "US Phone Number • "US Zipcode or Zip+4 • Vehicle Identification Number (VIN) AnyString is the default regular expression. For more information about regular expressions, see Regular Expression and Definitions. | String | Regex pattern rule is displayed in the Parameters column on the Profile tab and the Description column is displayed on the Result summary. DataConnect stores and uses regular expressions. You can edit the expressions if required. |
Name | Value | Description |
---|---|---|
AlphaNumeric | ^[a-zA-Z0-9]+$ | Matches strings that contain character and/or numeric data. |
AnyString | .* | Matches all strings. |
ASCII characters only Postal | ^([a-zA-Z0-9!\"#$%&',/:;<=>@_`|~ \\(\\)\\*\\+\\-\\.\\?\\[\\]\\\\\\^\\{\\}])*$") | Matches ASCII characters only postal. |
Canada Postal Code | ^([A-Za-z]\d[A-Za-z][\s-]?\d[A-Zaz]\ d) | Matches postal code for Canada. |
Credit Card Number | [1-9][0-9]{3} [0-9]{4} [0-9]{4} [0-9]{4} | Matches a credit card number. |
Date - mm/dd/yyyy format | ^(3[0-1]|2[0-9]|1[0-9]|0[1-9])[\s{1}|\/ |-] (Jan|JAN|Feb|FEB|Mar|MAR|Apr|A PR|May|MAY|Jun|JUN|Jul|JUL|Aug |AUG|Sep|SEP|Oct|OCT|Nov|NOV| Dec|DEC)[\s{1}|\/|-]\d{4}$ | Flexible date validator that matches most date formats. |
Date_YYMMDD | ^((\d{2}((0[13578]|1[02])(0[1- 9]|[12]\d|3[01])|(0[13456789]|1[012] )(0[1-9] |[12]\d|30)|02(0[1- 9]|1\d|2[0- 8])))|([02468][048]|[13579][26])022 9)$ | Date expression validator with format YYMMDD. |
Days of Week | ^(Sun|Mon|(T(ues|hurs))|Fri)(day|\.) ?$|Wed(\.|nesday)?$|Sat(\.|urday)? $|T ((ue?)|(hu?r?))\.?$ | Matches days of the week or their abbreviations. |
DigitsOnly | \d+ | Expression for matching one or more digits. |
EmailAddress | ^(([A-Za-z0-9]+_+)|([A-Za-z0-9]+\\- +)|([A-Za-z0-9]+\\.+)|([A-Za-z0-9] +\\++))*[A-Za-z0-9]+@((\\w+\\- +)|(\\w+\\.))*\\w{1,63}\\.[a-zAZ]{ 2,6}$ | Matches an email address. |
ISBN_10 Number | ISBN\x20(?=.{13}$)\d{1,5}([- ])\d{1,7}\1\d{1,6}\1(\d|X)$ | Matches format of an ISBN number. |
Male Or Female | M|F | Matches M or F character. |
Mr or Mrs | Mr|Mrs | Matches Mr or Mrs characters. |
Printable Characters | ^([a-zA-Z0- 9!@#$%^&amp;*()- _=+;:'&quot;|~`&lt;& gt;?/{}] {1,5})$ | Matches all printable characters. |
Single Alphabetic character | ^[a-zA-Z]$ | Matches a single alphabetic Character. |
Social Security Number | ^[0-9]{3}-?[0-9]{2}-?[0-9]{4}$ | Matches U.S. social security number. Dashes (-) are optional. |
Time - hh:mm AMPM format | ^ *(1[0-2]|[1-9]):[0-5][0-9] *(a|p|A|P)(m|M) *$ | Matches time format hh/mm AM. |
Time - hh:mm:ss format | (([0-1][0-9])|([2][0-3])):([0-5][0- 9]):([0-5][0-9]) | Matches time format hh:mm:ss. |
US Zipcode or Zip+4 | [0-9]{5}(-[0-9]{4})? | Matches a US zip code. |
Vehicle Identification Number (VIN) | ^(([a-h,A-H,j-n,J-N,p-z,P-Z,0-9]{9})([a-h,A-H,j-n,J-N,p,P,r-t,R-T,v-z,V-Z,0-9])([a -h,A-H,j-n,J-N,p-z,P-Z,0-9])(\d{6}))$ | Matches US Vehicle Identification Numbers (VINs). |
Name and Description | Parameters | Supported Data Types | Remarks |
---|---|---|---|
String to Date Converts a string input field value to a date value and populates an output field with the converted value. Also, when this rule is used in a profile, it generates a derived field. Hence, the field parameter is associated with this rule. For more information, see Derived Fields. | 1. Click the Field Name drop-down and select a field for the input. Note: A default name is provided. However, you can edit or overwrite. 2. Click the Metric Type drop-down and select String to Date. 3. In the Parameters section, a default value is provided for the derived field associated with this rule. However, you can edit or overwrite if required. 4. Click the Date Pattern drop-down and select the required pattern. options are available: • yyyy-M-dd • yyyy-MM-dd • MM/dd/yy • M/dd/yyyy • MM/dd/yyyy • M-dd-yyyy • MM-dd-yyyy • MMM-yy • MMM dd, yyyy Default is yyyy-MM-dd. Note: You can also specify your own pattern in the text box. You can use M for month, y for year, and d for date. The number of symbols vary based on the short or long form of the date part. | • String (input) • Date (output) | Provides pass and fail statistics and generates an output field of type Date. Supports custom date patterns. This rule uses the Java SimpleDateFormat class. If the number of pattern letters are more than two, then the year is interpreted literally. For example, using the pattern "MM/dd/yyyy", "01/11/12" parses to Jan. 11, 12 A.D. For more information, see Derived Fields. |
String to Double Converts a string input field to a double value populating an output field. | A default name is provided. However, you can edit or overwrite the name by clicking the Field text box. | • String (input) • Double (output) | Provides pass and fail statistics and generates a derived field of type Double. For more information, see Derived Fields. |
String to Float Converts a string input field to a float value populating an output field. | A default name is provided. However, you can edit or overwrite the name by clicking the Field text box. | • String (input) • Float (output) | Provides pass and fail statistics and generates a derived field of type Float. For more information, see Derived Fields. |
String to Integer Converts a string input field to an integer value populating an output field. | A default name is provided. However, you can edit or overwrite the name by clicking the Field text box. | • String (input) • Integer (output) | Provides pass and fail statistics and generates a derived field of type Integer. For more information, see Derived Fields. |
String to Long Converts a string input field to a long value populating an output field. | A default name is provided. However, you can edit or overwrite the name by clicking the Field text box. | • String (input) • Long (output) | Provides pass and fail statistics and generates a derived field of type Long. |
String to Numeric Converts a string input field to a numeric value in an output field. | A default name is provided. However, you can edit or overwrite the name by clicking the Field text box. For all supported special characters, see String to Numeric Pattern Parameter. | • String (input) • Numeric (output) | Provides pass and fail statistics and generates a derived field of type Numeric. Comma separated values are supported. Output value is not formatted to match the pattern specified. For this functionality, see Match Regex. This rule uses the Java text.DecimalFormat. |
String to Time Converts a string field to a time value populating an output field. | 1. Click the Field text box and provide a name for the input field. Note: A default name is provided. However, you can edit or overwrite if required. 2. Click the Time Pattern drop-down and select the required pattern. The following options are available: • HH:mm:ss • HH:mm • hh:mm:ss • hh:mm aa Default is hh:mm aa. | • String (input) • Time (output) | Provides pass and fail statistics and generates a derived field of type Time. For more information, see Derived Fields. |
String to TimeStamp Converts a string input field to a timestamp value in an output field. | 1. Click the Field text box and provide a name for the input field. Note: A default name is provided. However, you can edit or overwrite if required. 2. Click the TimeStamp Pattern drop-down and select the required pattern. The following options are available: • yyyy-MM-dd HH:mm:ss • MM/dd/yyyy HH:mm • MMM dd, yyyy HH:mm aa • EEE, dd MMM yyyy HH:mm:ss Z Default is yyyy-MM-dd HH:mm:ss. Note: - You can also specify your own pattern in the text box. You can use M for month, y for year, and d for date. The number of symbols vary based on the short or long form of the date part. - ISO 8601 format should be mentioned as yyyy-MM-dd'T'HH:mm:ss+hh:mm. T in this format must be enclosed within single quotes, 'T'. | • String (input) • TimeStamp (output) | Provides pass and fail statistics and generates a derived field of type TimeStamp. For more information, see Derived Fields. Supports custom timestamp patterns. This rule uses the Java SimpleDateFormat class. |
Name | Parameters | Supported Data Types |
---|---|---|
String Length Calculates the length of a string field populating a derived field. | A default name is provided. However, you can edit or overwrite the name by clicking the Field text box. String rule also produces a derived field. For more information, see Derived Fields. | • String (input) • Number (output) |
String Trim Trims whitespace from a string field populating a derived field. | A default name is provided. However, you can edit or overwrite the name by clicking the Field text box. String rule also produces a derived field. For more information, see Derived Fields. | • String (input) • String (output) |