Inspect Data and Recommend Rules Wizard
You can use the Inspect Data and Recommend Rules wizard to run internal algorithms that analyze the source data and suggest rules based on the source schema and various data pattern matching tests. This method is recommended if you're unsure about potential issues in your source data.
Ruleset files can be included in the inspection process, although doing so may increase the overall inspection time. The ruleset file must be created beforehand (see Export Ruleset).
To use the Inspect Data and Recommend Rules wizard:
1. Click (Inspect Data and Recommend Rules) on the Rules tab.
2. A list of source fields is displayed along with their Data Types.
3. Select the source fields to include in the inspection (at least one field must be selected).
You can use the Search box to locate a specific field.
4. Click Next to display the applicable rule types for the selected fields.
The rule types are grouped into categories, such as profiling and remediation.
5. Select the rule types to include in the inspection (at least one rule type must be selected).
Only selected rule types will be considered for recommendation.
6. Click Next to proceed.
7. Additional inspection and recommendation options are shown. Select or enter the following:
Sample Size: The number of source records used for inspection. This option is useful when working with very large datasets, which can impact performance. A smaller sample improves speed, while a larger sample provides more accurate rule matching across the full dataset. The default is 5000 records; it can be changed to 1000, 10000, or 25000.
Efficiency Percentage: Determines when rules are recommended. A rule is recommended only if its pass percentage meets the efficiency threshold. You can set this value anywhere between 1 and 100; the default is 50%. A lower value such as 30% is more lenient, so rules are recommended more often. A higher value such as 60% is stricter: a rule is considered only if 60% of records pass its efficiency test. Efficiency tests vary by rule type. The following table shows how the efficiency percentage is calculated for each rule:
Rule                          Efficiency Percentage Calculation
IsNotBlank                    % of non-blank records
IsNotNull                     % of non-null records
CompareToConstant             % of records containing the most common value
ReplaceValue (Boolean only)   % of valid records (various formats of true/false)
IsNotDuplicate                % of unique records
ApplyTimezone                 N/A (recommended for all TimeStamp fields)
MatchesRegex                  % of records matching the most common regex
ChangeStringCase              % of records with the most common string case
RemoveChars                   % of records without removedChar (digits, whitespace, and special characters are each recommended in their own rule)
StringTrim                    % of records without leading/trailing whitespace
ChangeFormat                  % of convertible records (records with recognized formats that can be converted)
StandardizeFormat             % of convertible records (records with recognized formats that can be converted)
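As an illustration only, a few of the calculations in the table above can be sketched in Python. This is not the product's internal algorithm: the function names, the sample data, the treatment of None as blank, the reading of "unique" as "occurs exactly once", and the use of >= for the threshold comparison are all assumptions made for this sketch.

```python
from collections import Counter

def pct(count, total):
    """Return count as a percentage of total (0.0 for an empty sample)."""
    return 100.0 * count / total if total else 0.0

def is_not_null_pct(values):
    # IsNotNull: share of records that are not null (None here).
    return pct(sum(v is not None for v in values), len(values))

def is_not_blank_pct(values):
    # IsNotBlank: assumption - None and whitespace-only strings count as blank.
    return pct(sum(v is not None and str(v).strip() != "" for v in values),
               len(values))

def compare_to_constant_pct(values):
    # CompareToConstant: share of records holding the most common value.
    if not values:
        return 0.0
    _, count = Counter(values).most_common(1)[0]
    return pct(count, len(values))

def is_not_duplicate_pct(values):
    # IsNotDuplicate: assumption - "unique" means the value occurs exactly once.
    counts = Counter(values)
    return pct(sum(counts[v] == 1 for v in values), len(values))

def string_trim_pct(values):
    # StringTrim: share of string records without leading/trailing whitespace.
    strings = [v for v in values if isinstance(v, str)]
    return pct(sum(v == v.strip() for v in strings), len(strings))

def should_recommend(pass_pct, efficiency_pct=50):
    # Assumption: a rule qualifies when its pass percentage reaches the threshold.
    return pass_pct >= efficiency_pct

# Hypothetical sample of a "City" field.
cities = ["Austin", "Austin", " Dallas", "", None, "Austin", "Houston", "Austin"]

for name, fn in [("IsNotNull", is_not_null_pct),
                 ("IsNotBlank", is_not_blank_pct),
                 ("CompareToConstant", compare_to_constant_pct),
                 ("IsNotDuplicate", is_not_duplicate_pct),
                 ("StringTrim", string_trim_pct)]:
    score = fn(cities)
    print(f"City_{name}: {score:.1f}% recommend={should_recommend(score)}")
```

With the default threshold of 50, every rule in this sample would qualify; raising the threshold to 60 would drop the CompareToConstant and IsNotDuplicate candidates, which both score 50% here.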
Enable Ruleset Inspection: (Optional) Select this to enable inspection of rules from a ruleset file.
Ruleset File Location: (If applicable) Click Browse and select the location of the .ruleset file.
When you enable Ruleset Inspection, you can select a directory containing available rulesets. On the next page, you'll be able to choose specific rulesets within that directory for inspection. The system will then analyze the selected rulesets using the "Inspect and Recommend" feature.
During inspection, ruleset rules are categorized into three types for recommendation purposes: Inspection Supported Profiling Rules, Other Inspection Supported Rules, and Unsupported Rules. Profiling rules generate a pass/fail percentage during profiling and are recommended if this percentage meets the defined efficiency criteria, regardless of whether they are supported by Inspect and Recommend. Non-profiling, inspection-supported rules must be selected on the MetricType (Rule Type) selection page and are recommended if their pass percentage (based on Data Discovery stats) meets the criteria. Unsupported rules, which cannot be generated by Inspect and Recommend or selected through the MetricType page, are automatically recommended when a matching field is found.
The default browse location is what has been set in the Ruleset Files Location preferences (see Setting Data Profile Preferences). This location setting is used when loading or saving ruleset files through browse dialogs.
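The recommendation logic for the three ruleset rule categories described above can be summarized in a short Python sketch. The class names, fields, and the is_recommended function are hypothetical and exist only for this illustration; the sketch also assumes that a matching source field is required for every category and that "meets the criteria" means greater than or equal to the Efficiency Percentage.

```python
from dataclasses import dataclass
from enum import Enum

class Category(Enum):
    PROFILING = "Inspection Supported Profiling Rules"
    OTHER_SUPPORTED = "Other Inspection Supported Rules"
    UNSUPPORTED = "Unsupported Rules"

@dataclass
class RulesetRule:
    name: str
    rule_type: str
    category: Category
    field_matched: bool      # a matching source field was found
    pass_pct: float = 0.0    # pass percentage, where one is available

def is_recommended(rule, selected_rule_types, efficiency_pct=50):
    """Decide whether a ruleset rule would be recommended (illustrative only)."""
    if not rule.field_matched:
        # Assumption: no category is recommended without a matching field.
        return False
    if rule.category is Category.UNSUPPORTED:
        # Unsupported rules are recommended automatically on a field match.
        return True
    if (rule.category is Category.OTHER_SUPPORTED
            and rule.rule_type not in selected_rule_types):
        # Non-profiling rules must be selected on the rule type page.
        return False
    # Profiling rules and selected non-profiling rules use the threshold.
    return rule.pass_pct >= efficiency_pct

selected = {"StringTrim"}
rules = [
    RulesetRule("MyRuleset_City_IsNotBlank", "IsNotBlank",
                Category.PROFILING, True, 72.0),
    RulesetRule("MyRuleset_City_StringTrim", "StringTrim",
                Category.OTHER_SUPPORTED, True, 90.0),
    RulesetRule("MyRuleset_City_Custom", "Custom",
                Category.UNSUPPORTED, True),
]
for rule in rules:
    print(rule.name, is_recommended(rule, selected))
```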
8. Click Next, then click Start Inspection.
The profile runs during Inspect and Recommend. Internal algorithms inspect the data and add rules based on the source schema and the efficiency percentage test. A matching algorithm compares the source dataset against the Data Types and Field Name Patterns of the rules in the selected ruleset file. This matching ensures that only valid rules are applied to the corresponding fields in the source dataset. The recommended rules are ordered from highest to lowest efficiency percentage.
A default rule name is provided in one of the following formats:
<FieldName>_<RuleType>, for example, City_IsNotBlank (when recommending rules from the sample source data)
ruleset_<FieldName>_<RuleType>, for example, MyRuleset_City_IsNotBlank (when recommending rules from a ruleset file)
Note:  Rule names cannot begin with a digit. If a field or column in the source data starts with a digit, 'r_' will be prepended to any rules created based on that field. This prefix will also be added to any rule name that does not start with an alphabetic character.
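The default naming convention and the 'r_' prefix can be sketched as follows. The function default_rule_name is hypothetical and is not part of the product. It assumes that the ruleset portion of the name is the ruleset's own name (as in the MyRuleset example) and that the prefix check is applied to the first character of the finished name.

```python
def default_rule_name(field_name, rule_type, ruleset_name=None):
    """Build a default rule name following the formats described above."""
    parts = [field_name, rule_type]
    if ruleset_name:
        # Rules recommended from a ruleset file carry the ruleset name first.
        parts.insert(0, ruleset_name)
    name = "_".join(parts)
    # Names must start with an alphabetic character; otherwise prepend 'r_'.
    # Note: str.isalpha() also accepts non-ASCII letters (an assumption here).
    if not name[:1].isalpha():
        name = "r_" + name
    return name

print(default_rule_name("City", "IsNotBlank"))
print(default_rule_name("City", "IsNotBlank", ruleset_name="MyRuleset"))
print(default_rule_name("2ndAddress", "IsNotNull"))
```

The three calls produce City_IsNotBlank, MyRuleset_City_IsNotBlank, and r_2ndAddress_IsNotNull respectively.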
9. Select the rules you want to keep and click Finish.
The Inspect Data and Recommend Rules wizard closes, and the selected rules are added to the existing rules of the profile, with duplicates automatically ignored.
Note:  If the profile already contains a rule (one of the four listed commonly used rules) that matches a rule returned by the inspection, the existing rule is kept and the recommended rule is ignored. Any field missing from the existing rule is added to it.
The following rules have been implemented for the inspect and recommend feature:
 
Last modified date: 09/22/2025