Setting Data Profile Preferences
DataConnect Data Profile Preferences allow you to control whether to retain the source data order in targets and specify where to store the rules file. The Ruleset Files Location preference defines where rules are saved as a .ruleset file. This option:
Enables you to reuse saved rules across different profiles.
Specifies the default location for browse dialogs when loading or saving rules.
By centralizing your rules in one location, the Ruleset Files Location preference eliminates the need to recreate rules. You can export rules directly from a profile to create a .ruleset file, which can also be manually edited if needed.
Data Profile Alert Management Preferences let you configure key settings for the alert notification feature, including the threshold value, SMTP server details, and subscriber email addresses. These settings can be defined globally in the Preferences or individually within each Data Profile.
To apply the same settings to all new data profiles by default, configure them in the Preferences. If you want to customize alert settings for a specific profile, you can override the global settings by configuring them directly in that profile under Execution Options (see Configuring Execution Options).
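The relationship between global preferences and per-profile overrides can be illustrated with a minimal Python sketch. It is not DataConnect code: the `AlertSettings` class, its field names, and the `resolve` function are hypothetical. The sketch also assumes that an override applies setting by setting, which this guide does not confirm.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class AlertSettings:
    """Hypothetical container; field names are illustrative, not DataConnect identifiers."""
    threshold_dqi: Optional[float] = None
    smtp_host: Optional[str] = None
    subscribers: Optional[List[str]] = None


def resolve(global_prefs: AlertSettings, profile: AlertSettings) -> AlertSettings:
    """Use the profile's value where one is set, otherwise fall back to the global preference."""
    merged = {}
    for f in fields(AlertSettings):
        override = getattr(profile, f.name)
        merged[f.name] = override if override is not None else getattr(global_prefs, f.name)
    return AlertSettings(**merged)


# Global preferences apply to every new profile by default.
global_prefs = AlertSettings(
    threshold_dqi=80.0,
    smtp_host="smtp.example.com",
    subscribers=["quality-team@example.com"],
)

# This profile overrides only the threshold (assumed per-setting override).
profile = AlertSettings(threshold_dqi=95.0)

effective = resolve(global_prefs, profile)
print(effective)
```

In this sketch, the profile keeps the global SMTP host and subscriber list and changes only the threshold.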
To set the Data Profile preferences:
1. Go to Options > Preferences.
The Preferences dialog is displayed.
2. In the left pane, expand DataConnect and click Data Profiler.
3. Select or clear the Retain Source Data Order in Targets checkbox.
This option retains the source data order in the four preconfigured targets or output files (PASS_TARGET, FAIL_TARGET, DRILLDOWN_TARGET, STATS_TARGET). If this option is not selected, data is written to the targets in random order. This option is selected by default.
4. Specify the required Ruleset files location:
Ruleset Files Location - Click Browse to choose the location where the .ruleset file will be saved. This is the location where the Ruleset Export/Import feature will automatically search (see Export Ruleset and Import Ruleset). The default location is <userdirectory>/Actian/DataConnect/DP_Rulesets.
For information about general Browse button behavior and rules, see Browsing Files and Directories.
5. Specify the Data Discovery preferences:
Data Discovery On - Select this option to enable Field Data Discovery. When enabled, data discovery runs automatically when you connect to the source file during profile creation or open an existing profile. If this option is not selected, data discovery does not run automatically. However, you can trigger it manually: go to the Rules tab in the profile editor, select a source Field Name in the Field/Rule pane, and then click the Run button in the Data Discovery pane.
Note:  Data Discovery will always run in the new Data Profile wizard, regardless of the preference setting.
Discovery Sample Size - This dropdown defines the sample size of the source data used for Field Data Discovery. It is particularly useful with large datasets, because sampling can improve performance while you design a profile and configure profile rules. The default setting is 10,000 records, but you can change it to 1,000, 5,000, or 25,000 records.
6. Specify the Inspect and Recommend preferences:
Inspect and Recommend Sample Size - The sample size of the source data that is used for Inspection. This option is useful when working with very large datasets, which can impact performance. A smaller sample improves speed, while a larger sample provides more accurate rule matching across the full dataset. The default setting is 5,000 records, but you can change it to 1,000, 10,000, or 25,000 records.
Inspect and Recommend Efficiency Percentage - This dropdown sets the Efficiency Percentage, which determines when rules are recommended. A rule is recommended only if its pass percentage exceeds the defined efficiency threshold. The default value is 50%. Setting the efficiency to 30% is more lenient, so rules are recommended more often. Setting it to 60% makes rules less likely to be recommended, because a rule is considered only when its pass percentage exceeds 60%.
Inspect and Recommend Ruleset Location - Click Browse to choose the location where the .ruleset file will be saved. This is the location referenced by the Inspect and Recommend feature when inspecting rulesets (see Inspecting Data and Recommend Rules Wizard). The default location is <userdirectory>/Actian/DataConnect/DP_Rulesets.
For information about general Browse button behavior and rules, see Browsing Files and Directories.
7. Set the following Data Profile Alert Management Preferences:
Set the Threshold Data Quality Index. Enter a value between 0 and 100, for example, 80.0. When the Data Quality Index (DQI) falls below the threshold you specify, an email notification is sent to subscribers.
A value of 100 triggers notifications for any imperfection in data quality.
A value of 0 disables notifications.
Specify the SMTP server details by clicking the icon on the far right, above the subscriber's grid.
In the dialog box that appears, enter the following values:
User Name – A valid email address that is used as the FROM address for all alerts and as the user name to log in to the SMTP server.
Password – The password to log in to the SMTP server.
Host – The domain name or IP address of the SMTP server. For example, smtp.gmail.com.
Port Number – The SMTP Server port number.
Enable TLS – Select True to use the TLS protocol or False to use the SSL protocol.
Note:  After you specify the SMTP Server details, the note asking you to specify the SMTP server details will no longer be displayed.
Specify the Subscriber’s Email by clicking the icon on the far right, above the subscriber's grid:
Subscriber’s Email – Enter the email address for each person you wish to receive alert email notifications.
(Click the delete icon to delete an email address.)
Subscriber Choice – Select one:
Message Only – Select to only receive a message in the email.
Message and statistics report – Select to receive a message and a statistics.csv file that shows where the data quality problems are.
8. Click Apply or Apply and Close to save the changes.
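The two threshold settings in steps 6 and 7 can be summarized with a minimal Python sketch. It is not DataConnect code: the function names are hypothetical, the DQI is assumed to be on the same 0 to 100 scale as the threshold, and the behavior when a value exactly equals a threshold is an assumption based on the wording "exceeds" and "falls below".

```python
def should_recommend(pass_percentage: float, efficiency_percentage: float) -> bool:
    """Step 6: recommend a rule only if its pass percentage exceeds the efficiency threshold."""
    return pass_percentage > efficiency_percentage


def should_alert(dqi: float, threshold: float) -> bool:
    """Step 7: send a notification when the DQI falls below the threshold.

    Assumes the DQI ranges from 0 to 100. Under that assumption, a threshold
    of 0 can never be undercut (notifications are disabled), and a threshold
    of 100 alerts on any DQI below a perfect score.
    """
    if not 0.0 <= threshold <= 100.0:
        raise ValueError("threshold must be between 0 and 100")
    return dqi < threshold


# A rule that 55% of sampled records pass:
print(should_recommend(55.0, 50.0))  # default efficiency of 50%
print(should_recommend(55.0, 60.0))  # stricter efficiency of 60%

# A profile run with a DQI of 75.0 against the example threshold of 80.0:
print(should_alert(75.0, 80.0))
```

With these inputs, the rule is recommended at the default 50% efficiency but not at 60%, and the DQI of 75.0 triggers an alert against a threshold of 80.0.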
Last modified date: 09/22/2025