Overview of Design Environment
Use the Design environment to establish a connection; create, test, schedule, and execute data profiles; and view profile execution results.
To establish a connection, you can choose from a list of connectors (see Source and Target Connections), create a new connection (see Creating a Connection), or use an existing one (see Define Source).
To create a data profile, you define a source (see Define Source), create data quality rules (see Define Rules and Analyze Results), and define a target (see Define Targets). Data isn’t persisted until the target tables are defined.
The data quality rules can be configured to enforce specific conditions or values. When a rule is applied to a specific field in a profile, the data in that field is evaluated against the condition or value within the rule. After processing is complete, each value from the source field is categorized as either valid or invalid. Valid data is written to the Pass target, while invalid data is written to the Fail target.
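The pass/fail routing described above can be sketched in a few lines of Python. This is only an illustration of the concept, not the product's implementation: the `apply_rule` function, the `is_five_digit_zip` rule, and the sample values are all hypothetical.

```python
from typing import Callable, Iterable, List, Tuple


def apply_rule(
    values: Iterable[str], rule: Callable[[str], bool]
) -> Tuple[List[str], List[str]]:
    """Evaluate every value against one rule and split the results.

    Returns (pass_target, fail_target). Both names are illustrative
    stand-ins for the Pass and Fail targets.
    """
    pass_target: List[str] = []
    fail_target: List[str] = []
    for value in values:
        # Valid values go to the Pass list, invalid values to the Fail list.
        (pass_target if rule(value) else fail_target).append(value)
    return pass_target, fail_target


def is_five_digit_zip(value: str) -> bool:
    """Hypothetical rule: the field must hold exactly five digits."""
    return len(value) == 5 and value.isdigit()


pass_target, fail_target = apply_rule(
    ["30301", "9021", "ABCDE", "10001"], is_five_digit_zip
)
print(pass_target)  # values that satisfied the rule
print(fail_target)  # values that violated the rule
```

Every source value lands in exactly one of the two lists, which mirrors how each value is categorized as either valid or invalid.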
Rules can be added to a profile manually (see Add Rule from Rule Set) or by using the Inspect & Recommend Rules option.
During profile design, profiles can be executed manually (see Run a Profile Manually) or set to execute after any rule change (adding, removing, editing, disabling, or enabling a rule in the profile). Consider executing a profile manually if the source dataset is large and requires significant processing time. If the source dataset is smaller, consider enabling the Run profile after every rule update option so that results and insights update automatically.
You can also test and adjust rules using a subset of your source data, or “Sample Size”. This can be helpful when your source dataset is large and you want to shorten processing time (see Define Rules and Analyze Results). Once the profile design has been tested and validated, the target tables can be defined in the Targets page. Executing a profile from the Targets page runs the profile rules against the entire dataset (rather than a Sample Size). The profile can be revised if desired (see Edit a Profile).
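The difference between a sampled test run and a full run can be illustrated as follows. This is a minimal sketch under stated assumptions: `run_profile` and its `sample_size` parameter are hypothetical names, and taking the first N rows is an assumed sampling strategy, since this page does not say how the sample is selected.

```python
from itertools import islice
from typing import Callable, Dict, Iterable, Optional


def run_profile(
    rows: Iterable[str],
    rule: Callable[[str], bool],
    sample_size: Optional[int] = None,
) -> Dict[str, int]:
    """Count pass/fail results for a rule.

    With sample_size set, only the first N rows are evaluated (an assumed
    sampling strategy). With sample_size=None, the whole dataset is used.
    """
    subset = rows if sample_size is None else islice(rows, sample_size)
    counts = {"pass": 0, "fail": 0}
    for value in subset:
        counts["pass" if rule(value) else "fail"] += 1
    return counts


emails = [
    "a@example.com",
    "not-an-email",
    "b@example.com",
    "c@example.com",
    "missing-at-sign",
]


def has_at_sign(value: str) -> bool:
    """Hypothetical rule: the value must contain an @ character."""
    return "@" in value


sample_counts = run_profile(emails, has_at_sign, sample_size=2)  # quick test run
full_counts = run_profile(emails, has_at_sign)  # entire dataset
print(sample_counts, full_counts)
```

The sampled run is faster but only approximates the full result, which is why a final run against the entire dataset follows once the rules are validated.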
Profile execution results, such as Pass/Fail counts and execution duration, are contained in jobs. Each time a profile is executed, a job is created. Jobs cannot be edited. Aggregated job results for profiles are shown in the Run History for all Profiles page. Job results for a single profile are shown in the Profile Details page (see View Profile Details).
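One way to picture a job is as a read-only record appended to a run history on every execution. The sketch below assumes hypothetical names (`Job`, `execute_profile`, `run_history`) and a hypothetical age rule; it models the behavior described here and is not the product's data model.

```python
import time
from dataclasses import FrozenInstanceError, dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)  # frozen: fields cannot be changed after creation
class Job:
    profile_name: str
    pass_count: int
    fail_count: int
    duration_seconds: float


def execute_profile(
    profile_name: str,
    values: Sequence[int],
    rule: Callable[[int], bool],
    history: List[Job],
) -> Job:
    """Run one rule over the values and record the outcome as a new job."""
    start = time.perf_counter()
    pass_count = sum(1 for value in values if rule(value))
    fail_count = len(values) - pass_count
    job = Job(profile_name, pass_count, fail_count, time.perf_counter() - start)
    history.append(job)  # every execution adds exactly one job
    return job


def is_non_negative(age: int) -> bool:
    """Hypothetical rule: an age must not be negative."""
    return age >= 0


run_history: List[Job] = []
first = execute_profile("customer_ages", [25, -3, 41], is_non_negative, run_history)
second = execute_profile("customer_ages", [25, 41], is_non_negative, run_history)

try:
    first.pass_count = 99  # attempting to edit a job
except FrozenInstanceError:
    print("Jobs cannot be edited")
```

Keeping each job immutable means the run history stays a faithful record of what each execution produced.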
Optionally, you can schedule a profile to execute at regular intervals. When a profile is scheduled to execute, a configuration is automatically created for the profile. The configuration specifies when and where the profile executes. You can then manage and edit the configuration associated with the profile (see Managing Configurations), and monitor execution results in the Manage environment (see Run History for all Configurations and Run History for a Single Configuration). You can also interact with trend graphs, which trace job results over a selected period, in the Manage, Overview Page.
Note: Edits made to a profile after the configuration is created are not updated in the configuration. To update the configuration, recreate the profile.
Summary of what users can do in the Design environment:
• Execute a profile. There are three ways to execute a profile. See:
The Creating a Data Profile page (see below) opens by default in the Design environment. You can edit, schedule, duplicate, execute, and delete profiles from this page.
Last modified date: 10/30/2024