Creating a Profile Using Data Profile Wizard
The New Data Profile Wizard lets you create a data profile by defining a source, specifying testing rules, selecting a field to assess, and, optionally, defining the targets where the pass and fail information is stored. The targets are pre-configured, so you define them only if you want to change the pre-configured target locations. The wizard pages include:
1. New Data Profile page – Select a project and enter a name for the new data profile file.
2. Define Source page – Provide the source connection details.
3. Create Data Quality Rules page – Specify testing rules and select a field to test.
4. Configure Targets page – All targets are pre-configured, but you can define the pass and fail target connection details if you want to change the pre-configured target locations.
5. Data Profile Summary page – Summarizes the new data profile that you are about to create.
Use the Back and Next buttons at the bottom of each page to move between the wizard pages. Moving between pages does not clear the data you have entered. When a button cannot be used, it is dimmed and inactive. You can click Finish at any time to exit and save what you have configured; you can later edit the data profile file in the Data Profile Editor. To exit at any time without saving any information, click Cancel.
Note: You can also create a profile without using the New Data Profile Wizard. For more information, see Creating a Profile Without Using Wizard.
To create a data profile using the New Data Profile Wizard:
1. Select a DataConnect project and do any of the following:
Go to File > New > Data Profile.
Click the arrow on the New icon in the Project Explorer and then click Data Profile.
Right-click the project and then click New > Data Profile.
The New Data Profile Wizard is displayed.
2. Select the project in which you want to create the data profile file.
3. In the Profile File Name field, type a name for the new data profile, and click Next.
The Define Source page is displayed.
4. In the Source Connection section, do one of the following:
From the Choose Connector dropdown list, select a source connector.
In Or Connection, click Browse and select a saved (or existing) User Defined Connection (see Saving and Reusing a Connection).
The connector parts are displayed. Also, the selected connector’s properties are displayed on the right.
5. Specify the Source Connection information. For information about the selected connector and its properties, see Map Connectors. For information about the source connection, see Specifying Source Connection for Data Profile.
6. Select or clear the Retain Source Data Order in Targets checkbox.
This option retains the source data order in the four pre-configured targets or output files (PASS_TARGET, FAIL_TARGET, DRILLDOWN_TARGET, STATS_TARGET). If the option is cleared, data is written to the targets in random order. By default, the option is selected.
7. Click Next.
The Create Data Quality Rules page is displayed.
8. Define a rule. You have the following options:
Select the default blank rule, Rule_1, and then configure the rule from the Commonly Used Rules list. See Selecting from Commonly Used Rules.
Click the Inspect Data and Auto Add Rules icon to use internal algorithms that inspect the source data and recommend rules based on knowledge of the source schema and various data pattern matching tests. See Inspecting Data and Auto Adding Rules using Wizard.
9. Click Next.
The Configure Targets page is displayed.
10. Select a target to view or configure the target connection information.
There are four pre-configured targets or output files (however, you can edit and change the files if required):
PASS_TARGET: This is the generated clean file. It has the same format as the input file and contains the records from the source dataset that pass the criteria specified in the Data Profile rules. The output can be written to a file or a JDBC table.
FAIL_TARGET: This is the generated dirty file. It contains the records from the source dataset that do not pass one or more Data Profile rules. The output can be written to a file or a JDBC table.
DRILLDOWN_TARGET: This file is used to create the statistics and charts on the Statistics tab. You can browse this file in the editor to see the relationship between the FAIL_TARGET records and the specified rule.
STATS_TARGET: This file is used to visualize the PASS_TARGET and FAIL_TARGET data in the Statistics tab. These charts display the number and percentage of records that passed or failed the specified rule.
Note:  For more information about output files, see Viewing Pass, Fail, and Drill Down Output.
11. In the Target Connection section, do one of the following (only if you want to change the pre-configured target locations):
From the Choose Connector dropdown list, select a target connector.
In Or Connection, click Browse and select an existing target connection file.
The connector parts are displayed. Also, the selected connector’s properties are displayed on the right.
12. Specify the Target Connection information. For information about the selected connector and its properties, see Map Connectors. For information about the target connection, see Specifying Target Connection for Data Profile.
13. Click Connect and then click Next.
14. Review the Data Profile Summary page and then click Finish.
The New Data Profile Wizard closes. The data profile file opens in the Data Profile Editor and displays the configured information.
After the profile is created, it is saved within the specified project and can be opened and edited in the Data Profile Editor. Data profile artifacts have a .dp file extension. For information about validating and running a data profile, see Validating and Running Profile.
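To make the roles of the four pre-configured targets more concrete, the following Python sketch imitates how records might be divided among them. It is an illustration only and is not how DataConnect is implemented: the function name split_by_rules, the rule names, the sample records, and the layout of the drill-down and statistics rows are all assumptions made for this example. It assumes that a record goes to the pass output only when it satisfies every rule, that a record failing one or more rules goes to the fail output, and that source order is retained.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]
Rule = Callable[[Record], bool]


def split_by_rules(
    records: List[Record], rules: Dict[str, Rule]
) -> Tuple[List[Record], List[Record], List[dict], Dict[str, dict]]:
    """Hypothetical helper: divide records into pass, fail, drill-down, and stats.

    Records are processed in source order, so each output list keeps that order.
    """
    passed: List[Record] = []
    failed: List[Record] = []
    drilldown: List[dict] = []  # one row per (failed record, broken rule) pair

    for row, record in enumerate(records):
        broken = [name for name, rule in rules.items() if not rule(record)]
        if broken:
            failed.append(record)
            drilldown.extend({"row": row, "rule": name} for name in broken)
        else:
            passed.append(record)

    # Per-rule counts and pass percentage, similar in spirit to the charts
    # described for the Statistics tab.
    total = len(records)
    fail_counts = Counter(entry["rule"] for entry in drilldown)
    stats = {}
    for name in rules:
        n_failed = fail_counts[name]
        n_passed = total - n_failed
        stats[name] = {
            "passed": n_passed,
            "failed": n_failed,
            "pass_percent": 100.0 * n_passed / total if total else 0.0,
        }
    return passed, failed, drilldown, stats


# Sample data and rules (invented for this illustration).
records = [
    {"id": "1", "email": "a@example.com"},
    {"id": "", "email": "b@example.com"},
    {"id": "3", "email": "not-an-email"},
    {"id": "4", "email": "d@example.com"},
]
rules = {
    "id_not_empty": lambda r: r["id"] != "",
    "email_has_at": lambda r: "@" in r["email"],
}

passed, failed, drilldown, stats = split_by_rules(records, rules)
print("pass:", [r["id"] for r in passed])
print("fail:", [r["id"] for r in failed])
print("drill-down:", drilldown)
print("stats:", stats)
```

In this sketch the drill-down rows play the part of the link between each failed record and the rule it broke, and the statistics are derived from those rows rather than computed separately.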
Last modified date: 08/04/2024