Running Profile
After validating the profile, you can run it. Click the run icon to run the profile.
Execution details are visible in the Logs tab. To open the log file in an external editor, click the external editor icon.
If the run is successful, the Results tab displays the status message for each event as it executes. This information can be used for monitoring purposes. The following information is displayed:
Time: The time stamp for when the message was logged.
Step Name: The name of the data flow operator for which the statistics are provided. The name is the full path to the actual operator processing the data in the dataflow. One common pattern is a name ending in ".input", which always indicates the number of records received by a particular operator for processing. The leftmost part of the name is the parent operator, and the rightmost part is the last child in the hierarchy, which does the actual work. The most common parent operator names are:
DeriveFields - Derives fields into other fields, for example StringToConversion or any internal conversion done by the engine to achieve a use case.
DataQualityAnalyzer - Performs quality tests using the rules configured by the user. For example, for IsNotNull the quality test checks whether the field is not null and calculates the pass/fail percentages.
PassTargetWriter - Writer instance for writing data to the pass target. This appears as PassTargetWriter.WriteSink in the case of text files.
FailTargetWriter - Writer instance for writing data to the fail target. This appears as FailTargetWriter.WriteSink in the case of text files.
DrilldownTargetWriter - Writer instance for writing data to the drilldown target. DrilldownTarget.WriteSink reports the number of bytes written to the drilldown file.
Status: Either Executing or Completed, based on the state of the operator. If an error occurs or profiling is aborted, the status is Error.
Error Code: If an error occurs or the profile is aborted, this column is populated with the error code.
Message: A message displaying statistics about the operator's progress. Only five types of messages are displayed:
Execution Started - Execution has started
<number> bytes read (in case of delimited text/fixed text on the source side)
<number> records read - shown for all operators configured in the dataflow graph except the source; the operator name in this case ends with ".input"
<number> bytes written (in case of delimited text/fixed text on the target side and for drilldown)
Execution ended in error (in case of an error or profiling aborted)
Profile Execution Successfully Completed, ...
Note:  The progress results are only available when the profile is run from the UI; runs executed from the invoker or the command line do not display these messages.
Note:  For smaller datasets, the second and subsequent runs of the profile may show only the start and end messages.
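For consumers that capture the Logs/Results output, the five documented message types can be classified mechanically. The following is an illustrative sketch only, assuming the exact message wording shown above; parse_message is a made-up helper, not part of the product:

```python
import re

# Patterns for the three count-bearing message types documented above.
PATTERNS = {
    "bytes_read": re.compile(r"^(\d+) bytes read$"),
    "records_read": re.compile(r"^(\d+) records read$"),
    "bytes_written": re.compile(r"^(\d+) bytes written$"),
}

def parse_message(message):
    """Classify a Results-tab message and extract its count, if any."""
    if message == "Execution Started":
        return ("started", None)
    if message == "Execution ended in error":
        return ("error", None)
    if message.startswith("Profile Execution Successfully Completed"):
        return ("completed", None)
    for kind, pattern in PATTERNS.items():
        m = pattern.match(message)
        if m:
            return (kind, int(m.group(1)))
    return ("unknown", None)

print(parse_message("1024 bytes read"))    # ('bytes_read', 1024)
print(parse_message("Execution Started"))  # ('started', None)
```

Counts parsed this way can be used, for example, to track how many records each ".input" operator received.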
 
Viewing Statistics
The Statistics tab displays profile results strictly for test rules. Test rules are Assert, CompareToConstant, FuzzyMatch, IsNotBlank, IsNotDuplicate, IsNotNull, InRange, MatchesRegex, RemoveDuplicates and RemoveDuplicatesFuzzyMatching.
The Statistics tab automatically updates results and opens after profile execution.
If dimensions are configured for the profile (that is, if any test rule in the profile is associated with a dimension), rule weights are applied to dimension scores. Weights and scores are displayed in a tree format, and the Data Quality Index (DQI) is the average of all dimension scores. For information, see Rules Tab and How Data Quality Dimension Scores are Calculated.
If dimensions are not configured, rule weights are not applicable. Results are displayed in a flat grid structure, and the DQI score is simply the percentage of records that passed all test rules.
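The two DQI modes described above can be sketched as follows. This is an illustrative sketch only; the function names are made up, and with dimensions configured each dimension score is assumed to be already weight-adjusted, as described above:

```python
def dqi_with_dimensions(dimension_scores):
    """DQI when dimensions are configured: the average of all
    (weight-adjusted) dimension scores."""
    return sum(dimension_scores) / len(dimension_scores)

def dqi_without_dimensions(records_passed, records_total):
    """DQI when no dimensions are configured: the percentage of
    records that passed all test rules (weights not applied)."""
    return 100.0 * records_passed / records_total

print(dqi_with_dimensions([90.0, 80.0, 100.0]))  # 90.0
print(dqi_without_dimensions(950, 1000))         # 95.0
```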
The Results Summary pane (on the left) provides summary metrics (in the first few rows) and a list of test rule execution results. The pane to the right displays pass/fail results in a graph. Dimensions are also displayed if at least one rule is associated with a dimension. By default, the Counts Total value is shown.
Select any row to view the results in the graph. Click a pass/fail portion of the graph to view records in the Drill Down Data tab (at the bottom of the screen).
The following information is provided:
Option
Description
Number of Dimensions
The number of dimensions configured in the profile (a dimension with at least one associated rule). If no rules are associated with any dimension, this value is zero (0). For information, see Rules Tab.
Rules
The number of rules executed (count includes rules associated and not associated with a dimension).
Data Quality Index
The Data Quality Index (DQI) score for all executed test rules. If dimensions are configured, the score is equal to the average of all dimension scores (rule weights are figured into the score). If dimensions are not configured, this value is the percentage of records that passed rules (rule weights are not figured into the score).
Rules that reside in -No Dimension- are not reflected in this value.
For information, see Rules Tab.
Field/Dimension
Displays the following values:
 
***Counts – Total*** - The aggregate pass/fail count and percentages for all executed rules. Overall values are shown in a pie chart in the right pane.
This value is displayed under Aggregate Result if any dimension is configured. Rules in -No Dimension- are reflected in this value.
 
***Rule – BarCharts*** - The aggregate pass/fail percentages for all executed rules. Values for each rule are shown in a bar graph in the right pane.
This value is displayed under Aggregate Result if any dimension is configured. Rules in -No Dimension- are reflected in this value.
 
***All Dimension Results*** - The aggregate pass/fail percentages for all dimensions. Values for each dimension are shown in a bar graph in the right pane.
This value is displayed under Aggregate Result if any dimension is configured. Rules in -No Dimension- are not reflected in this value.
 
If any dimension is configured, the names of the six dimensions are displayed as folders in a tree format. Each dimension represents a characteristic of quality data. Expand a dimension to view the rule(s) associated with it and the dimension score (as well as pass/fail results and weight(s) for associated rules). The dimensions are:
Accuracy - The data is correct.
Completeness - The data is present.
Consistency - The data uses the same format or pattern across different sources.
Timeliness - The data is recent and available.
Uniqueness - The data is not duplicated.
Validity - The data conforms to business rules and is within an acceptable range.
 
-No Dimension- - Expand the folder to view a list of all rules not associated with a dimension and their execution results. Execution results for rules listed here are not reflected in the DQI score.
Rule
The name of the rule, which follows the <FieldName>_<RuleType> format. For example, City_IsNotBlank.
Description
The percentage pass/fail results for a rule, or for an Aggregate Result such as ***Counts – Total***.
Dimension Score/Weight
If dimensions are not configured (that is, no test rules in the profile are associated with a dimension), these values are not displayed.
If any dimension is configured, the following are displayed:
Dimension Score: Indicates the quality level of data in the dimension. The score is derived by aggregating the individual pass/fail results and the weights of the rules in the dimension. Scores are 1-100, where 100 is the highest quality.
Weight: The user-assigned importance of the rule. Values are 1-5, where 5 is the most important. The default is 1. Rule weights are applied when at least one dimension is configured. The weight of a rule is included in the Dimension Score calculation for the dimension the rule is associated with.
If no weights are assigned, all rules are equally weighted in the final scores (using the default weight of 1).
When no dimensions are configured, rule weights are ignored.
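The text above says a dimension score aggregates per-rule pass/fail results with rule weights (1-5, default 1) but does not spell out the exact formula; a weighted average of rule pass percentages is one plausible reading, sketched below purely for illustration (dimension_score is a made-up helper, not the product's documented calculation):

```python
def dimension_score(rules):
    """Weighted average of rule pass percentages.

    rules: list of (pass_pct, weight) pairs, one per rule in the
    dimension; weight is the user-assigned 1-5 importance (default 1).
    """
    total_weight = sum(weight for _, weight in rules)
    return sum(pass_pct * weight for pass_pct, weight in rules) / total_weight

# With default weights of 1, all rules count equally:
print(dimension_score([(100.0, 1), (50.0, 1)]))  # 75.0
# A weight-5 rule dominates a weight-1 rule:
print(dimension_score([(100.0, 5), (40.0, 1)]))  # 90.0
```

For the authoritative formula, see How Data Quality Dimension Scores are Calculated.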
 
Viewing Pass, Fail, and Drill Down Output
To view pass, fail, or drill down data output:
1. Run the profile. Rule results and a graph of the selected rule are displayed on the Statistics tab.
2. In the Project Navigator view, expand the project folder and double-click the following files to view the output data in the editor:
Fail.txt - FAIL_TARGET output displays data for the records that did not meet the specified criteria within the applied rule.
Pass.txt - PASS_TARGET output displays output data for the records that met the specified criteria in the applied rule.
Stats.json - STATS_TARGET output is automatically generated and contains the statistics behind the graphical representation of the rules from the profile run. However, integrating this file into automated business processes is not recommended.
DrillDown.txt - DRILLDOWN_TARGET displays output data for the records that do not meet the criteria specified in the applied rule. This file is primarily used by Data Profiler to enable drill-down reporting. However, you can browse this file in the editor to illustrate the relationship between the FAIL_TARGET records and the corresponding rule.
Note:  The DRILLDOWN_TARGET uses the Boolean data representation and date/time formats of the source fields. So if the source has Y/N for a Boolean field, DRILLDOWN_TARGET also contains Y/N; if the source has True/False, DRILLDOWN_TARGET contains True/False. The only exception is the DMS connectors (Netsuite/ServiceNow/Oracle CRM). For these connectors, DRILLDOWN_TARGET always uses True/False for Booleans and the ISO-8601 format for dates. For example, 2022-05-16T08:24:16+00:00.
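A downstream consumer of DrillDown.txt therefore has to handle both Boolean representations, plus ISO-8601 timestamps from the DMS connectors. The following is an illustrative sketch under those assumptions; normalize_bool is a made-up helper, not a product API:

```python
from datetime import datetime

def normalize_bool(value):
    """Map either documented Boolean representation (Y/N or True/False)
    to a Python bool."""
    if value in ("Y", "True"):
        return True
    if value in ("N", "False"):
        return False
    raise ValueError(f"unrecognized Boolean representation: {value!r}")

# ISO-8601 timestamps emitted by the DMS connectors parse directly:
ts = datetime.fromisoformat("2022-05-16T08:24:16+00:00")

print(normalize_bool("Y"), normalize_bool("False"))  # True False
print(ts.year)  # 2022
```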
IMPORTANT!  
When source data contains fields with non-String/Text data types, and the pass/fail target connector does not support matching data types, the system will default to converting the field to a String/Text type. This may lead to data truncation if the original data exceeds the default String length. The converted String will adopt the source field’s length, and you cannot modify this length in the target schema directly.

Recommended Solution:
To avoid truncation and gain control over field size:
1. Create a derived field using a toText(field) expression to convert the original field to String/Text.
2. On the Rules tab, add a new target field.
3. Map this new field to the derived field.
4. Adjust the size of this new field in the Rules tab schema so the data fits without truncation.

Example:
Source Field Type: DateTime
Target Connector: Does not support DateTime

Steps to convert safely:
1. Create a derived field of String type.
2. Add an ExecuteExpression rule with the expression toText(`timestamp_field`, "MM/dd/yyyy hh:mm:ss a").
3. Add the derived field to the output and set the desired size for the field so that data is not truncated.
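To see what field size the steps above call for, here is a rough Python analogue of the toText pattern "MM/dd/yyyy hh:mm:ss a". The strftime codes are Python's equivalents of that pattern, not the product's own syntax, and the sample timestamp is made up:

```python
from datetime import datetime

# %m/%d/%Y %I:%M:%S %p mirrors the toText pattern "MM/dd/yyyy hh:mm:ss a".
stamp = datetime(2022, 5, 16, 8, 24, 16)
text = stamp.strftime("%m/%d/%Y %I:%M:%S %p")

print(text)       # 05/16/2022 08:24:16 AM (in an English locale)
print(len(text))  # a target String field shorter than this would truncate
```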
Rules and Their Resulting Outputs
Some rules display their results in pie charts and some in bar charts. However, Distinct Values and Duplicate Values do not create graphic output. Two charts represent overall results:
Counts Total displays pass and fail counts in a pie chart.
Rule-BarChart displays all pass and fail rule data in a single bar chart, with each rule's pass and fail counts represented in a separate bar.
Each rule's pass and fail data is also shown in an individual pie chart.
The following table provides information about the outputs for each rule and rule function:
 
| Rule Name | Graphic | Additional Output |
| --- | --- | --- |
| Compare to Constant, Compare to Field | Pie chart | None |
| Is Not Blank, Is Not Null | Pie chart | None |
| Maximum, Minimum, Mode, Sum | None | Statistical values |
| Standard Deviation | None | Statistical values |
| Equal Range Binning, Most Frequent Values | Bar chart | None |
| Distinct Values | None | Text file of distinct values |
| Duplicate Values | None | Text file of duplicate values |
Last modified date: 09/22/2025