Data Quality External Reporting
This page explains the Centralized DQ Monitor Scan Reporting feature, which generates a comprehensive, standardized report for every Data Quality (DQ) Monitor scan and appends the results to a designated external table. This mechanism centralizes DQ metrics from various sources and monitors, providing a unified, historical view of data quality performance.
Report Structure (External Destination Table Schema)
The external table is the central repository for all DQ scan results. It provides a standardized format that allows users to query and analyze DQ performance across all projects, data assets, and monitors.
| Column Name | Data Type | Description |
|---|---|---|
| project_id | Long | Identifier for the project where the data asset resides. |
| data_asset_id | String | Identifier for the specific data asset (e.g., table/stream) being monitored. |
| monitor_Id | Long | Unique identifier for the Monitor that performed the check. This is the primary key for tracking the rule execution. |
| scan_timestamp | Timestamp | The exact time the DQ job completed execution. |
| total_records_failed | Integer | Count of records that failed the specific Monitor's check. |
| total_records_scanned | Integer | Total count of records processed by the job for this check. |
| record_id_attribute_name | String | The name of the primary key/ID attribute used to uniquely identify records in the data asset. |
| record_id_sample | Array of Strings | A sample of up to 100 failed record IDs. This sample aids in immediate investigation and debugging. |
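To make the schema concrete, here is a minimal Python sketch of one report row. The field names come from the table above; the class name `DQScanReportRow`, the `failure_rate` helper, and the sample values are hypothetical and only illustrate how the two count columns relate.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class DQScanReportRow:
    """Hypothetical in-memory model of one row of the external table."""
    project_id: int
    data_asset_id: str
    monitor_Id: int  # spelled as in the schema table
    scan_timestamp: datetime
    total_records_failed: int
    total_records_scanned: int
    record_id_attribute_name: str
    record_id_sample: List[str] = field(default_factory=list)

    def failure_rate(self) -> float:
        """Fraction of scanned records that failed; 0.0 if nothing was scanned."""
        if self.total_records_scanned == 0:
            return 0.0
        return self.total_records_failed / self.total_records_scanned


# Illustrative values only; they do not come from a real scan.
row = DQScanReportRow(
    project_id=1,
    data_asset_id="orders",
    monitor_Id=42,
    scan_timestamp=datetime(2024, 1, 1, tzinfo=timezone.utc),
    total_records_failed=5,
    total_records_scanned=200,
    record_id_attribute_name="order_id",
    record_id_sample=["A1", "A2"],
)
print(f"{row.failure_rate():.1%}")  # 2.5%
```

A derived ratio such as this is not stored in the table; it would be computed at query time from `total_records_failed` and `total_records_scanned`.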
User Actionability
- To check DQ status: Users should query the External Destination Table, filtering by `data_asset_id` and `scan_timestamp`.
- To debug failures: Users can use the `record_id_sample` along with the `record_id_attribute_name` to look up the failing records directly in the source data asset for diagnosis.
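The two workflows above can be sketched in Python as follows. This is an illustration under assumptions, not the product's API: report rows are modeled as plain dictionaries keyed by the schema's column names, and the helper names `latest_scan` and `build_lookup_query`, the source table name, and the `?` placeholder style are all hypothetical (the placeholder syntax depends on the database driver in use).

```python
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional, Tuple

Row = Dict[str, Any]


def latest_scan(rows: List[Row], data_asset_id: str, since: datetime) -> Optional[Row]:
    """Return the most recent report row for an asset at or after `since`."""
    matches = [
        r for r in rows
        if r["data_asset_id"] == data_asset_id and r["scan_timestamp"] >= since
    ]
    return max(matches, key=lambda r: r["scan_timestamp"], default=None)


def build_lookup_query(row: Row, source_table: str) -> Tuple[str, List[str]]:
    """Build a parameterized query that fetches the sampled failing records.

    The table and column names are interpolated directly, so they must come
    from a trusted source; the record IDs are passed as bound parameters.
    """
    ids = list(row["record_id_sample"])
    if not ids:
        raise ValueError("record_id_sample is empty; nothing to look up")
    placeholders = ", ".join("?" for _ in ids)
    column = row["record_id_attribute_name"]
    sql = f"SELECT * FROM {source_table} WHERE {column} IN ({placeholders})"
    return sql, ids


# Illustrative rows only; they do not come from a real scan.
rows = [
    {"data_asset_id": "orders", "scan_timestamp": datetime(2024, 1, 1, tzinfo=timezone.utc),
     "total_records_failed": 0, "record_id_attribute_name": "order_id",
     "record_id_sample": []},
    {"data_asset_id": "orders", "scan_timestamp": datetime(2024, 1, 2, tzinfo=timezone.utc),
     "total_records_failed": 3, "record_id_attribute_name": "order_id",
     "record_id_sample": ["A1", "A7"]},
]

latest = latest_scan(rows, "orders", datetime(2024, 1, 1, tzinfo=timezone.utc))
sql, params = build_lookup_query(latest, "orders")
print(sql)     # SELECT * FROM orders WHERE order_id IN (?, ?)
print(params)  # ['A1', 'A7']
```

Because `record_id_sample` holds at most 100 IDs, the lookup returns only a sample of the failing records, not necessarily all of them. When several monitors scan the same asset, an additional filter on `monitor_Id` would narrow the result to a single rule.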
Configuring Reporting
Reporting can only be configured via APIs. Please refer to DQ Reporting APIs.