
Data Quality External Reporting

This page explains the Centralized DQ Monitor Scan Reporting feature, which generates a comprehensive, standardized report for every Data Quality (DQ) Monitor scan and appends the results to a designated external table. This mechanism centralizes DQ metrics from various sources and monitors, providing a unified, historical view of data quality performance.


Report Structure (External Destination Table Schema)

The external table is the central repository for all DQ scan results. It provides a standardized format that allows users to query and analyze DQ performance across all projects, data assets, and monitors.

Column Name | Data Type | Description
project_id | Long | Identifier for the project where the data asset resides.
data_asset_id | String | Identifier for the specific data asset (e.g., table/stream) being monitored.
monitor_Id | Long | Unique identifier for the Monitor that performed the check. This is the primary key for tracking the rule execution.
scan_timestamp | Timestamp | The exact time the DQ job completed execution.
total_records_failed | Integer | Count of records that failed the specific Monitor's check.
total_records_scanned | Integer | Total count of records processed by the job for this check.
record_id_attribute_name | String | The name of the primary key/ID attribute used to uniquely identify records in the data asset.
record_id_sample | Array of Strings | A sample of up to 100 failed record IDs. This sample aids in immediate investigation and debugging.
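
For readers who consume the report programmatically, one row of the schema above could be modeled roughly as follows. This is a minimal sketch, not part of the product: the class name DQScanReportRow and the failure_rate helper are hypothetical, and the mapping of Long/Integer to Python int and Timestamp to datetime is an assumption. The sample values are made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class DQScanReportRow:
    """Hypothetical in-memory view of one row of the external destination table."""

    project_id: int                 # Long in the schema
    data_asset_id: str
    monitor_Id: int                 # spelled exactly as the column is listed
    scan_timestamp: datetime        # time the DQ job completed
    total_records_failed: int
    total_records_scanned: int
    record_id_attribute_name: str
    record_id_sample: List[str]     # up to 100 failed record IDs

    def failure_rate(self) -> float:
        """Fraction of scanned records that failed; 0.0 if nothing was scanned."""
        if self.total_records_scanned == 0:
            return 0.0
        return self.total_records_failed / self.total_records_scanned


# Illustrative values only; none of these come from a real scan.
row = DQScanReportRow(
    project_id=101,
    data_asset_id="orders",
    monitor_Id=7,
    scan_timestamp=datetime(2024, 5, 1, 10, 0, tzinfo=timezone.utc),
    total_records_failed=25,
    total_records_scanned=1000,
    record_id_attribute_name="order_id",
    record_id_sample=["A17", "A42", "B03"],
)
print(f"{row.data_asset_id}: {row.failure_rate():.1%} of records failed")
```

Deriving the failure rate from the two counters, rather than storing it, keeps the sketch aligned with the columns the report actually provides.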

User Actionability

  • To check DQ status: Users should query the External Destination Table, filtering by data_asset_id and scan_timestamp.
  • To debug failures: Users can use the record_id_sample along with the record_id_attribute_name to look up the failing records directly in the source data asset for diagnosis.
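
The two workflows above can be sketched end to end. The example below is illustrative only: it uses an in-memory SQLite database as a stand-in for the external destination table, and the table name dq_scan_report, the helper functions, the JSON encoding of record_id_sample, and all sample rows are assumptions made for the sketch. The actual query engine and storage format depend on where the external table is hosted.

```python
import json
import sqlite3

# Stand-in for the external destination table (assumed name: dq_scan_report).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE dq_scan_report (
        project_id INTEGER, data_asset_id TEXT, monitor_Id INTEGER,
        scan_timestamp TEXT, total_records_failed INTEGER,
        total_records_scanned INTEGER, record_id_attribute_name TEXT,
        record_id_sample TEXT  -- array stored as JSON text in this sketch
    )
    """
)
conn.executemany(
    "INSERT INTO dq_scan_report VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        (101, "orders", 7, "2024-05-01T10:00:00Z", 3, 1000, "order_id",
         json.dumps(["A17", "A42", "B03"])),
        (101, "orders", 7, "2024-05-02T10:00:00Z", 0, 1200, "order_id",
         json.dumps([])),
        (101, "customers", 9, "2024-05-02T11:00:00Z", 5, 500, "customer_id",
         json.dumps(["C1"])),
    ],
)


def scans_for_asset(conn, data_asset_id, since):
    """Check DQ status: filter by data_asset_id and scan_timestamp, newest first."""
    cur = conn.execute(
        """
        SELECT monitor_Id, scan_timestamp, total_records_failed,
               total_records_scanned, record_id_attribute_name, record_id_sample
        FROM dq_scan_report
        WHERE data_asset_id = ? AND scan_timestamp >= ?
        ORDER BY scan_timestamp DESC
        """,
        (data_asset_id, since),
    )
    return cur.fetchall()


def build_lookup(asset_table, id_attribute, sample_ids):
    """Debug failures: build a parameterized query for the sampled record IDs."""
    placeholders = ", ".join("?" for _ in sample_ids)
    sql = f"SELECT * FROM {asset_table} WHERE {id_attribute} IN ({placeholders})"
    return sql, list(sample_ids)


scans = scans_for_asset(conn, "orders", "2024-05-01T00:00:00Z")
failing = [s for s in scans if s[2] > 0]

# Use record_id_attribute_name and record_id_sample from the failing scan.
_, _, _, _, id_attribute, sample_json = failing[0]
sql, params = build_lookup("orders", id_attribute, json.loads(sample_json))
print(sql, params)
```

The timestamp filter relies on ISO 8601 strings in a single format sorting chronologically; a real Timestamp column would be compared natively. Because the sample holds at most 100 IDs, the lookup shows representative failures rather than every failed record.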

Configuring Reporting

Reporting can only be configured via APIs. Please refer to DQ Reporting APIs.