Monitoring Data Quality Metrics

The ibi Data Quality WatchDog application provides users with meaningful insights on their Data Quality (DQ) metrics. Watchdog users can create new panels and dashboards with pre-configured metrics and KPIs, setup notification channels, and create alerts for monitoring metrics against user defined thresholds.

To learn more about creating reports, dashboards, and notification channels on Grafana, see the following website:

https://grafana.com/docs/grafana/latest/whatsnew/whats-new-in-v7-5/

DQ Metrics Views

This section describes all of the DQ metric views that are accessible through the Watchdog app.

Profile Summary

View Name

watchdog_pstats_summ

Description

This view contains the summary stats and metrics generated by the Profiler for an input data set. This view will have one row for each data set profiled in ibi Data Quality.

Column	Description
id	Unique identifier for the profile summary record.
dataset_id	Unique identifier for the input data set.
created_by	User account that executed the data profile.
created_date	Date when the data profile report is generated.
source_name	User assigned source name for the data set.
source_type	User assigned source type for the data set.
app_name	User assigned app name for the data set.
industry	User assigned industry associated with the data set.
entity	User assigned entity associated with the data set.
count_variables	Total number of data attributes in the data set.
count_observations	Total number of observations in the data set (row times column).
count_rows	Total number of rows in the data set.
count_dup_rows	Total number of duplicate rows in the data set.
pct_dup_rows	Percentage of duplicate rows in the data set.
count_dedup_rows	Total number of rows that were removed during data dedup.
pct_dedup_rows	Percentage of of rows that were removed during data dedup.
count_missing	Total number of observations that are empty, blank or nulls.
pct_missing	Percentage of observations that are empty, blank or nulls.
profile_score	Profile score of the data set calculated using the profiling stats and user’s expectations of the data.

Profile Details

View Name

watchdog_pstats_dtl

Description

This view contains the detailed stats and metrics generated by the Profiler for an input data set. This view will have one row for each data attribute in the data set that is profiled in ibi Data Quality.

Column	Description
id	Unique identifier for the profile detail record.
pstats_summ_id	Unique identifier for the profile summary record.
dataset_id	Unique identifier for the input data set.
created_by	User account that executed the data profile.
created_date	Date when the data profile report is generated.
source_name	User assigned source name for the data set.
source_type	User assigned source type for the data set.
app_name	User assigned app name for the data set.
industry	User assigned industry associated with the data set.
entity	User assigned entity associated with the data set.
variable_name	Name of the data attribute.
type	Data type of the attribute.
content_type	Data class of the attribute.
sensitive_flag	True if the data attribute is classified a Sensitive data, otherwise False.
expect_nulls	User assigned value set to True if values in the data attribute can be NULL.
expect_unique	User assigned value set to True if values in the data attribute is expected to be unique.
business_impact	User assigned value set to HIGH, MEDIUM, or LOW indicating business impact of the data attribute.
profile_score	Profile score of the data attribute calculated using the profiling stats and user’s expectations of the data.
count_total	Total number of observations for the data attribute.
count_unique	Total number of unique observations for the data attribute.
count_non_unique	Total number of non-unique observations for the data attribute.
count_distinct	Total number of distinct observations for the data attribute.
count_blanks	Total number of blank values or empty strings in the data attribute.
count_nulls	Total number of NULL values in the data attribute.
count_not_nulls	Total number of non-NULL values in the data attribute.
length_min	Minimum character length for the values in the data attribute.
length_max	Maximum character length for the values in the data attribute.
length_avg	Average character length for the values in the data attribute.
length_median	Median character length for the values in the data attribute.
most_common_pattern	Most common pattern for the values in the data attribute.
count_pattern	Total count of the values with the most common pattern in the data attribute.
pct_pattern	Percentage of the values with the most common pattern in the data attribute.
most_common_mask	Most common mask for the values in the data attribute.
count_mask	Total count of the values with the most common mask in the data attribute.
pct_mask	Percentage of the values with the most common mask in the data attribute.
num_avg	Average value calculated from all the numeric values in the data attribute.
num_min	Minimum value calculated from all the numeric values in the data attribute.
num_max	Maximum value calculated from all the numeric values in the data attribute.
num_std_dev	Standard deviation value calculated from all the numeric values in the data attribute.
num_pctl_25	25th percentile value calculated from all the numeric values in the data attribute.
num_pctl_50	50th percentile value calculated from all the numeric values in the data attribute.
num_pctl_75	75th percentile value calculated from all the numeric values in the data attribute.

DQ Summary

View Name

watchdog_dqstats_summ

Description

This view contains the summary stats and metrics generated by executing Rules against an input data set. This view will have one row for each set of Rules analyzed against an input data set in ibi Data Quality.

Column	Description
dq_summ_id	Unique identifier for the data quality summary record.
input_name	Unique identifier for the input data set.
created_by	User account that executed the Rules against the input data set.
created_date	Date when the data quality report is generated.
source_name	User assigned source name for the data set.
source_type	User assigned source type for the data set.
app_name	User assigned app name for the data set.
industry	User assigned industry associated with the data set.
entity	User assigned entity associated with the data set.
count_processed	Total number of observations processed.
count_invalid	Total number of observations that failed validation and could not be fixed.
count_valid	Total number of observations that contain valid output values.
count_cleansed	Total number of observations that were fixed - standardized, normalized, or enriched, number of cleansed observations are also counted under number of valid observations.
count_missing	Total number of observations that are empty or missing.
pct_valid	Percentage of total observations that failed validation and could not be fixed.
pct_invalid	Percentage of total observations that contain valid output values.
pct_cleansed	Percentage of total observations that were fixed - standardized, normalized, or enriched, number of cleansed observations are also counted under number of valid observations.
pct_missing	Percentage of total observations that are empty or missing.
dq_score	Data quality score of the input data set calculated using the data quality stats and user’s expectations of the data.

DQ Details

View Name

watchdog_dqstats_dtl

Description

This view contains the detailed stats and metrics generated by executing Rules against an input data set. This view will have one row for each Rule executed against an input data set in ibi Data Quality.

Column	Description
dq_dtl_id	Unique identifier for the data quality analysis detail record.
dq_summ_id	Unique identifier for the data quality summary record.
variable_group_name	Name of the data attribute or group of data attributes analyzed by a Rule.
input_name	Name of the data set.
created_by	User account that executed the Rules against the input data set.
created_date	Date when the data quality report is generated.
source_name	User assigned source name for the data set.
source_type	User assigned source type for the data set.
app_name	User assigned app name for the data set.
industry	User assigned industry associated with the data set.
entity	User assigned entity associated with the data set.
rule_id	Unique identifier of the Rule.
count_processed	Total number of observations processed.
count_invalid	Total number of observations that failed validation and could not be fixed.
count_valid	Total number of observations that contain valid output values.
count_cleansed	Total number of observations that were fixed - standardized, normalized, or enriched, number of cleansed observations are also counted under number of valid observations.
count_missing	Total number of observations that are empty or missing.
dq_score	Data quality score of the data attribute or group of data attributes calculated using the data quality stats and user’s expectations of the data.

DQ Transaction Summary

View Name

watchdog_dqstats_summ

Description

Column	Description
dq_summ_id	Unique identifier for the data quality summary record.
input_name	Unique identifier for the input data set.
created_by	User account that executed the Rules against the input data set.
created_date	Date when the data quality report is generated.
source_name	User assigned source name for the data set.
source_type	User assigned source type for the data set.
app_name	User assigned app name for the data set.
industry	User assigned industry associated with the data set.
entity	User assigned entity associated with the data set.
count_processed	Total number of observations processed.
count_invalid	Total number of observations that failed validation and could not be fixed.
count_valid	Total number of observations that contain valid output values.
count_cleansed	Total number of observations that were fixed - standardized, normalized, or enriched, number of cleansed observations are also counted under number of valid observations.
count_missing	Total number of observations that are empty or missing.
pct_valid	Percentage of total observations that failed validation and could not be fixed.
pct_invalid	Percentage of total observations that contain valid output values.
pct_cleansed	Percentage of total observations that were fixed - standardized, normalized, or enriched, number of cleansed observations are also counted under number of valid observations.
pct_missing	Percentage of total observations that are empty or missing.
dq_score	Data quality score of the input data set calculated using the data quality stats and user’s expectations of the data.

Did you find this helpful?

Yes No