Artificial intelligence — Data quality for analytics and machine learning (ML) — Part 2: Data quality measures

Last updated: 7 Jan 2025

Development Stage

Pre-draft

Draft

Published

Scope

This document provides a data quality model, data quality measures, and guidance on reporting data quality in the context of analytics and machine learning (ML). It builds on the ISO 8000 series, ISO/IEC 25012, and ISO/IEC 25024.

This document aims to enable organizations to achieve their data quality objectives and is applicable to all types of organizations. ©ISO/IEC 2022. All rights reserved.

Purpose

Data quality is necessary for ML and big data systems to be safe, reliable, and interoperable. ISO/IEC 20547-3 states that “Data quality management is essential to big data systems, as poor data quality such as incomplete, false or outdated data can disable effective data mining processes, prevent useful findings or lead to wrong output”. Moreover, ISO/IEC TR 24028 addresses data quality challenges in AI systems based on machine learning, including bias in the data used to train the AI system, data poisoning, and adversarial attacks. Organizations can use this document to select and implement the data quality measures that meet their requirements in data analytics and ML.

This standard defines data quality characteristics for data, especially big data, used in analytics and ML; quality measures related to those characteristics; and guidelines for evaluating and reporting data quality in the analytics and ML data workflow.

Categorisation

Domain: Horizontal

Key Information

Committee: ISO/IEC JTC 1/SC 42
Relevant UK committee: ART/1
