Artificial intelligence — Data quality for analytics and machine learning (ML) — Part 3: Data quality management requirements and guidelines

Last updated: 7 Jan 2025

Development Stage

Pre-draft

Draft

Published

Scope

This document specifies requirements and provides guidance for establishing, implementing, maintaining and continually improving the quality of data used in the areas of analytics and machine learning.

This document does not define a detailed process, methods or metrics. Rather, it defines the requirements and guidance for a quality management process, along with a reference process and methods that can be tailored to meet the requirements in this document.

The requirements and recommendations set out in this document are generic and are intended to be applicable to all organizations, regardless of type, size or nature.

©ISO/IEC 2022. All rights reserved.

Purpose

In machine learning, the intended function is not described explicitly by rules but is learned through data-driven training. The achievable quality of the resulting function depends heavily on which data are selected and on the quality of their content. The quality of the data therefore has a direct influence on the performance and robustness of the intended function.

The evaluation of the function's performance and robustness is itself usually carried out on data. Such an evaluation can only be considered valid if the data on which it is based are sufficiently accurate, representative and of sufficient quality.

It is therefore strongly advised to ensure the quality of the data used to train or evaluate functions built using machine learning.

This standard addresses the management of the organizational and project-specific tasks needed to achieve a desired level of data quality over the complete lifecycle (specification, acquisition, pre-processing, labelling and composition) when provisioning and using data for analytics and machine learning.

This document will permit conformity assessment (e.g. certification) for data quality for analytics and ML.

The standard will not define a detailed process. It will define the requirements and recommendations for a process used to provision data intended for machine learning (ML data). It may propose a reference process that can be tailored as long as the objectives (which will be defined in this standard) can still be achieved.

The standard will not prescribe specific methods or metrics. It will define the aspects that methods should fulfil and may provide examples of suitable methods and metrics.

Categorisation

Domain: Horizontal

Key Information

Committee: ISO/IEC JTC 1/SC 42
Relevant UK committee: ART/1
