CCQM Task Group on Data Digitalization (CCQM-TG-DD)
Chair
National Institute of Standards and Technology
United States of America
Terms of reference
Purpose
The CCQM Data Digitalization Task Group (CCQM-DDTG) is established to support the implementation of digital and FAIR (Findable, Accessible, Interoperable, and Reusable) principles in chemical and biological reference measurement systems data. This task group will address challenges in unique identifiers, the digitalization of Certified Reference Material (CRM) certificates, and best practices in database development.
Scope
The Group will focus on tasks in the following key areas:
- Unique Identifiers for Chem/Bio Data:
  - Provide recommendations on the use of unique interoperable identifiers for chemical and biological substances, including measurands, matrices, and sample types that can be applied to the KCDB and the JCTLM DB, considering broader applications;
  - Leverage existing standards (e.g., InChI, NPU terminology) and ontologies to ensure interoperability across databases and consistency with the SI Digital Framework.
- Digitalization of CRM Certificates:
  - Understand and document stakeholder needs for digital CRM certificates among various communities and sectors;
  - Liaise with the CIPM FORUM-MD ad hoc Task Group on Harmonizing DCC and DRMC (FORUM-MD-TG-H-DCC/DRMC), contributing to:
    - The creation of guidelines for developing and maintaining digital CRM certificates that adhere to FAIR principles;
    - An understanding of resource requirements, data security, and long-term maintenance of digital certificates in the guidelines;
    - An understanding of the approaches for the audit, validation, and modification of digital certificates that are in line with ISO standards.
- FAIR Principles in Chem/Bio Databases and incorporation of AI:
  - Test approaches that enable Chem/Bio reference data to be accessible to and accurately interpreted by Large Language Model (LLM) systems, starting with the JCTLM DB and selected test data sets from the BIPM KCDB;
  - Evaluate methods for enabling AI to: analyze measurement comparisons, reporting, and certification outputs; convert selected data into structured, database-compatible formats; develop AI agents to automate the population of relevant databases with the converted data; and assess data conformity with specified technical or numerical criteria, initially using the JCTLM DB and a test data set from the BIPM KCDB;
  - Provide guidelines for the application of the FAIR principles to Chem/Bio data products.
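As a rough illustration of the identifier and structured-format tasks listed above, the following sketch attaches an InChIKey to a reference-data record and serializes it to JSON. The record layout and field names (`measurand`, `matrix`, `source_db`) are assumptions made purely for illustration and do not represent a KCDB or JCTLM DB schema. The check covers only the syntax of an InChIKey, not whether the key resolves to a real substance.

```python
import json
import re
from dataclasses import dataclass, asdict

# Standard InChIKey layout: 14 letters, hyphen, 10 letters, hyphen, 1 letter.
INCHIKEY_PATTERN = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")


def is_wellformed_inchikey(key: str) -> bool:
    """Check syntax only; this does not confirm the key maps to a substance."""
    return INCHIKEY_PATTERN.fullmatch(key) is not None


@dataclass
class ReferenceRecord:
    # Hypothetical field names, not an official database schema.
    inchikey: str
    measurand: str
    matrix: str
    source_db: str

    def __post_init__(self) -> None:
        # Reject records whose identifier is not syntactically valid.
        if not is_wellformed_inchikey(self.inchikey):
            raise ValueError(f"Malformed InChIKey: {self.inchikey!r}")


def to_json(record: ReferenceRecord) -> str:
    """Serialize with sorted keys so the output is stable across runs."""
    return json.dumps(asdict(record), sort_keys=True)


record = ReferenceRecord(
    inchikey="XLYOFNOQVPJJNP-UHFFFAOYSA-N",  # InChIKey of water
    measurand="mass fraction of water",      # illustrative value
    matrix="example matrix",                 # illustrative value
    source_db="example",                     # illustrative value
)
print(to_json(record))
```

Keying records on a standard identifier rather than a free-text name is one way to make entries from different databases comparable; an actual recommendation would depend on the outcome of the Task Group's work.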
Stakeholder Engagement
To achieve its tasks, the Group will need to engage relevant stakeholders in its work, and thereby:
- Facilitate collaboration with National Metrology Institutes (NMIs), international organizations (e.g., IUPAC, ISO, IFCC), and industry stakeholders to address user needs and feedback.
- Encourage engagement with instrument vendors and regulatory bodies to support the adoption of open and standardized data formats.
Structure and Membership
The Task Group will create teams to progress each of its tasks, with the expert members required to complete the tasks drawn from:
- CCQM and JCTLM WGs with relevant experience and requirements.
- Representatives from NMIs active in chemical and biological metrology and digitalization initiatives.
- Experts in digitalization, data management, LLM systems, Machine Learning/AI applications, and database development.
- Stakeholders from relevant sectors and international organizations, to ensure diverse expertise and representation and to improve the likelihood that outputs are taken up.
Reporting
The Task Group will:
- Provide regular updates to the CCQM-SPWG, CCQM and FORUM-DI on its progress.
- Submit a comprehensive report of recommendations and outcomes at the end of its term or as requested by the CCQM President.
Duration
The CCQM-DDTG will operate for an initial term of three years, with a review of its progress and continuation at the end of the term.
Deliverables
- A system for unique identifiers for chemical and biological reference data that can be applied in the KCDB and JCTLM DB and potentially for broader application.
- A report of stakeholder expectations and needs for digital CRM certificates across various sectors and applications.
- A completed case study of a Chem/Bio reference database that is fully accessible to, and accurately interpretable by, large language model (LLM) systems. The case study will also explore the feasibility of developing AI agents capable of autonomously populating and updating the databases based on validated measurement data and structured metadata inputs, with the goal of supporting dynamic and scalable knowledge integration.
- A finalized case study demonstrating statements of performance for reference methods or capabilities, generated through accurate AI-based interpretation of measurement data, interlaboratory comparison results, and certification reports, and subsequently assessed for conformity with specified requirements.
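The conformity assessment mentioned in the last deliverable could, in its simplest numerical form, look like the sketch below. It assumes a single hypothetical requirement, a cap on relative expanded uncertainty; the class names, field names, and the limit value are illustrative only and are not taken from any CCQM, JCTLM, or ISO requirement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    value: float                 # reported value
    expanded_uncertainty: float  # expanded uncertainty, same unit as value


@dataclass(frozen=True)
class Criterion:
    # Hypothetical requirement: upper limit on relative expanded uncertainty.
    max_relative_uncertainty: float


def conforms(result: Result, criterion: Criterion) -> bool:
    """Return True if the relative expanded uncertainty is within the limit."""
    if result.value == 0:
        raise ValueError("Relative uncertainty is undefined for a zero value")
    relative = result.expanded_uncertainty / abs(result.value)
    return relative <= criterion.max_relative_uncertainty


limit = Criterion(max_relative_uncertainty=0.05)  # illustrative 5 % limit
print(conforms(Result(100.0, 2.0), limit))   # relative uncertainty 0.02
print(conforms(Result(100.0, 10.0), limit))  # relative uncertainty 0.10
```

Keeping the criterion as explicit data, separate from the extracted result, means the check itself stays deterministic and auditable even when the result was extracted from a report by an AI system.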