Image Quality and AI-Readiness (IQAIR)

A task group of the Audiovisual Core maintenance group

Convenor

Yasin Bakış, Tulane University

Core Members

  • Steven Baskauf
  • Henry Bart
  • Xiaojun Wang
  • Bahadır Altıntaş
  • Dom Jebbia
  • Jane Greenberg
  • David Breen
  • Anuj Karpatne
  • Nicolas Bailly
  • Leanna Housel

Motivation

The rapid growth of biodiversity multimedia data and the increasing use of artificial intelligence and machine learning (AI/ML) in biodiversity research highlight a critical need for standardized, interoperable metadata describing image quality, provenance, and AI-readiness of audiovisual resources across all taxa.

Currently, biodiversity researchers, data publishers, and AI practitioners lack shared, community-reviewed mechanisms to assess whether audiovisual media are fit for a given scientific, analytical, or computational use case. This gap leads to inefficiencies in data reuse, reproducibility challenges, and barriers to cross-domain synthesis.

This Task Group is established to evaluate, harmonize, and propose metadata concepts related to image quality and AI-readiness for audiovisual biodiversity data, with the goal of supporting consistent assessment, discovery, and reuse of multimedia resources within the TDWG standards ecosystem.

Scope

The scope of this Task Group encompasses audiovisual biodiversity media across all taxonomic groups. While fish-focused datasets and workflows (e.g., FishAIR) provide important reference implementations and case studies, they do not define or constrain the scope of the group.

The Task Group will consider:

  • Still images and other relevant audiovisual media
  • Taxon-agnostic metadata concepts
  • Cross-domain interoperability with biodiversity, informatics, and AI communities

Goals, Outputs, and Outcomes

The goals of this Task Group are to:

  1. Review existing metadata concepts related to image quality, AI-readiness, and automated feature extraction used in biodiversity and related domains.
  2. Evaluate candidate terms for relevance, clarity, generality, and compatibility with TDWG standards, particularly the Audiovisual Core.
  3. Harmonize overlapping concepts and identify gaps where new metadata terms may be needed.
  4. Propose recommendations for additions, refinements, or extensions to TDWG standards, following TDWG’s established review and approval processes.
  5. Document use cases demonstrating how image quality and AI-readiness metadata support scientific reuse, reproducibility, and AI-enabled discovery.

Expected outcomes include:

  • Community-reviewed recommendations for metadata terms
  • Clear documentation of intended meaning and use
  • Alignment with TDWG governance and standards processes
  • Improved guidance for publishers and users of audiovisual biodiversity data

Sources of Candidate Terms

Initial candidate concepts will be drawn from:

  • The FishAIR vocabulary and related research outputs
  • Metadata generated through automated and semi-automated workflows in biodiversity imaging
  • Existing TDWG standards and extensions
  • Relevant efforts in allied initiatives (e.g., DiSSCo and other international infrastructures)

All candidate terms will be treated as inputs for discussion, not as pre-approved outcomes.

Timeline

Year 1

  • Review and categorize existing image quality and AI-readiness concepts
  • Identify overlaps with existing TDWG terms
  • Solicit community feedback and participation

Year 2

  • Refine candidate terms and definitions
  • Develop use cases across taxa and domains
  • Draft preliminary recommendations

Year 3

  • Finalize recommendations for submission through TDWG processes
  • Document outcomes and guidance for implementation
  • Support transition of accepted terms into TDWG standards infrastructure

Operating Model

  • The Task Group will meet approximately once per month.
  • Discussions and documentation will be conducted openly.
  • Progress and materials will be publicly accessible.

Open Participation

This Task Group follows the TDWG model for open participation.

  • Participation is open to all interested members of the TDWG community and beyond.
  • An open call for participation will be issued.
  • Membership is not limited to contributors from any single project or funding source.
  • Community input is actively encouraged throughout the Task Group’s activities.

Strategy

The Task Group will:

  1. Evaluate candidate metadata terms in a transparent, consensus-driven manner
  2. Align recommendations with TDWG architectural principles
  3. Ensure interoperability with existing standards
  4. Advance proposals through TDWG’s formal review and approval pathways

Becoming Involved

To attend meetings, contact the convenor to be added to the email list. To follow the group’s progress, watch the GitHub repository.

History and Context

Relevant background work includes, but is not limited to:

  • FishAIR – A fish image dataset with image quality management and AI-readiness workflows
  • Bakış et al. (2023) On Image Quality Metadata, FAIR in ML, AI-Readiness and Reproducibility
  • Karnani et al. (2022) Computational metadata generation methods for biological specimen image collections
  • Jebbia et al. (2022) Toward a Flexible Metadata Pipeline for Specimen Images
  • Mehrab et al. (2024) Fish-Vista: A Multi-Purpose Dataset for Trait Understanding

These works provide context and exemplars, not binding standards.

Summary

This Task Group aims to strengthen TDWG’s Audiovisual Core by advancing community-reviewed approaches to image quality and AI-readiness metadata that support reproducible, scalable, and equitable reuse of biodiversity audiovisual data across taxa and disciplines.

Resources

A public GitHub repository will be established under the TDWG organization to support transparency, documentation, and community engagement.