- John Deck, University of California at Berkeley
- Ramona Walls, iPlant Initiative, University of Arizona
History and context
This group is a forum for discussing standards that span the biodiversity and genomics fields of study, with a core emphasis on mobilizing usable products and technologies for applied research projects. TDWG has historically focused on the sharing of specimen-based data through a set of standards including Darwin Core (DwC), Access to Biological Collection Data (ABCD), and the TDWG Access Protocol for Information Retrieval (TAPIR). However, the last several years have seen a swell of interest in including DNA-based information with specimen data and coupling sequence data with specimens and their associated environmental and taxonomic context. For instance, the Consortium for the Barcode of Life (CBOL) has been promoting the acquisition of well-curated specimens, photos, and sequence data to define a “barcode” record. The Moorea Biocode Project has been sequencing and accessioning specimens for an entire island ecosystem. Meanwhile, the Genomic Standards Consortium (GSC) has a fast-growing Biodiversity Working Group (GBWG) focused on building appropriate linkages between the genomics and biodiversity communities. Finally, other active efforts include the DNA extension for Access to Biological Collection Data (ABCDDNA) and the work of the Data Standards and Access Task Force of the Global Genome Biodiversity Network (GGBN).
Specimen and genomics systems (e.g. Laboratory Information Management Systems (LIMS) and sequence repositories) have been poorly integrated in the past and have largely different cultures. We must therefore develop mechanisms (semantics, application programming interfaces (APIs), and workflows) for integrating these federated systems.
This interest group will facilitate communication about the development of tools, vocabularies, and standards for biodiversity genomics to the broader community of TDWG, GSC, and Global Biodiversity Information Facility (GBIF) users. The group will also share and link with people involved in the development of ontologies such as the Biological Collections Ontology (BCO), the Environment Ontology (EnvO), and the Gene Ontology (GO), and will look for ways to promote better integration with Darwin Core (DwC), Minimum Information about any (x) Sequence (MIxS), and other TDWG/GSC standards.
This interest group overlaps with the Genomic Biodiversity Working Group (GBWG), formerly called the GSC Biodiversity Working Group. The TDWG GBWG and the GSC GBWG operate independently in terms of process, even though the groups share a name, members, and collaboration site.
This group welcomes participation from interested parties with backgrounds in biodiversity, genetics, technical architecture, and taxonomy. We propose the TDWG website as the organizing point for this group. Prospective members should email the conveners for more information.
Members of this group can stay informed about, influence, and promote new technologies and standards relating to genomics, biodiversity, and the intersection of the two. Members will explore new avenues of research for both biologists and informaticians, and gain the opportunity to work directly with a globally diverse set of participants.
Biodiversity genomics is a fast-growing field of study that describes biological variation in all its dimensions, from the foundational DNA layer to organisms and ecosystems, phylogeny and function. Much of the data collected in such efforts currently has no consistent vocabulary, standards representation, or implementation for dissemination and integration in the public domain. This group will facilitate discussion of use cases, form task groups to address specific deliverables, and communicate relevant advances in biodiversity genomics technologies, vocabularies, and standards to the group members.
Areas that this interest group will address:
- Standards integration efforts (e.g. DwC/MIxS integration)
- Relevant hackathons for data integration and concept reconciliation
- Identifying use cases and suitable technologies or gaps for solving them
- Genomic Standards Consortium (http://gensc.org/) and GSC/Genomic Biodiversity Working Group
- RCN4GSC (link from http://wiki.gensc.org/index.php?title=RCN4GSC)
- BiSciCol project use cases (http://biscicol.blogspot.com/)
- Genomic Observatories: Taking the genomic pulse of the planet (http://www.genomicobservatories.org/)
- A Genomic Encyclopedia of Bacteria and Archaea (GEBA) (http://jgi.doe.gov/programs/GEBA/)
- The Earth Microbiome Project (http://www.earthmicrobiome.org/)
- Biocode Commons: Tools for supporting genomic observations from collections through analysis (http://biocodecommons.org/)
- Moorea Biocode Project (http://biocode.berkeley.edu/)
- Consortium for the Barcode of Life (http://www.barcoding.si.edu/)
- Biological Collections Ontology (https://github.com/tucotuco/bco)