Working Sessions

This page shows the status of the programme for the parallel working group and hackathon sessions. It is a work in progress by the Programme Committee for TDWG 2009. Comments and feedback can be added via the previous link.

Theme #1: e-Biosphere Follow-up

Item Leader Description
Roundtable: potential events for 2010 - the Year of Biodiversity Lee Belbin It would seem a pity if the international group developing standards for sharing biodiversity data didn't do something special in the Year of Biodiversity. Wouldn't it? If you agree, what few things should TDWG (you) be doing in 2010 to save the planet? Please bring along ideas to discuss and decide.
Citizen Science Joel Sachs This will be an informal session. We'll start with a short presentation cobbled together from slides contributed by people unable to attend, and will then craft a discussion agenda for the remaining time. Discussion topics can be suggested in advance or during the session. Things I'd like to address include: i) making citizen science data standards-compliant, according to appropriate definitions of "standards" and "compliance", and ii) using Twitter vs. developing custom mobile apps. Other possible topics include grass-roots activities; integrating activities; tool development; use of Web 2.0; and visions for the future.

Theme #2: Agricultural Biodiversity

Item Leader Description
Herbarium digitization Eric Chenin, Pascal Chesselet Herbarium collections have a special place in Natural History collections, as plants have in ecosystems, and herbarium specimens have specific technical characteristics that facilitate their digitization. Digitization is here understood as including both label data capture and sheet image scans: both forms of information are useful and complement each other. The session will consist of approximately 4-5 brief presentations from herbarium digitization projects and about tools and methodologies. Discussions will address a number of practical issues such as appropriate collections management software (e.g. RIHA, KE Software), the role of digitization in specific herbarium management issues, standards (or the lack thereof) for specimen imaging, standards for personal names (authors of plant names, plant collectors and determiners of plant names) and the constraints and possibilities of georeferencing and mapping label data. To complement and speed up routine (e.g. loans-driven) and project-driven (e.g. type specimens) herbarium digitization, accelerated metadata capture will be discussed with the aim of establishing and reinforcing cross-institutional priorities (potential outcome for herbarium digitization workshop). The workshop will explore the potential for metadata to foster the use of herbarium collections in providing answers to scientific and operational questions, and to attract funds to enhance and accelerate unit-level digitization. Discussions will also address the specific issues of developing countries. [expect 20 p; should not overlap with Gregor Hagedorn - Biological Descriptions Interest Group]
DarwinCore Germplasm Extension and its deployment in the GBIF Integrated Publishing Toolkit Dag Terje Filip Endresen, Samy Gaiji, Tim Robertson DarwinCore is designed around a set of general terms (the core) applicable to most unit-level biodiversity datasets. DarwinCore also supports extensions to the core terms, designed to include terms of more specific utility in particular thematic domains, such as the plant genetic resources community. The DarwinCore Germplasm Extension has been developed to include the additional terms required to describe germplasm samples maintained by genebanks worldwide. This working session will focus on the further development of the DarwinCore Germplasm Extension. We will also cover the building of any extension and the implementation of new extensions in the GBIF Integrated Publishing Toolkit (IPT), depending on the interests expressed by the working session participants.
Species Profile Model (SPM) III: Visual and textual standards for taxonomic identification Pierre Grard, Pierre Bonnet Species identification tools using morphological, geographical and ecological characters have advanced considerably in recent years. The increasing use of molecular and non-morphological characters requires new combinations of these different approaches. Several attempts have been made to use visual and graphical representations of morphological identification characters. This working session will focus on the interaction between the different ways (or tools) of identifying plants, and on the complementarities between them.
Standards for plant traits (cultivated and wild) - expanding standards to include characterization and evaluation data, phenotypic descriptors Michael Mackay Phenotypic data adds a new level of complexity to the management of accession-level data in ex situ genebanks of plant genetic resources for food and agriculture (PGRFA). It is recognized that standards for passport data (accession identity) need to be quite stringent to facilitate correct identification. However, in the case of phenotypic data, the standards need to be more flexible to allow for the different legitimate methods scientists use to evaluate the same trait, as well as for multi-site observations. This working session will introduce and explain the current structure used to manage phenotypic data in the global portal, a collaborative project between Bioversity International (Bioversity), the Global Crop Diversity Trust (GCDT) and the Secretariat of the International Treaty on Plant Genetic Resources for Food and Agriculture (Treaty). The development of the global portal is an iterative process that is addressing the need for a global information system referred to in Article 17 of the Treaty. In addition to introducing the current structure and design, the working session should identify some of the minimum method and location metadata descriptors required to support flexible phenotypic data. The current prototype global portal contains 1.2 million accession records for the 22 ITPGRFA Annex 1 crops in addition to some 3 million phenotypic data records, so it provides a substantial background on which to further develop standards for publishing phenotypic PGRFA data.
Data integration enabling an ecosystem approach for the management of genetic resources - integrating multilayer databases - Management of animal, aquatic, microorganism and plant genetic resources Nicolas Bailly (TBC) Integrating multilayer databases for the management of animal, aquatic, microorganism and plant genetic resources.
Crop trait Ontology Rosemary Shrestha, Elizabeth Arnaud The Crop Ontology is a recent project launched by CGIAR centers that attempts to model the knowledge needed to describe a crop trait in a given observation environment, for specific characters and measurements. Collaboration with related ontologies is necessary, including ENVO, the Plant Ontology, the Trait Ontology (Cornell University), the Crop Wild Relative ontology and the FAO agriculture ontology. Ontologies are used in applications for the biocuration of databases. Presentations will cover the Crop Ontology, the Ontology Lookup Service, ENVO (the Environment Ontology) and Terminizer (Dept. of Computer Science, University of Manchester), a tool that assists in detecting ontological terms in free text. The discussion will cover: the challenges of developing a crop trait ontology and of motivating communities of practice (breeders, collection curators, etc.) to apply it; the patterns of collaboration between the ontologies relating to the knowledge domain concerned; the application of the ontology in databases and data entry wizards; and the use of tools for dynamic ontology curation and development. Participants will actively share their experience and suggestions.
Biocuration in Agricultural related databases - Standards and tools for Data quality process Elizabeth Arnaud The newly founded International Society for Biocuration (ISB) defines biocuration as follows: "Biocuration involves the translation and integration of information relevant to biology into a database or resource that enables integration of the scientific literature as well as large data sets. Accurate and comprehensive representation of biological knowledge, as well as easy access to this data for working scientists and a basis for computational analysis, are primary goals of biocuration. The goals of biocuration are achieved thanks to the convergent endeavors of biocurators, software developers and researchers in bioinformatics. Biocurators provide essential resources to the biological community such that databases have become an integral part of the tools researchers use on a daily basis for their work." Ontologies are a fundamental tool for biocurators to annotate data and tag literature in order to link these pieces of information and enrich the data sets. The notion of biocuration originated with the public molecular databases. What curation processes are then possible for agricultural databases? Participants will share their experience of curating databases and other resources.
Agricultural biodiversity informatics: Research infrastructure - Development of a research infrastructure for agricultural biodiversity informatics Éamonn Ó Tuama CANCELLED
Wildlife disease and veterinary informatics: Integrating Animal Health and Biodiversity Informatics Standards Josh Dein, James Case, Jeff Wilcke The purpose of this session is to bring together individuals engaged in data standards development for veterinary medicine with those who are interested in the inclusion of disease related information in biodiversity databases. We will begin with two broad overview presentations, one each in medical and biodiversity informatics, to introduce the basic concepts and tools in each area to those in attendance who are unfamiliar with the other discipline. They will set the stage for a working group session for more detailed discussions on the likely mechanisms for integration of standards and systems. Hopefully, this will lead to longer term collaborations through one or more of the existing TDWG Interest Groups. [Must take place by Wed. PM at the latest.]
Species-related databases, information systems and inventories of cultivated and useful plants Michel Chauvet Many attempts have been made in the past to compile online and printed inventories of useful and/or cultivated plant species. They differ in thematic and geographical scope, and in database structure. In an era when sustainable development is high on the political agenda, a comprehensive system of information about the plant resources of the world is badly needed. It should encompass agronomical as well as botanical data and cover all kinds of uses. Such a challenge can be met only through a collaborative effort that involves sharing tasks, implementing standards, promoting interoperability and exchanging data sets. An overview will be given of the various existing inventories and information sources dealing with useful and/or cultivated plant species. The working session is aimed at discussing ways and means to improve the situation and better meet the needs of users, and at completing the overview of existing databases and inventories.
Indigenous knowledge Doyle McKey Indigenous knowledge about biodiversity can make important contributions to its conservation. This is particularly so for domesticated plants and animals: people created their biodiversity, and the dynamic management practised by traditional farmers continues to shape it. Ever since the 1992 Rio “Earth Summit”, there has been a mandate to value, to conserve and to use indigenous knowledge about biodiversity. This entails circulation and sharing of this knowledge. The global use of local knowledge presents epistemological, methodological and ethical challenges. Indigenous knowledge is part of a system that is logical in its local context. Can locally pertinent knowledge be transcribed and used in other contexts? Can local knowledge systems yield independent, transportable pieces of information? Indigenous knowledge can be viewed as intellectual property. What constitutes informed consent from its holders that it be circulated and shared? For whose use are databases of indigenous knowledge intended, and for what purposes? What are the effects of the process of obtaining, circulating and using this knowledge on the local systems that have generated it?
(1) Standards for plant traits, (2) DarwinCore germplasm extension, (3) Crop trait ontology: joint discussion Theo Van Hintum Joint discussion about the outcomes of the three previous sessions: Crop trait ontology, standards for plant traits and DarwinCore germplasm extension. Participants of the three sessions are welcome to attend this session to share their feedback and views, and to draw up suggestions and recommendations on standards for plant traits and on the potential for creating an interest group in TDWG.

Theme #3: Data Integration

Item Owner Description
The TDWG ontology (Closed Session - by invitation: Planning) Donald Hobern Preparation for the open working session "Development of TDWG ontology"
Development of TDWG ontology (Open Session) Donald Hobern The goal of biodiversity informatics is to liberate information from all sources to support any activity dependent on understanding the world's biodiversity. The complexity of this task, encompassing significant heterogeneity both in information sources and in use cases, indicates the need for clear communication on the contents and origins of each data set. This communication depends on shared understanding of the subject domain and the use of consistent ways to model and present data. Over recent years TDWG has been working to develop a general ontology for biodiversity data incorporating domain knowledge from existing TDWG standards. We need to progress this work to provide a common shared model for biodiversity projects to exchange and integrate data. Such a model was identified at e-Biosphere as one of the major requirements within international biodiversity informatics. These sessions will explore the current state of the TDWG ontology, provide a forum for discussion of requirements and issues, and allow TDWG to progress this work towards its presentation as a draft standard.
The TDWG ontology (Closed Session - by invitation: Roadmap) Donald Hobern Follow-up of the open working session "Development of TDWG ontology"
GBIF LSID-GUID Task Group Greg Riccardi, Éamonn Ó Tuama Outcomes of the GBIF LSID-GUID Task Group
Biological Descriptions Interest Group/Species Profile Model Cyndy Parr, Gregor Hagedorn Part 1: Summary descriptions about species: e.g. Species Profile Model.

SPM is being developed to enable sharing and integration of high level summaries about the natural history, life history, and other biology of organisms. Several speakers will be asked to share their experiences to date using the SPM on projects such as EOL and Plazi. Recommendations for changes to SPM will be made, and discussed, and an action plan for proposal as a standard will be developed.

Part 2: Identification keys, software, platforms, and future directions.

The workshop will start with brief presentations and project updates from various projects or organisations involved in the creation and management of biological descriptions. The main part will be devoted to general discussions about future directions, where synergies between projects could be found, and what the future direction of the BDI group in general is.
Invasive Species Interest Group - Inserting/testing GISIN models in GBIF IPT Annie Simpson, Jim Graham, Michael Browne The Global Invasive Species Information Network (GISIN) provides a platform to share invasive species information via the Internet and other digital means. This working session will provide a brief overview of the GISIN's TAPIR-compliant system for cross-searching disparate invasive species information systems on the Web, review user needs, and break up into subgroups to perform some of the following tasks (depending upon participants' interests and skills): 1) brainstorm all the problems we think data providers will have and then talk about how to help them; 2) map the GISIN protocol elements to the GBIF Integrated Publishing Toolkit; 3) brainstorm about which of the available online information systems, based on their content and IT configuration, would be the highest priority to add as data providers; and 4) determine which user manuals are needed and begin to outline their content. The various results of the session will be published on the site at GISIN.org and on the TDWG invasive species wiki.
Wiki publishing workshop Gregor Hagedorn Most web publishing and data management frameworks are centered on the presentation and management needs of big organizations. While there is a need for this, it doesn't reflect the traditional, peer-based publishing and recognition system in science. Most scientists are at the mercy of commercial publishers, with few options to employ open content licenses like Creative Commons. Formats like taxon mini-reviews are particularly ill served by the conventional publishing industry.

"Web 2.0" is successful with peer-based, self-organizing systems with minimal hierarchy. A general prejudice is that this approach has little value as data. While largely true for the word processor-to-PDF workflow, this is not a necessity. The object-oriented wiki approach offers free-form as well as structured and reusable data (e.g. in DBpedia and Open Data Linking projects). Semantic MediaWiki could even bring the benefits of the semantic web and RDF-based ontologies within the reach of many biologists.

The workshop will show examples of integrating free-form text with wiki-document based structured data for identification keys to show the potential of the wiki technology.
Phylogenetic Nomenclature and RegNum Development Nico Cellinese, Torsten Eriksson, Kate Rachwal The PhyloCode is a formal set of rules governing phylogenetic nomenclature. The Code is designed to name the parts of the tree of life by explicit reference to phylogeny. The emphasis of the workshop is to provide participants the opportunity to learn about the different types of phylogenetic definitions and how they are constructed. The PhyloCode requires that phylogenetic definitions are registered in a public repository. RegNum (http://regnum.ebc.uu.se/) is a web-based name registration database that serves as the repository of clade names and phylogenetic definitions. Current development in Ruby-on-Rails and integration with other resources such as TOLKIN (www.tolkin.org) and TreeBASE (www.treebase.org) will be presented and discussed with the participants. This workshop is an activity of the TDWG Interest Group on Phylogenetics Standards (http://wiki.tdwg.org/Phylogenetics).
Name Matching Workshop - Discuss ways to achieve name matching, look at possible integration of these services Dave Remsen Two main areas of focus: 1. Matching authorship: we have a tool that does it but haven't tied it to our new generation of name recognition tools; 2. Names discovery: finding novel names consisting of higher taxa, genera and epithets that are not in an existing lexicon.
Roadmap for integrating and scaling geospatial biodiversity data Reed Beaman, Javier de la Torre The ability to integrate large scale geospatial data with a broad range of biodiversity information poses a current challenge to the informatics community. Multiple international efforts (some collaborative) are engaged in developing tools and resources for data access, management, analysis, and visualization. There is an ongoing need to maintain an international forum for discussion and a longer-term roadmap that integrates with efforts of e-Biosphere, GBIF, EOL, WCMC, GEO BON, and a host of national and regional efforts. Many of these are not focused on geospatial data, but on all aspects of biodiversity science, yet they share a common need for expertise and applications that handle the complexities of geospatial data integration. GBIF is hosting a "Strategic Applications" workshop in September 2009, in which Javier de la Torre is a participant, that presents several projects and initiates a discussion of geospatial data requirements. The TDWG conference provides an ideal venue for continuing and broadening participation in this discussion and establishing task groups that can collaborate on further developing use cases, ontologies, and implementing scalable geospatial tools, middleware, and resources that can benefit the biodiversity community.
Harnessing the long tail: small biodiversity data publishers Vishwas Chavan Discuss the tools, standards and processes needed to create a hassle-free environment for small publishers.
Multimedia Resources Metadata Schema Vishwas Chavan, Robert Morris The Multimedia Resources Metadata schema ("MRTG schema") is a set of representation-neutral metadata vocabularies for describing biodiversity-related multimedia resources and collections. The MRTG standard is the culmination of work on multimedia resource descriptions carried out by participants from Key To Nature, the NBII Digital Image Library, MorphBank, and others, together with input from a number of other stakeholder communities including Encyclopedia of Life (EOL), the Biodiversity Heritage Library (BHL) and UMASS-Boston. The Global Biodiversity Information Facility (GBIF) commissioned the 'Multimedia Resources Task Group (MRTG)' in March 2008 and the Group was approved in December 2009 by Biodiversity Information Standards (TDWG) as the 'Joint GBIF-TDWG Task Group on Multimedia Resources in Biodiversity'. The standard was developed by the Joint GBIF-TDWG Multimedia Resources Task Group to fit with the suite of data standards being developed on behalf of GBIF by TDWG. During this session we intend to discuss the schema, its usefulness and possible improvements, before it is ready for formal ratification by TDWG.
Global Names Architecture Rich Pyle Almost all information related to biodiversity is, in one way or another, associated with a scientific name. For more than two and a half centuries, biologists have assigned formal scientific names to organisms as a way to facilitate communication. It is often suggested that scientific names are the "glue" that binds all biodiversity information. This session will include a description of the emerging "Global Names Architecture" (GNA), an effort to establish a common infrastructure for cross-linking biodiversity datasets through scientific names; as well as its two primary data components, the Global Names Index (GNI), and the Global Names Usage Bank (GNUB). The majority of this session will be reserved for open discussion about the scope and implementation of GNA, GNI and GNUB, and the services needed to put them to effective use for linking existing datasets via scientific names.
Linked Literature Chris Freeland Discussion & demonstration of how nomenclators can connect into a literature resource like Biodiversity Heritage Library.
Annotations of Biodiversity Records and Datasets James Macklin, Paul Morris, Robert Morris Annotations of data records serve a number of important functions. Annotations may signal opinions that records or datasets have been superseded by other data, that they contain logical or statistical inconsistencies either internally or with respect to other data, or simply that there are related data which consumers may find relevant to their use of the given data. Annotations thus form an additional kind of record-level metadata, in a form which can be provided by third parties, and which itself can be annotated, giving rise to what amounts to a digitized discussion of the primary data.

The session will consist of approximately 4-5 brief presentations from projects that already have in place demonstrable software that addresses any of the above purposes of annotations or any other purposes as may plausibly meet the informal notion of a digitized discussion of primary data. At the end of the session, the moderator will summarize common or otherwise important points, and will lead a discussion designed to lead to a proposal to form a TDWG Annotation Interest Group.
Infrastructure for storage and exchange Phil Cryer, Anthony Goddard Discuss hardware and software systems required for global, redundant storage to facilitate data exchange and integration.
Prioritizing digitalisation and adding value to collections using collection metadata Thierry Bourgoin Objectives of the working session are:
1) To publicise the need for collection metadata for prioritizing - and therefore optimizing - the digitalisation effort of collection specimens, and to stress their importance in adding value to collections by facilitating access to primary data.
2) To set up a metadata capture template, and
3) To discuss and select the various fields documenting these metadata while keeping the template as simple as possible.
This session should take place within the framework of the reflection carried out by the GBIF Task Group on the Global Strategy and Action Plan for the Digitisation of Natural History Collections (GSAP-NHC) and will serve to investigate this new concept further, particularly how to implement it more concretely.
Four short communications of 10 minutes each are planned:
- Why is collection digitalisation important? Thierry Bourgoin
- The collection metadata concept. Walter Berendsohn
- How to proceed? James Macklin
- Setting up a template - link with the GBIF Global Biodiversity Resources Discovery System (GBRDS) and GBIF Metadata Catalogue. Vishwas Chavan
These will be followed by an open discussion about the different kinds of fields that will have to be tracked and which standards they should follow. The session should conclude with a series of concrete recommendations that will serve pilot projects already identified during the Leiden meeting of the Society for the Preservation of Natural History Collections, 6-11 July 2009, to start testing the concept under real conditions. The session would also address the question of whether the ‘Natural Collections Descriptions’ or any other schema could be adopted or needs specific alterations.
Phylogenetics VoCamp Nico Cellinese, Karen Cranston, Hilmar Lapp, Sheldon MacKay, Enrico Pontelli, Arlin Stoltzfus Integrating diverse biological data with the historical process of evolution is a grand challenge for 21st century biology. A technology infrastructure that can achieve the necessary interoperability of data and software from diverse fields requires formalized, shared vocabularies as one of its key components. Developing such vocabularies and ontologies is a community project. To this end, the "Phyloinformatics VoCamp" aims to connect previously disparate ontology development efforts, stakeholder communities, and interoperability initiatives with shared objectives. Aside from sharing knowledge, expertise, and best practices, existing ontology resources will be extended in a hands-on manner to improve ontological rigor, semantic richness, and support for reuse. Some participants will also be programming proof-of-concept applications that directly apply the ontologies being developed. The VoCamp is part of the Phylogenetics Standards Interest Group activities, and is sponsored by the National Evolutionary Synthesis Center (NESCent: http://nescent.org), with additional support from TDWG and LIRMM (http://www.lirmm.fr/xml/fr/lirmm.html) (University of Montpellier).

EDIT

Item Leader Description
EDIT tutorial for programmers - CDM Library Andreas Kohlbecker The workshop addresses programmers interested in the technology of the EDIT Platform for Cybertaxonomy and how it can be deployed for building new or extending existing applications. In particular, the EDIT CDM-library as well as its associated web-service layer will be introduced.
EDIT tutorial for users (e.g. Taxonomic Editor, specimen search, Geo-tools) Pepe Ciardelli The EDIT User workshop gives a practical introduction to the different user tools and applications associated with the EDIT Platform for Cybertaxonomy ranging from setting up CDM stores and using the Taxonomic Editor to specific tools for searching specimen data as well as creating distribution maps.
EDIT Scratchpads tutorial Dave Roberts In this session we will provide a developer-level overview of the Scratchpad project, including information on hosting a Scratchpad server and on extending Scratchpad functionality through modules, including integration with other projects. The session will take the form of an overview presentation, followed by a question and answer session on the following topics:

- Scratchpad server installation (source, RPM or DEB)
- Code repositories and module installation/updates (http://drupal.org, SVN, http://home.scratchpads.eu)
- Site installation profiles
- Taxonomy management
- Mirroring
- Future directions (ViBRANT - Virtual Biodiversity Research and Access Network for Taxonomy)

Other topics

Item Leader Description
Literature Group: continue working on standard development Anna Weitzman During the 2009 Literature working session, interested delegates will meet to work on four topics:

A. Literature citations standards
a. Finalize and approve existing draft standards.
b. Decide on a process to move them to RDF (this needs someone who knows RDF).
c. Decide on a timeline and responsibilities for entering them into the formal TDWG Standards Process.

B. Taxonomic literature content standard (a proposed standard for delivery of taxonomic literature -- retrospective and prospective).
a. Final decision on a model for the standard (i.e., an extension to the NLM DTD or a free-standing standard in XML and/or RDF; is RDF even relevant for this kind of standard?).
b. Draft standard during meeting. Set timeline and responsibilities for future work.

C. Vocabulary standards for disambiguation of components of taxonomic literature
a. Which existing (prior) TDWG standards to update and deliver electronically, and how (Authors of Plant Names; Botanico-periodicum-huntianum and Botanico-periodicum-huntianum Supplementum; Index Herbariorum; and Taxonomic Literature, ed. 2 and its Supplements (14 volumes total)).
b. Additional standards needed, proposals for building and delivering those vocabularies.
c. A strategy and timeline for moving the above forward and for creating and delivering the disambiguation services needed in taxonomy.

D. Discussion of progress on existing literature projects (e.g., BHL, INOTAXA, Plazi, and others as requested).
Intellectual property rights on databases - Issues and solutions (Creative Commons, Science Commons, etc.) Maxime Thibon CANCELLED
Discussion of frameworks, workflows, and processes in relation to analysis of biodiversity data Paul Flemons A number of large biodiversity informatics projects currently under way around the world, such as EDIT and ALA, will be incorporating sophisticated spatial data analysis tools for use in biodiversity research and assessment. It would be very useful to have a standard framework or approach for implementing these tools so that components could be shared effectively and efficiently between projects. Though standards exist for various parts of the process for these tools (such as data inputs and transfer, e.g. Darwin Core, WFS, WMS, WCS), there is no comprehensive standard or group of standards for the architectural frameworks or protocols that could enable such sharing of components between projects.

This working group will provide an opportunity for participants to present for 5 to 10 minutes on work they are doing or planning to do, or on issues and problems that they would like to address during the workshop. The aims of the workshop will then be finalised, and informal discussion will be used to explore ideas, opportunities and issues encountered by participants in developing spatial analysis tools. At the very least the working group will provide a forum for robust exchange of ideas and experience. An optimal outcome would be a draft set of standard components and protocols which would provide a basis for further development of required standards in the coming months and years.

  Last Modified: 15 March 2010