Proceedings of TDWG, 2007

LSID and TCS deployment in the Catalogue of Life

Richard John White, Andrew C Jones, Ewen R Orme

Abstract


This paper describes a project to add support for Life Sciences Identifiers (LSIDs) and the Taxon Concept Schema (TCS) to the Annual and Dynamic Checklists assembled and delivered by the Catalogue of Life (CoL) partners, Species 2000 and ITIS. We plan to improve the compatibility of the protocols and public software interfaces used by Species 2000 with TDWG standards. We wish to increase the usefulness of the CoL to users, including GBIF, in three ways: by improving the CoL’s compatibility with other biodiversity tools, by supplying its information to clients expressed as taxon concepts, and by enhancing interoperability between data providers and consumers by means of LSIDs referring to these concepts. We hope this will increase the use of TDWG standards, accelerate LSID deployment and the uptake of TCS, help providers and users ascribe data unambiguously to specified taxon concepts, and speed the growth of shared biodiversity data resources.

At Cardiff University we are investigating approaches for adding LSID and TCS support to the CoL and implementing them in evaluation versions of its systems. We have implemented a new prototype of the Annual Checklist that issues LSIDs for taxon concepts, and have established a resolution service that supports the use of these LSIDs by returning provisional RDF/TCS responses generated from the Annual Checklist.
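
For readers unfamiliar with the identifier format, the following minimal Python sketch shows how a client might split an LSID into its components before passing it to a resolution service. It relies only on the general LSID URN layout (urn:lsid:authority:namespace:object, with an optional revision). The authority "example.org", the namespace "taxa" and the function name parse_lsid are placeholders chosen for illustration; they are not the identifiers or software used by the CoL prototype.

```python
import re
from typing import NamedTuple, Optional


class Lsid(NamedTuple):
    """Components of an LSID of the form urn:lsid:authority:namespace:object[:revision]."""
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str]


# The "urn:lsid:" prefix is matched case-insensitively; each component must be
# non-empty and may not contain a colon or whitespace.
_LSID_RE = re.compile(
    r"^urn:lsid:([^:\s]+):([^:\s]+):([^:\s]+)(?::([^:\s]+))?$",
    re.IGNORECASE,
)


def parse_lsid(text: str) -> Lsid:
    """Split an LSID string into its parts (hypothetical helper, not CoL code)."""
    match = _LSID_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a well-formed LSID: {text!r}")
    authority, namespace, object_id, revision = match.groups()
    # Assumption: the authority is a domain name, so it is normalised to lower
    # case; the remaining components are kept exactly as supplied.
    return Lsid(authority.lower(), namespace, object_id, revision)


if __name__ == "__main__":
    # Placeholder identifier, not a real CoL LSID.
    print(parse_lsid("urn:lsid:example.org:taxa:12345:2007"))
```

A check of this kind is purely syntactic: it says nothing about whether the authority actually resolves the identifier or what metadata is returned.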

We are developing modified Spice protocols and a new Spice software prototype to provide LSIDs and TCS data in response to Web Service requests and to receive any name or taxon concept LSIDs from data providers. A new version of one of the data providers is being implemented for this purpose. We will develop a validation tool to check that the data and responses are valid, correctly structured and internally consistent. We plan to complete the project by the end of December 2007.

The Species 2000 Secretariat in Reading is assisting in this project. Its responsibilities are to survey the needs, capabilities and preferences of data providers and users in the light of these demonstration systems; to deploy the enhanced Spice software in the CoL global and European regional hubs; to use the validation tool and other means to perform testing and quality assurance of the data served; and to assist the CoL partners to agree a plan for the introduction of LSIDs.

The updated Spice protocol, documentation and enhanced Spice software will be available for use by other projects to build species information systems for their own purposes and to create regional hubs which can be linked to the CoL, both to enhance its usefulness in those regions and to help set up new global data providers.

Planning and carrying out this project has raised a number of interesting questions, some to be resolved during the project, others for wider consideration and future research. They include the choice of which kinds of entity will be identified by LSIDs (including names as well as taxon concepts); how users (human or software) will obtain LSIDs for entities of interest; how any GUIDs (not necessarily LSIDs) that data providers supply will be propagated through the CoL; users’ expectations concerning tasks that LSIDs might assist, including navigating the taxonomic hierarchy and linking data to taxa; and the role of CoL LSIDs in building the biodiversity information systems of the future.

Further information about this project and its progress, updated periodically, is available at http://spice.cs.cf.ac.uk/lsid/