Proceedings of TDWG, 2007

TDWG Standards in use within the Global Biodiversity Information Facility (GBIF) Data Portal

Tim Robertson

Abstract


This presentation will give a high-level overview of the Biodiversity Data Portal (http://data.gbif.org) offered by the Global Biodiversity Information Facility (GBIF, http://www.gbif.org). The process of harvesting, parsing, and efficiently serving data for graphical user interface (GUI) tools and reporting services will be covered, illustrating the heavy dependency on TDWG standards. The mechanism employed to normalise incoming data from various formats will also be explained. This will highlight a use for a Universal Biodiversity Data Bus, a common set of standards for publishing, discovering and accessing data across the Internet.

From this overview, non-technical participants will gain insight into the data flow involved, some of the limitations faced, and the importance of TDWG formats when processing data. It is expected that this will form a good basis for subsequent technical discussions relating to the Universal Biodiversity Data Bus.

The data within the GBIF network is collated using Distributed Generic Information Retrieval (DiGIR), the Biological Collection Access Service for Europe (BioCASE), and the TDWG Access Protocol for Information Retrieval (TAPIR). These protocols all encapsulate various versions of DwC (Darwin Core 2) and Access to Biological Collections Data (ABCD). The data is served to the public through the new GBIF Data Portal in many forms, including DwC and the Taxonomic Concept Schema (TCS), and employing Life Science IDentifiers (LSIDs).
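
As a rough illustration of the normalisation step mentioned above, the following Python sketch maps records arriving in two different formats onto one common structure. It is not the portal's implementation: the `OccurrenceRecord` type, the `normalise` function, the `FIELD_MAPS` table and the choice of three fields are all assumptions made for this example, and the source field names are only loosely modelled on DwC and ABCD element names.

```python
from dataclasses import dataclass

# Hypothetical mapping from source field names to a common set of fields.
# The source names are loosely modelled on DwC and ABCD elements; the real
# portal's mappings are not described in this abstract.
FIELD_MAPS = {
    "dwc": {
        "InstitutionCode": "institution",
        "CatalogNumber": "catalogue_id",
        "ScientificName": "scientific_name",
    },
    "abcd": {
        "SourceInstitutionID": "institution",
        "UnitID": "catalogue_id",
        "FullScientificNameString": "scientific_name",
    },
}


@dataclass(frozen=True)
class OccurrenceRecord:
    """Common (assumed) representation used after normalisation."""
    institution: str
    catalogue_id: str
    scientific_name: str


def normalise(raw: dict, source_format: str) -> OccurrenceRecord:
    """Convert one parsed record from a known source format to the common form."""
    mapping = FIELD_MAPS[source_format]  # KeyError for an unknown format
    fields = {target: raw[source].strip() for source, target in mapping.items()}
    return OccurrenceRecord(**fields)


# Two records describing the same specimen, as harvested in different formats.
dwc_record = {
    "InstitutionCode": "XYZ",
    "CatalogNumber": "12345",
    "ScientificName": "Puma concolor ",
}
abcd_record = {
    "SourceInstitutionID": "XYZ",
    "UnitID": "12345",
    "FullScientificNameString": "Puma concolor",
}

same = normalise(dwc_record, "dwc") == normalise(abcd_record, "abcd")
```

Once both records are in the common form they compare equal, which is the property that lets downstream GUI tools and reporting services ignore the format in which a record was originally harvested.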