Integrated Open Taxonomic Access (INOTAXA) Pilot
Anna Weitzman, Christopher Lyal, Cynthia Sims Parr, Farial Shahnaz
Abstract
Both taxonomists and those who need taxonomic information require greater access to material held in natural history museums and similar large biological repositories and their libraries. These repositories hold a wealth of inadequately accessible resources that describe and explain the diversity and complexity of life on earth. Mining these data for purposes such as research, conservation, drug discovery, protected area management, disease control, education, and enjoyment of the natural world is difficult and time-consuming, and often leads to redundant effort. What should be a seamless, open “book” of knowledge consists instead of disparate, unintegrated sets of data, some in electronic form but most still on paper, and both published and unpublished.
Information held in museums centers on the following types of biological datasets: specimen collections, taxonomic databases, published taxonomic literature, geographical information systems, and unpublished archival materials. Making these information sources available is part of a larger, worldwide effort to enable easy access to the complete range of data required to understand individual species and their environmental and evolutionary relationships. This will require the establishment of cross-linkages between, and simultaneous access to, datasets from such information sources throughout the world.
As a start on this important task, we are in the preliminary stages of developing the INOTAXA portal, which uses an XML schema, taXMLit, for literature markup. The portal will be a web workspace in which taxonomic descriptions, identification keys, catalogues, names, specimen data, images and other resources can be accessed simultaneously according to user-defined needs. It will use a distributed data model, allowing access to data held on multiple servers. If, in the future, the various nomenclatural Codes permit web publication of new taxonomic names and acts, INOTAXA will be able to integrate single descriptions placed on servers worldwide, so long as they are indexed through a registry such as the one operated by the Global Biodiversity Information Facility (GBIF). The portal will be built on open source software that will be made freely available, so that it can easily be set up at sites worldwide as desired. We will demonstrate the software and solicit feedback on the interface and functionality.