Proceedings of TDWG, 2008

Australian Faunal Directory (AFD) and Australian Plant Census (APC): Content, Architecture and Services

Greg Whitbread, Helen Thompson, Matthew Hand

Abstract


The Australian Faunal Directory (AFD), the Australian Plant Census (APC) and the Australian Plant Name Index (APNI) are the most complete information resources available online for the taxonomy and nomenclature of Australian plants and animals. They are of high quality, authoritative, and widely accepted as the single point of truth for Australian taxa. Together they contribute the nomenclatural and taxonomic core for the Atlas of Living Australia (ALA).

The databases are housed within the Department of the Environment, Water, Heritage and the Arts (DEWHA) and maintained by the Australian Biological Resources Study (ABRS) and the Australian National Botanic Gardens/Centre for Plant Biodiversity Research (ANBG/CPBR). APC operates under the auspices of the Council of Heads of Australian Herbaria (CHAH).

In partnership with the ALA, we are integrating AFD and APC/APNI at the data level and building the services required for distribution using standard protocols and forms. The aim is to deliver vertical slices of the data set to client systems in a way that maintains data quality and allows extracts to be updated in place. In the process, we will improve data management, add Life Science Identifiers (LSIDs), increase usability, implement web services over the Taxon Concept Schema (TCS) and support the TAPIR protocol for interoperability. The work is designed to position the application architecture to provide taxonomic and nomenclatural services over the existing dataset and to create the platform for a collaborative infrastructure supporting all aspects of taxon profile development.

The foundation for these services is a generic layer supporting XML-RPC, SOAP and REST that can handle all of the protocols we are asked to support (including TAPIR, OAI-PMH, SRW, LSID and XQuery) as well as data models specified using the core ontology. The implementation should provide a path to participation in the Semantic Web.

A preliminary iteration uses an XML Java framework to mimic the operation of existing providers, implementing TAPIR-to-SQL translation over Hibernate-like mappings between provider objects and the underlying database. However, TCS is inherently more complex than other TDWG schemas and is not well suited to SQL processing, so the resulting TAPIR solution demands considerable dedicated code for each service and data model supported. The resulting system is not a generic solution.

Our current implementation takes a very generic approach, using eXist-DB (an XML database framework) to provide the required services, query processing and database layers. The strong procedural capability of the XQuery language, and the ease with which TAPIR queries can be translated to XPath expressions, reduce TAPIR implementation to near-declarative simplicity. Support for our mandatory protocols is pluggable, using easy-to-write translation modules. Code is reduced to a minimum. All required services are bundled, semantic technologies are supported, and the integrated XSLT capabilities offer benefits for local content delivery. Importantly, direct publication of the production database to XML further simplifies support for multiple schemas by separating data from provider deployment.
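
To illustrate why translating a filter to XPath can be close to declarative, the following is a minimal sketch in Python. It is not the AFD/APC implementation. The tuple encoding of the filter is a simplified stand-in for a TAPIR filter, and the concept names, element paths and function names are all hypothetical. Mapping "like" to contains() is also a simplification, since wildcard handling is ignored.

```python
# Hypothetical sketch: the tuple encoding, concept names and element paths
# are illustrative assumptions, not the AFD/APC implementation.

# Assumed mapping from concept identifiers to paths inside a stored record.
CONCEPT_PATHS = {
    "name": "Name/Simple",
    "rank": "Rank/@code",
}


def quote(value):
    """Quote a string literal for XPath 1.0, which has no escape characters."""
    if '"' not in value:
        return '"' + value + '"'
    if "'" not in value:
        return "'" + value + "'"
    # Both quote kinds are present: split on double quotes and rebuild
    # the string with concat().
    parts = value.split('"')
    return "concat(" + ", '\"', ".join('"' + p + '"' for p in parts) + ")"


def to_xpath(node):
    """Translate one filter node, given as a nested tuple, to an XPath predicate."""
    op = node[0]
    if op == "equals":
        _, concept, literal = node
        return f"{CONCEPT_PATHS[concept]} = {quote(literal)}"
    if op == "like":
        # Simplification: substring match only, no wildcard handling.
        _, concept, literal = node
        return f"contains({CONCEPT_PATHS[concept]}, {quote(literal)})"
    if op in ("and", "or"):
        return "(" + f" {op} ".join(to_xpath(child) for child in node[1:]) + ")"
    if op == "not":
        return f"not({to_xpath(node[1])})"
    raise ValueError(f"unsupported operator: {op}")


def filter_to_query(root_path, node):
    """Attach the translated predicate to the path that selects records."""
    return f"{root_path}[{to_xpath(node)}]"


query = filter_to_query(
    "//TaxonConcept",
    ("and", ("like", "name", "Acacia"), ("equals", "rank", "sp")),
)
print(query)
```

Each filter operator corresponds to a single expression template, so under these assumptions supporting an additional operator or data model means adding a template or a path mapping rather than writing new query-generation code.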

This solution is simple, elegant and portable.