Proceedings of TDWG, 2008

A standard information transfer model scopes the ontologies required for observations

Simon J.D. Cox, Laurent Lefort

Abstract


A standard information model and data transfer encoding for observations and measurements (O&M) has been issued by the Open Geospatial Consortium (OGC), and is on track for publication by ISO. Based on an analysis by Fowler and Odell, an observation is modeled as an event whose result is an estimate of the value of some property of the feature of interest. The terminology matches the "feature" meta-model used in the geospatial community, but can be applied to non-spatial investigations, and so provides a general vocabulary for scientific observations. The core model has been validated in geology, geophysics, chemistry, water resources, climate science, environmental monitoring, security and intelligence, and taxonomy. An auxiliary model provides for the description of the sample design, in terms of sampling points, curves, sections and specimens. Adoption of a standard terminology supports cross-discipline data interoperability, which is important for many studies in the natural sciences and for problem solving in the natural environment.
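The core pattern described above can be sketched as a small data structure. This is a minimal illustration only: the class and field names (`Feature`, `Observation`, `feature_of_interest`, `observed_property`, `procedure`, `time`, `result`) and the sample values are assumptions chosen for readability, not the element names of the OGC encoding.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any


@dataclass(frozen=True)
class Feature:
    """The feature of interest: the thing the observation is about."""
    identifier: str
    feature_type: str  # hypothetical vocabulary term, e.g. "OrganismOccurrence"


@dataclass(frozen=True)
class Observation:
    """An observation event whose result estimates a property value."""
    feature_of_interest: Feature
    observed_property: str  # term from an observable-property vocabulary
    procedure: str          # term or reference describing how it was observed
    time: datetime          # when the observation event took place
    result: Any             # the estimated value of the property


def describe(obs: Observation) -> str:
    """Render the observation as 'property of feature = result'."""
    return (
        f"{obs.observed_property} of "
        f"{obs.feature_of_interest.identifier} = {obs.result!r}"
    )
```

For example, an observation of the taxon of a hypothetical occurrence `occ-1` would carry the occurrence as its feature of interest, `"taxon"` as the observed property, and the taxon name as its result.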

The model provides a structure for observation data. Domain specialization requires the development and management of vocabularies and ontologies to be used as values for elements within a data instance. These include:

• Domain feature types (e.g. organism, organism occurrence, ecosystem)
• Observable properties (e.g. location, taxon, size, frequency)
• Observation procedures
• Scales and reference systems for observation results (including taxonomies)
• Sampling-feature relationship types (e.g. part-whole, manifolds, networks and topology)
• Specimen preparation procedures

The first two are tightly linked: The observed property must be associated with the type of the feature of interest. Strict application of the model may require the domain ontology to include previously unrecognized feature types. For example, the notion of Organism Occurrence, with observable properties place, time and taxon, is useful in the taxonomic community.
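The link between feature types and observable properties can be illustrated with a simple lookup. The sketch below assumes a hypothetical in-memory vocabulary; only the Organism Occurrence entry reflects the example in the text, and the `Organism` entry is invented for contrast.

```python
# Hypothetical mapping from domain feature type to its observable properties.
FEATURE_TYPE_PROPERTIES = {
    "OrganismOccurrence": {"place", "time", "taxon"},  # example from the text
    "Organism": {"size", "taxon"},                     # assumed for illustration
}


def is_valid_property(feature_type: str, observed_property: str) -> bool:
    """Return True if the property is defined for the given feature type."""
    allowed = FEATURE_TYPE_PROPERTIES.get(feature_type, frozenset())
    return observed_property in allowed
```

A check of this kind makes the constraint explicit: an observation of `size` on an Organism Occurrence would be rejected unless the domain ontology is extended to allow it.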

The other aspect of domain specialization is to establish patterns or extensions for sampling features. For example, ecosystem surveys may involve both spatial sampling in the field using quadrats and ex-situ observations on specimens. The sampling feature model may be used as-is, but additional domain requirements have led to specialization in at least two applications: the Climate Science Modelling Language and the GeoSciML Borehole model.
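The survey example can be sketched as sampling features that each point back to the domain feature they sample. The names here (`SamplingFeature`, `sampled_feature`, `related_to`, `sampling_features_for`) are illustrative assumptions rather than elements of the standard sampling model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SamplingFeature:
    """A proxy feature (quadrat, specimen, ...) used to observe a domain feature."""
    identifier: str
    kind: str                         # e.g. "quadrat" or "specimen"
    sampled_feature: str              # identifier of the domain feature sampled
    related_to: Optional[str] = None  # e.g. the quadrat a specimen came from


def sampling_features_for(
    domain_feature_id: str, features: List[SamplingFeature]
) -> List[str]:
    """List identifiers of sampling features that sample the given domain feature."""
    return [f.identifier for f in features if f.sampled_feature == domain_feature_id]
```

Under these assumptions, a quadrat laid out in an ecosystem and a specimen collected from that quadrat both resolve to the same ecosystem, so in-field and ex-situ observations can be brought together.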

Ontology-strengthened profiles complement the standardization of the generic data structures and facilitate interoperability within as well as across communities. The challenge here is to coordinate the efforts of multiple communities and to factor in governance arrangements that can produce a consistent framework rather than a series of independently developed vocabularies. Adopting the O&M skeleton defines the scope of these ontologies. The skeleton can therefore facilitate the coordination of efforts at the inter-community level and lead to the construction of more stable ontologies and vocabularies, supporting the development of observational data service standards and of multi-disciplinary applications.