International Union of Biological Sciences

Taxonomic Databases Working Group


TDWG Newsletter

IUBS Taxonomic Databases Working Group Number 10 / June 2000

TDWG Web page: http://www.tdwg.org

Communications

To keep up-to-date with what is happening within TDWG, please look at the web site http://www.tdwg.org for details.

There are also three email lists set up to help communications:

TDWG@USOBI.ORG

for matters of general interest to TDWG members.

See http://usobi.org/archives/tdwg.html on how to join the list.

TDWG-Proc@USOBI.ORG

to discuss the TDWG Standards Process.

See http://usobi.org/archives/tdwg-proc.html on how to join the list, and http://www.tdwg.org/process/tdwg99_blum.html on the reasons for the discussion.

TDWG-SDD@USOBI.ORG

to discuss the Structure of Descriptive Data, helping the TDWG SDD subgroup to analyse the requirements for a new standard for descriptive data based on XML.

See http://usobi.org/archives/tdwg-sdd.html on how to join the list, and read the archives.

Please use these resources to make your views known, and help make TDWG a more effective organisation.

TDWG Meetings

 

TDWG 2000

The annual meeting will be held at the Senckenberg Museum, Frankfurt, Germany from the 10th to the 12th November, 2000. It is being organised by Michael Türkay of the Senckenberg and Walter Berendsohn of the Botanic Garden and Botanical Museum Berlin-Dahlem. The theme of this year's meeting is "Digitizing Biological Collections". For further information, please see http://www.bgbm.fu-berlin.de/tdwg/2000/ or the TDWG web site. Announcements will also be made through mailing lists such as the TDWG list and Taxacom.

TDWG 1999 - The Highlights

The meeting was held from 29th to 31st October, 1999 at the Harvard Herbarium, Cambridge, USA. The full report can be seen at http://www.tdwg.org/rep1999.html.

The main aim of the meeting was to clarify some general standards issues, such as the role of standards in interoperability, when recommendations might be more appropriate, and TDWG's general role in facilitating the development of standards and recommendations. To begin this, Stan Blum, from the California Academy of Sciences, described the different types of standards and suggested ways in which TDWG could be useful in the development of more widely accepted standards, and how the organisation of TDWG could be improved.

John Rumble, from the National Institute of Standards and Technology, and currently President of CODATA, continued this theme presenting the need for standards, and the standards process. He described how CODATA fits into this process, illustrating how it might be useful for TDWG to be associated with CODATA, thus beginning the debate of where TDWG should "sit" institutionally, to be more visible and effective. Walter Berendsohn (Botanischer Garten und Botanisches Museum, Berlin) reported that the Global Plant Checklist project he was involved in was "sponsored" by CODATA and they had found it very useful in bringing people together. But he also wanted the meeting to consider possibilities with GBIF, the Global Biodiversity Information Facility. One of their prime areas of concern was the creation of checklists, and digitisation of databases, and Walter thought it should be possible for TDWG to get secretarial support, and financial help to attend standardisation meetings etc. He asked for permission to raise this with GBIF officials, and there was general agreement that he should go ahead with this.

Moving on to interoperability, Dave Vieglas from the Museum of Natural History, University of Kansas, illustrated the use of a general information retrieval standard to retrieve biological information from very different resources - the Z39.50 search and retrieval protocol, which now forms the basis of a number of projects focussed primarily on collections, and has led to the development of "The Species Analyst", see http://habanero.nhm.ukans.edu/SpeciesAnalyst. Using this tool, it is possible to search several databases at once, and merge the results into applications such as MS Excel.
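The merging step described above can be illustrated with a minimal Python sketch. It does not implement the Z39.50 protocol or reproduce The Species Analyst; it only assumes that each source has already returned its records as simple field/value pairs. The source names, field names and sample records are hypothetical.

```python
import csv
import io


def merge_results(results_by_source):
    """Combine per-source record lists into one list, tagging each row with its source."""
    merged = []
    for source, records in results_by_source.items():
        for record in records:
            row = dict(record)          # copy so the caller's data is untouched
            row["source"] = source
            merged.append(row)
    return merged


def to_csv(rows, fieldnames):
    """Write rows as CSV text; fields a source did not supply are left empty."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow({name: row.get(name, "") for name in fieldnames})
    return buffer.getvalue()


# Hypothetical results from two collection databases with differing fields.
results = {
    "museum_a": [{"species": "Peromyscus maniculatus", "locality": "Kansas"}],
    "museum_b": [{"species": "Peromyscus maniculatus", "locality": "Nebraska", "year": "1987"}],
}

merged = merge_results(results)
csv_text = to_csv(merged, ["source", "species", "locality", "year"])
print(csv_text)
```

The resulting CSV text can be saved to a file and opened in a spreadsheet. Keeping a "source" column preserves the origin of each record after merging.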

There were some presentations on the Taxon by Character data "problem", experienced by users of packages such as DELTA, LucID, PAUP, etc. One aim of the meeting was for TDWG to facilitate a dialogue among the software developers. A subgroup was formed to discuss the "Structure of Descriptive Data" (SDD), and there has been very active discussion on the list server set up for it (http://usobi.org/archives/tdwg-sdd.html).

Stuart Poss (Gulf Coast Research Laboratory, Ocean Springs, MS) reported on Database-related Activities Associated with ICZ XVIII [18th International Zoological Congress]. He noted that zoology was very fragmented and that there had not been an international conference for 40 years. Information on the conference can be seen on the web site at http://lionfish.ims.usm.edu/~musweb/icz_xviii/icz_home.html.

During the TDWG business session, there were reports from the economic botany and geography subgroups, and on TL2. Francisco Pando (Paco) had sent in his report as secretary as he was unable to attend the meeting. After being secretary for 6 years, he was standing down. Everyone agreed that he should be thanked for his efforts, without which TDWG might no longer exist. Peter Stevens reported that he had approached Georgina MacKenzie to be the next secretary, and she had agreed. There was general acceptance of this and she was voted in. Peter Stevens presented the treasurer's report. Walter Berendsohn had agreed to organise the next meeting, but suggested that it should be at Senckenberg Museum in Frankfurt, which would accentuate the wider scope of TDWG.

 

Results of the BioCISE Project

This is an abbreviated version of the article by Walter Berendsohn, which can be seen in full at http://www.tdwg.org/news10/biocise.html.

BioCISE (Resource Identification for a Biological Collection Information Service in Europe) was a concerted action project financed under the 4th framework programme of the European Commission in 1998 and 1999, with 17 participants from 11 countries. The Concerted Action identified, analysed, and catalogued biological collection information resources, recorded interdisciplinary biodiversity database expertise, and analysed user needs for a European Biological Collection Information Service. A concept for the implementation of such a service was developed; and partnerships and proposals were initiated based on the resources identified and in liaison with other organisations, initiatives, and projects.

A large-scale survey of European collections was conducted throughout the project period aimed at the identification of collections, collection databases and biodiversity informatics expertise within the (initially) EU member states and (lately) all potential participants in the 5th Framework Programme, i.e. including the candidate countries for EU membership and the non EU-members among the Northern and Central European countries.

The BioCISE World Wide Web Collection Catalogue provides access to the survey's results. It can presently be queried by country, city, and collection category, as well as by means of a free text search. For 60 % of the laboratories which responded to the survey, the BioCISE Collection Catalogue is the first representation of their collections on the World Wide Web. In addition to the detailed survey results of respondents, all corroborated institutional addresses are accessible.

To support follow-up projects, the BGBM continues to host the WWW site, the database, and the questionnaires (in English, French, and German) for at least one year after the project's conclusion (see http://www.bgbm.fu-berlin.de/biocise/).

A group of members with experience in information modelling has finalised the information model which is to provide a reference model useful for all biological collections. The result was published in a 50-page paper in a major scientific journal (Taxon) in 1999 (see http://www.bgbm.fu-berlin.de/biodivinf/docs/CollectionModel/). In co-operation with TDWG, a bibliography of collection information models and data resources was compiled and published on the WWW (see http://www.bgbm.fu-berlin.de/TDWG/acc/).

The principal objective of the project was to develop a strategy for the implementation of the Biological Collection Information Service and to instigate project proposals within the 5th Framework Programme and national programmes. It became clear that an all-out approach to enable access to collections in Europe on the level of the individual object (unit) is premature. A major challenge lies in the fact that long-term international funding for the service has to be kept low. The present thinking is therefore that three parallel approaches should be made: (1) fomenting collaboration between existing databases, e.g. by creating common interfaces to unit-level databases; (2) implementing an overall, flexible, metadata-driven access system to collections, which facilitates access to existing thematic, national, or regional networks and fills the gaps between them; and (3) fomenting and instigating the creation and extension of such networks, especially on the national level. The BioCISE project instigated and/or contributed to project proposals in all three areas.

 

Automatic compilation of accurate taxonomic databases from multiple non-computerised sources

Botany, like other mature sciences with a descriptive element, has extensive data locked in multiple overlapping natural language texts, e.g. regional floras. The need to make these resources available in structured electronic form is urgent. However, the problem of information extraction from text is amplified by the multiple sources and compounded by inconsistencies between them. For instance, descriptions of the same plant species may be printed in several floras but using differing terminologies. Several parsing methods exist for extracting information from text, but these have a number of limitations. This project aims to overcome these limitations by developing a tool for extracting data using techniques of electronic linguistic analysis that will exploit this redundancy in multiple parallel sources, i.e. the several descriptions of the same taxon. It is expected that the union of the information derived from multiple sources will be both larger and more reliable than information from a single source. Whilst this tool is primarily aimed at biodiversity research, once established it will hopefully be applicable in other sciences. The project is based at the University of Manchester, Department of Computer Sciences, in collaboration with the Natural History Museum, London, and is funded by the BBSRC EPSRC Bioinformatics Initiative. If you would like to know more about the project, look at the website http://www.cs.man.ac.uk/ai/MultiFlora/.
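The idea of exploiting redundancy across parallel descriptions can be illustrated with a small Python sketch. This is not the project's method. It assumes the hard part (parsing the text and reconciling differing terminologies) has already been done, so that each flora yields a simple mapping of character to state. The function names, characters and states are all hypothetical.

```python
from collections import Counter, defaultdict


def combine_descriptions(descriptions):
    """Pool character -> state mappings from several sources.

    Returns a mapping of character -> Counter of the states reported for it.
    """
    combined = defaultdict(Counter)
    for description in descriptions:
        for character, state in description.items():
            combined[character][state] += 1
    return combined


def summarise(combined):
    """For each character, give the best-supported state, its support, and a conflict flag."""
    summary = {}
    for character, states in combined.items():
        state, support = states.most_common(1)[0]
        summary[character] = (state, support, len(states) > 1)
    return summary


# Hypothetical descriptions of the same taxon taken from three floras.
flora_a = {"petal colour": "yellow", "leaf shape": "ovate"}
flora_b = {"petal colour": "yellow", "stem": "erect"}
flora_c = {"petal colour": "pale yellow", "leaf shape": "ovate"}

summary = summarise(combine_descriptions([flora_a, flora_b, flora_c]))
for character, (state, support, conflict) in summary.items():
    print(character, state, support, "conflict" if conflict else "consistent")
```

In this toy example the pooled record covers three characters where each single source covers only two. Agreement between sources gives a rough measure of reliability, and disagreements are flagged for review.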

 

International Plant Names Index is launched

The full version of this article by Eimear Nic Lughadh can be seen at http://www.tdwg.org/news10/ipni.html.

The Plant Names Project consortium has announced the Internet release of the International Plant Names Index (IPNI). This comprehensive listing of over 1.3 million scientific names for seed plants is the product of a collaboration between scientists and programmers at the Royal Botanic Gardens, Kew, UK, the Harvard University Herbaria, USA, and the Centre for Plant Biodiversity Research, Canberra, Australia.

The Plant Names Project has adopted a number of the data standards endorsed by TDWG and new data added to IPNI will conform to these norms. However, retroactive implementation of these standards throughout the existing data is a mammoth undertaking, which is unlikely to be completed for some years. In order to encourage botanists worldwide to assist in this standardization process the PNP team is currently developing contributions software designed to enable users to make corrections to the data as they browse the index. For more information and access to the International Plant Names Index see http://www.ipni.org.

News on Standards

 

TL2

The following is a shortened version of an article written by Richard Pankhurst in March. Please refer to http://www.tdwg.org/news10/tl2.html for the full article.

A database of TL2 has been prepared, initially at the Missouri Botanical Garden and then at the Royal Botanic Garden Edinburgh, with financial support from the US Department of Agriculture (USDA). The accuracy of the data has been carefully checked. The data is being made freely available. The data fields that have been covered represent only a very small part of the information actually given in TL2. The database omits data on herbaria and types, the bibliography and biographical detail, and data on publishers and pagination. Data on various editions is not included unless TL2 gives them separate abbreviations.

In spite of its many virtues, TL2 is incomplete in various ways. The cutoff at 1940 means that many important publications are omitted. Even before that date, TL2 does not include all references to publications containing new taxa by the authors listed, and there are certainly more minor authors who are omitted altogether. The supplements tend to contain a rather higher proportion of reprints, but they cover only up to letter G. The abbreviations for titles have some consistency, but they are not completely consistent and need to be reworked to make them satisfactory.

In view of what has been said above, a number of improvements can be suggested:

- Add the data from the last supplement

- Rework and reissue the abbreviations for titles

- For rare books, show which libraries have copies

- Add all the authors for multiple author works

- Add details of pagination with publication dates, for when a work did not appear all at one time

- Add details for all multiple editions

- Regularise the citation of reprints, with cross reference to the journal standard (BPH/S)

- Provide all references for existing authors, and add minor authors. This might amount to the same thing as completing the series of supplements.

- Add data for modern publications (and keep the catalogue continuously up to date). This might in practice amount to the same thing as cross-referencing TL2 to Index Kewensis (or its modern version, the International Plant Name Index).

- Since so many of the original publications are rare or obscure works, make photo images of the text and distribute in electronic form. This might be seen as a modern way of implementing the microfilm libraries which already exist

- Make TL2 available on CDROM (or DVD-ROM) and/or on the Internet

 

HISCOM/HISPID

Report on HISCOM2000:

The Herbarium Information Systems Committee (HISCOM) recently met in Darwin, Australia (2-5 May 2000), with representatives from all major State and Territory Australian mainland herbaria and invited guests from the Australian Biological Resources Study, CSIRO Publishing, KE Software (all Australia), and LandCare Research (New Zealand).

The major issues discussed by the meeting included:

1. The revision of HISPID3 (as HISPID4). A draft version of HISPID4 is available on the Internet as a searchable database (http://plantnet.rbgsyd.gov.au/Hispid4) or visit the HISCOM website (http://www.rbgsyd.gov.au/HISCOM/). Note: the HISPID4 database does not give examples or more detailed discussion (as present in HISPID3) at this stage. Furthermore, several corrections are yet to be completed.

2. The revision of the prototype Virtual Australian Herbarium (VAH) website. The VAH is a searchable distributed database system that currently searches the accession databases of the major Australian herbaria and presents the information as a species distribution map. Currently, there is only restricted access to this site. Future development of the VAH includes an external view of the data and the inclusion of other flora information, such as descriptions and images.

Barry Conn (Past-Coordinator of HISCOM)

BPH

This is an abbreviated form of the report by Richard Pankhurst, which can be seen in full at http://www.tdwg.org/news10/bph.html.

The Botanico-Periodicum Huntianum (or BPH for short) is a TDWG standard. To quote from the introduction it 'is a compendium on all periodical publications that regularly contain articles dealing with botanical literature'. BPH was first published by the Hunt Botanical Library, Pittsburgh, in 1968, and the Supplementum (BPH/S) in 1991. These two books need to be used together, since the supplement does not repeat some of the data already contained in the first volume, while it also lists all the old titles and adds many new ones as well, and presents new spellings for transliterated titles for several countries. BPH is the obvious standard to use for the titles and abbreviations of journals when compiling plant taxonomic databases. Experience in using BPH with the PANDORA taxonomic database suggests that it is very comprehensive. It is very unusual not to be able to find there the title of a journal containing published plant scientific names. As in the case of TL2 (for botanical books) it is the publications that constitute the standard, and NOT the corresponding database.

The database:

The BPH/S publication was word-processed, and the text file was kindly made available by Bob Kiger at the Hunt Institute. Manual editing to delimit the text into database fields and to remove superfluous periods at the ends of sentences was carried out at the Berlin Botanical Garden (Walter Berendsohn), the Harvard University Herbaria (Jim Beach and Nora Murphy) and at the Royal Botanic Garden, Edinburgh. The database has a single table and each record corresponds to a journal title. It turns out that a few of the titles used for preceding or superseding journals lack their own entries in the table, but these are usually very obscure titles. The main deficiency of the current database is that much of the series data which is included in the 1968 volume has not been entered.

Membership

Use the form included, or contact John Wiersema, TDWG Treasurer, c/o USDA, ARS, SBML, Building 011A, Room 304, BARC-West, Beltsville, MD 20705-2350 USA, or jwiersema@ars-grin.gov for details.

Newsletter

Please send your contributions and ideas to the Newsletter Editor: Georgina MacKenzie,

54 Micklegate, YORK, YO1 6WF, UK

Fax: +44 1904 612793

Email: gmackenzie@york.biosis.org

1999 - 2000 Executive
Chairperson:
 Peter Stevens, Missouri, USA
 peter.stevens@mobot.org
Secretary & Newsletter Editor:
 Georgina MacKenzie, York, UK
 gmackenzie@york.biosis.org
Treasurer:
 John Wiersema, Beltsville, MD, USA
 jwiersema@ars-grin.gov

Members:
 Stan Blum, California, USA
 sblum@calacademy.org

 Gail Kampmeier, Illinois, USA
 gkamp@uiuc.edu

Regional Secretaries:
Asia:
 Junko Shimura, Saitama, JAPAN
 junko@ulmus.riken.go.jp
Europe:
 Walter Berendsohn, Berlin, GERMANY
 wgb@zedat.fu-berlin.de
Latin America:
 [position vacant]
North America:
 David Boufford, Cambridge, MA, USA
 boufford@oeb.harvard.edu
Oceania:
 Barry Conn, Sydney, AUSTRALIA
 barry@rbgsyd.gov.au


Last Updated: February 5, 2003