Biodiversity Information Science and Standards : Conference Abstract
Name IDs and Name Matching for Catalogue of Life: Existing Services and Prospects
Olaf Bánki‡,§, Markus Döring|,§, Thomas S. Jeppesen|
‡ Naturalis Biodiversity Center, Leiden, Netherlands
§ Catalogue of Life / Species 2000, Weesp, Netherlands
| Global Biodiversity Information Facility, Copenhagen, Denmark
Open Access

Abstract

ChecklistBank, developed by Catalogue of Life (COL) and the Global Biodiversity Information Facility (GBIF), is a publishing platform and open data repository focused on taxonomic and nomenclatural datasets (checklists). It contains close to 50,000 datasets, mostly originating from digitised, peer-reviewed scientific articles mediated by Plazi, amongst others. The COL Checklist (Bánki et al. 2023) is assembled from a selection of the data sources in ChecklistBank. Each version of the COL Checklist is issued with name usage identifiers, as well as a digital object identifier (DOI) and an associated dataset key. The more than 160 data sources that make up the COL Checklist are each issued with a DOI and a dataset key as well. The combination of a name usage identifier and a dataset key allows names to be tracked across COL Checklist versions. ChecklistBank is built on an open API. It supports data sharing through various exchange formats of the Darwin Core (Darwin Core Task Group 2009) data standard (e.g., Darwin Core Archives (GBIF 2021) and ColDP), and provides several download and name matching options.
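The role of the identifier pair described above can be illustrated with a minimal sketch. It assumes that a name usage identifier stays the same across releases for the same name usage; the dataset keys and identifiers below are hypothetical, and the code is not taken from ChecklistBank itself.

```python
from typing import Iterable, NamedTuple


class UsageRef(NamedTuple):
    """A name usage pinned to one dataset, e.g. one COL Checklist release."""
    dataset_key: int  # key of the release (hypothetical values in the usage note)
    usage_id: str     # name usage identifier within that release


def compare_releases(old: Iterable[UsageRef], new: Iterable[UsageRef]) -> dict:
    """Classify usage identifiers as retained, added, or removed between two releases.

    Assumption: an identifier is reused across releases for the same name usage.
    """
    old_ids = {ref.usage_id for ref in old}
    new_ids = {ref.usage_id for ref in new}
    return {
        "retained": old_ids & new_ids,
        "added": new_ids - old_ids,
        "removed": old_ids - new_ids,
    }
```

For example, with two hypothetical releases keyed 1001 and 1002, `compare_releases([UsageRef(1001, "A"), UsageRef(1001, "B")], [UsageRef(1002, "B"), UsageRef(1002, "C")])` reports "B" as retained, "C" as added, and "A" as removed.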

The European Union-funded project Transforming European Taxonomy through Training, Research, and Innovations (TETTRIs) will contribute several improvements to ChecklistBank. Within TETTRIs, a new name usage (i.e., taxon or synonym) matching service was developed that works against any dataset in ChecklistBank, not just the COL Checklist. The single-name matching service takes query parameters for one name and, optionally, its classification. A bulk matching service matches many names against the ChecklistBank API at once; the names, optionally with their classification, are supplied in a CSV file. Alternatively, instead of an uploaded CSV file, all names of an existing ChecklistBank dataset, or of a subtree of it, can act as the source of names. The bulk matching services are asynchronous and notify the user by email when the results are ready to be downloaded as a CSV file.
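As a rough sketch of how a client might prepare input for the two matching modes described above: the base URL, the endpoint path, and the parameter and column names (`scientificName`, `kingdom`, `family`) are assumptions made for illustration, not confirmed by this abstract, and should be checked against the ChecklistBank API documentation. The code only builds a request URL and a CSV payload; it performs no network calls.

```python
import csv
import io
from urllib.parse import urlencode

# Assumed base URL and endpoint layout; verify against the API documentation.
API_BASE = "https://api.checklistbank.org"


def build_match_url(dataset_key, scientific_name, **classification):
    """Build a single-name match URL for one dataset.

    `classification` holds optional higher ranks (parameter names are assumptions).
    """
    params = {"scientificName": scientific_name}
    # Sort the optional ranks so the resulting URL is deterministic.
    params.update({k: v for k, v in sorted(classification.items()) if v})
    return f"{API_BASE}/dataset/{dataset_key}/match/nameusage?{urlencode(params)}"


def names_to_csv(rows, columns=("scientificName", "kingdom", "family")):
    """Serialise names and their optional classification to CSV text for bulk matching."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        # Missing classification values are written as empty cells.
        writer.writerow({c: row.get(c, "") for c in columns})
    return buf.getvalue()
```

A call such as `build_match_url(1234, "Puma concolor", kingdom="Animalia")` uses a hypothetical dataset key; because the bulk service is asynchronous, a real client would upload the CSV and then wait for the email notification rather than block on a response.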

Keywords

taxonomy, ChecklistBank, ColDP, data standards

Presenting author

Markus Döring

Funding program

Transforming European Taxonomy through Training, Research, and Innovations (TETTRIs) is an EU-funded project with grant number 101081903.

Conflicts of interest

The authors have declared that no competing interests exist.

References
