Biodiversity Information Science and Standards : Conference Abstract
Supporting Essential Biodiversity Variables: The GLOBIS case study
Lee Belbin‡, Donald Hobern§,|
‡ Atlas of Living Australia, CSIRO, Canberra, Australia
§ Global Biodiversity Information Facility, Copenhagen, Denmark
| International Barcode of Life, Canberra, Australia
Open Access

Abstract

Essential Biodiversity Variables (EBVs) are the latest push toward supporting state of the environment indicators (Pereira et al. 2013). The European Union-funded Creative-B project (see https://cordis.europa.eu/project/rcn/100345/brief/en) outlined the status of, and a strategy for, interoperability between what it termed Biodiversity Research Infrastructures (BRIs), such as the Global Biodiversity Information Facility (GBIF), the Atlas of Living Australia (ALA) and Integrated Digitized Biocollections (iDigBio). Toward the end of that project, the group decided that a logical follow-on project should position BRIs to support the production of EBVs. This idea became the GLOBal Infrastructures for Supporting Biodiversity research (GLOBIS-B) project (http://www.globis-b.eu). This presentation summarizes a GLOBIS-B case study on generating EBVs (Hardisty et al. 2019).

As a part of GLOBIS-B, I suggested that a small team of GLOBIS-B members should document, in detail, each step in the production of an EBV from GBIF and ALA data for a few invasive species. We wanted to address the rarity of detailed recording and justification of each step in the production of a dataset for environmental evaluation. I anticipated that the team would encounter many practical issues, but this case study raised far more significant issues than any of us had anticipated.

The EBV chosen for this study was Area of Occupancy (IUCN Standards and Petitions Subcommittee 2017), and the species selected represented various invasion scenarios: Acacia longifolia, Vespula germanica and Bubulcus ibis. The workflow included 20 steps between locating data and publishing an EBV, and these steps differed radically between GBIF and the ALA. Some steps were manual, such as resolving the invasive status of the Acacia longifolia subspecies, only one of which was ‘invasive’. Datasets of occurrence records had to be exported from the ALA and GBIF to enable filtering for purpose because, for example, not all Darwin Core terms are exposed in the current public interface of the ALA. After record filtering, the ALA and GBIF datasets required merging and deduplication, for which one-off code had to be written.
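
The merge, deduplication and Area of Occupancy steps can be illustrated with a minimal Python sketch. It is not the one-off code written for the case study. The duplicate key (taxon, date and coordinates rounded to four decimal places) and the use of a spherical Lambert cylindrical equal-area projection are assumptions made for illustration, and the function names are hypothetical. Only the Darwin Core terms scientificName, eventDate, decimalLatitude and decimalLongitude are used, and the 2 km x 2 km cell size follows the IUCN reference scale.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; spherical approximation
CELL_KM = 2.0                # IUCN reference scale: 2 km x 2 km grid cells


def dedup_key(rec):
    """Assumed duplicate rule: same taxon, date and rounded coordinates."""
    return (
        rec["scientificName"].strip().lower(),
        rec.get("eventDate", ""),
        round(float(rec["decimalLatitude"]), 4),
        round(float(rec["decimalLongitude"]), 4),
    )


def merge_and_deduplicate(*sources):
    """Merge record lists, keeping the first record seen for each key."""
    seen = {}
    for source in sources:
        for rec in source:
            seen.setdefault(dedup_key(rec), rec)
    return list(seen.values())


def to_equal_area_km(lat, lon):
    """Lambert cylindrical equal-area projection on a sphere (km)."""
    x = EARTH_RADIUS_KM * math.radians(lon)
    y = EARTH_RADIUS_KM * math.sin(math.radians(lat))
    return x, y


def area_of_occupancy_km2(records):
    """Count occupied 2 km x 2 km cells and multiply by the cell area."""
    cells = set()
    for rec in records:
        x, y = to_equal_area_km(
            float(rec["decimalLatitude"]), float(rec["decimalLongitude"])
        )
        cells.add((math.floor(x / CELL_KM), math.floor(y / CELL_KM)))
    return len(cells) * CELL_KM ** 2
```

Calling merge_and_deduplicate(ala_records, gbif_records) and passing the result to area_of_occupancy_km2 gives an estimate in square kilometres. A production workflow would more likely use a projection suited to the study region and a duplicate rule agreed with the data providers.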

A few of the 15 significant messages from this study were: a lack of consistency of data between BRIs (e.g., GBIF records should be a superset of ALA records); a lack of consistency and adequacy in filtering tools between BRIs; exported data structures differed massively between BRIs; and automation of the workflows may be possible, but many manual intervention steps were required. By my figuring, the case study took approximately 10 times longer than anticipated, but the message to BRIs was clear: consistency and adequacy of data and tools require urgent work.
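
The superset expectation can be checked with a short Python sketch. It assumes that the Darwin Core occurrenceID is preserved unchanged when ALA records are shared with GBIF, which this abstract does not confirm, and the function name is hypothetical.

```python
def missing_from_superset(subset_records, superset_records, key="occurrenceID"):
    """Return records of the expected subset that the superset lacks.

    Records without an identifier are also returned, because they
    cannot be matched against the superset.
    """
    superset_ids = {rec[key] for rec in superset_records if rec.get(key)}
    return [rec for rec in subset_records if rec.get(key) not in superset_ids]
```

An empty result from missing_from_superset(ala_records, gbif_records) would indicate that every ALA record in the export is also present in the GBIF export.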

Keywords

EBVs, workflow, filtering, data repositories, indicators, invasive

Presenting author

Donald Hobern

Presented at

Biodiversity_Next 2019

References
