Biodiversity Information Science and Standards: Conference Abstract
Corresponding author: Pieter Huybrechts (pieter.huybrechts@plantentuinmeise.be)
Received: 06 Sep 2021 | Published: 07 Sep 2021
© 2021 Pieter Huybrechts, Maarten Trekels, Quentin Groom
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Huybrechts P, Trekels M, Groom Q (2021) Estimating the Completeness of Preserved Collections in Representing Global Biodiversity. Biodiversity Information Science and Standards 5: e74032. https://doi.org/10.3897/biss.5.74032
There are an estimated 8.7 million eukaryotic species globally, and knowledge of those organisms is organised around their scientific names and the specimens we hold of those species (
Dealing with incomplete sampling of sites that is neither homogeneous nor random is a common issue in many ecological studies (
Nevertheless, running these calculations on such large datasets requires Big Data analytic tools. GBIF contains 1.8 billion observations, which amount to 120 GB of compressed data. These data can be interrogated in the cloud or locally using tools such as Galaxy, which has made it possible to process large numbers of records in a single batch. We can now evaluate the biodiversity within collections, break the results down by taxon and geographical region, and compare them with one another.
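As an illustration only, the per-group evaluation described above could be sketched as follows. The abstract does not name the extrapolation method, so the bias-corrected Chao1 richness estimator is used here purely as an assumption. The function names, the `(group, species)` record layout, and the toy data are hypothetical, and a real analysis of GBIF-scale data would need a batch or streaming framework rather than in-memory Python.

```python
from collections import Counter, defaultdict


def chao1(species_counts):
    """Bias-corrected Chao1 lower-bound estimate of species richness.

    species_counts: number of specimens per observed species.
    (Assumption: Chao1 stands in for whichever estimator the authors used.)
    """
    counts = [c for c in species_counts if c > 0]
    s_obs = len(counts)                      # species actually observed
    f1 = sum(1 for c in counts if c == 1)    # species seen exactly once
    f2 = sum(1 for c in counts if c == 2)    # species seen exactly twice
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


def completeness_by_group(records):
    """Estimate completeness per group from (group, species) pairs.

    Each record represents one specimen; `group` can be any hashable key,
    for example a (taxon, region) tuple.
    """
    per_group = defaultdict(Counter)
    for group, species in records:
        per_group[group][species] += 1

    result = {}
    for group, counter in per_group.items():
        s_obs = len(counter)
        s_est = chao1(counter.values())
        result[group] = {
            "observed": s_obs,
            "estimated": s_est,
            "completeness": s_obs / s_est,   # s_est >= s_obs >= 1
        }
    return result


if __name__ == "__main__":
    # Toy data with hypothetical species labels, one entry per specimen.
    plants = ["a", "b", "c", "d", "d", "e", "e", "e", "e", "e"]
    fungi = ["x", "x", "x", "y", "y", "y", "y"]
    records = [(("Plantae", "Europe"), s) for s in plants]
    records += [(("Fungi", "Europe"), s) for s in fungi]

    for group, stats in completeness_by_group(records).items():
        print(group, stats)
```

Because the group key is an arbitrary tuple, the same function can split results by taxon, by region, or by both, which is what allows groups to be compared with one another.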
Ultimately, this work will allow individual collections and consortia to evaluate their coverage of biodiversity and help them better target their collecting strategies.
Keywords: specimen, natural history, big data, GBIF, extrapolation
Presenting author: Pieter Huybrechts
Presented at: TDWG 2021
This work was facilitated by the Research Foundation – Flanders research infrastructure under grant number FWO I001721N.