Biodiversity Information Science and Standards: Conference Abstract
Corresponding author: Donat Agosti (agosti@plazi.org)
Received: 19 Sep 2021 | Published: 20 Sep 2021
© 2021 Donat Agosti
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Agosti D (2021) (Re)Discovering Known Biodiversity: Introduction. Biodiversity Information Science and Standards 5: e75491. https://doi.org/10.3897/biss.5.75491
Biodiversity sciences, including taxonomy, are empirical sciences where all results are published in scholarly publications as part of the research life cycle. This creates a corpus of an estimated 500 million printed pages (
Following standard scientific practice, previous publications, specimens, gene sequences, or taxonomic treatments (
However, in today's digital age, each of these kinds of implicit links is an expensive stumbling block to accessing and reusing the referenced data, their parent publications and the data cited therein. Inadequate formats, language and access to taxonomic information were already recognized in 1992 at the Rio Summit (Taxonomic Impediment). The consequences of these impediments are only now becoming obvious, with the realization of the daunting amount of human resources needed to digitally catalogue and index this unknown (undiscoverable and inaccessible) known knowledge, let alone to make the data themselves findable, accessible, interoperable and reusable (FAIR). This is a formidable and complex scientific challenge.
Plazi is taking on this challenge. Its vision is to promote and enable the discovery and liberation of data to transform the unknown known data into digitally accessible knowledge, i.e., to build a digital knowledge base aimed at discovering all the species (and other taxa) we know, and what we know about them. Taxonomic publications, with their highly standardized taxonomic names, taxonomic treatments, treatment citations, material citations and illustrations, are well suited to machine extraction. Together they include the entire catalogue of life with all the discovered species and their synonyms, often tens to hundreds of treatments, and figures that depict the myriad forms that comprise the world’s biodiversity. Once these data are FAIR, they allow bidirectional linking, for example of taxonomic names to the referenced taxonomic treatments or to other digital resources such as gene sequences or digital specimens. At the same time, each datum is an entry point to a wealth of information that human users can follow by clicking the links and that, more importantly, machines can analyse. Here, digitally accessible knowledge will be defined in the context of discovering known biodiversity, together with strategies for approaching the challenge, which will then be detailed in subsequent talks in this symposium.
This symposium is based on Plazi’s ongoing data liberation and discovery, supported by the European Union (e.g. Biodiversity Community Integrated Knowledge Library, BiCIKL), United States (e.g. NIH) and Swiss research funding (e.g. e-BioDiv and the Arcadia Fund), on collaboration with publishers (e.g. Pensoft, Muséum national d'Histoire naturelle, Consortium of European Taxonomic Facilities Publications, the Zenodo repository, Biodiversity Heritage Library), and on data reusers such as the Global Biodiversity Information Facility, Ocellus, Synospecies and openBiodiv. Currently, over 500,000 taxonomic treatments and 300,000 illustrations have been liberated and are accessible through TreatmentBank and the Biodiversity Literature Repository.
Keywords:
data liberation, digitally accessible knowledge, FAIR
Presenting author:
Donat Agosti
Presented at:
TDWG 2021
Funding:
Swiss universities; Arcadia Fund; The BiCIKL project receives funding from the European Union's Horizon 2020 Research and Innovation Action under grant agreement No 101007492