Biodiversity Information Science and Standards: Conference Abstract
Corresponding author: Erica Krimmel (ekrimmel@fsu.edu)
Received: 28 Sep 2020 | Published: 02 Oct 2020
This is an open access article distributed under the terms of the CC0 Public Domain Dedication.
Citation:
Krimmel E, Mast A, Paul D, Bruhn R, Rios N, Shorthouse DP (2020) Rapid Creation of a Data Product for the World's Specimens of Horseshoe Bats and Relatives, a Known Reservoir for Coronaviruses. Biodiversity Information Science and Standards 4: e59067. https://doi.org/10.3897/biss.4.59067
Genomic evidence suggests that the causative virus of COVID-19 (SARS-CoV-2) was introduced to humans from horseshoe bats (family Rhinolophidae) (
The project underscores the value of the biodiversity data aggregators iDigBio and the Global Biodiversity Information Facility (GBIF), which, as of July 2020, are sources for 58,617 and 79,862 records, respectively, of specimens of horseshoe bats and their relatives held by over one hundred natural history collections. Although much of the specimen-based biodiversity data served by iDigBio and GBIF is of high quality, it can be considered raw data and therefore often requires additional wrangling, standardizing, and enhancement to be fit for specific applications. The project will create efficiencies for the coronavirus research community by producing an enhanced, research-ready data product, which will be versioned and published through Zenodo, an open-access repository (see https://doi.org/10.5281/zenodo.3974999).
In this talk, we highlight lessons learned from the initial phases of the project, including deduplicating specimen records, standardizing country information, and enhancing taxonomic information. We also report on our progress to date in enhancing information about agents (e.g., collectors or determiners) associated with these specimens and in georeferencing specimen localities. In addition, we seek to explore the extent to which the added agent information (i.e., ORCID iDs and Wikidata Q identifiers) can inform our georeferencing efforts and support crediting those who collect and identify specimens. The project will georeference approximately one third of our specimen records, namely those that lack geospatial coordinates but contain textual locality descriptions.
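To make the kinds of steps named above more concrete, the following is a minimal Python sketch, not the project's actual workflow. It assumes that records are plain dictionaries keyed by Darwin Core term names (institutionCode, collectionCode, catalogNumber, country, locality, decimalLatitude, decimalLongitude), that duplicates across aggregators can be detected by a normalized catalog triplet, and that country names can be mapped through a small lookup table. The function names, the lookup table, and the sample records are all hypothetical and exist only for illustration.

```python
from typing import Dict, Iterable, List, Tuple

Record = Dict[str, str]

# Hypothetical synonym table; a real workflow would use a fuller authority list.
COUNTRY_SYNONYMS = {
    "viet nam": "Vietnam",
    "vietnam": "Vietnam",
    "p.r. china": "China",
    "china": "China",
}


def record_key(rec: Record) -> Tuple[str, ...]:
    """Build a comparison key from the normalized catalog triplet (an assumed heuristic)."""
    return tuple(
        (rec.get(term) or "").strip().lower()
        for term in ("institutionCode", "collectionCode", "catalogNumber")
    )


def deduplicate(records: Iterable[Record]) -> List[Record]:
    """Keep the first record seen for each catalog triplet."""
    seen: Dict[Tuple[str, ...], Record] = {}
    for rec in records:
        seen.setdefault(record_key(rec), rec)
    return list(seen.values())


def standardize_country(rec: Record) -> Record:
    """Return a copy of the record with its country mapped to a preferred spelling."""
    raw = (rec.get("country") or "").strip()
    cleaned = dict(rec)
    cleaned["country"] = COUNTRY_SYNONYMS.get(raw.lower(), raw)
    return cleaned


def needs_georeferencing(rec: Record) -> bool:
    """True when a record has a textual locality but lacks one or both coordinates."""
    has_lat = bool((rec.get("decimalLatitude") or "").strip())
    has_lon = bool((rec.get("decimalLongitude") or "").strip())
    has_locality = bool((rec.get("locality") or "").strip())
    return has_locality and not (has_lat and has_lon)


# Made-up sample records standing in for downloads from two aggregators.
idigbio_sample = [
    {"institutionCode": "XYZ", "collectionCode": "Mamm", "catalogNumber": "001",
     "country": "Viet Nam", "locality": "limestone cave near river",
     "decimalLatitude": "", "decimalLongitude": ""},
]
gbif_sample = [
    {"institutionCode": "xyz", "collectionCode": "Mamm", "catalogNumber": "001 ",
     "country": "Vietnam", "locality": "limestone cave near river",
     "decimalLatitude": "", "decimalLongitude": ""},
    {"institutionCode": "XYZ", "collectionCode": "Mamm", "catalogNumber": "002",
     "country": "P.R. China", "locality": "forest edge",
     "decimalLatitude": "25.0", "decimalLongitude": "102.7"},
]

merged = [standardize_country(r) for r in deduplicate(idigbio_sample + gbif_sample)]
to_georeference = [r for r in merged if needs_georeferencing(r)]
```

In this sketch the first record seen for a given triplet wins, which is a simplification; a production workflow would likely compare the duplicate records and keep the most complete values from each.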
We furthermore provide an overview of our holistic approach to enhancing specimen records, which we hope will maximize the value of the bat specimens at the center of what has been recently termed the "extended specimen network" (
Keywords:
natural history collection, COVID-19, biodiversity informatics
Presenting author:
Erica Krimmel
Presented at:
TDWG 2020
We would like to acknowledge our partners Pam Soltis at the University of Florida, Nancy Simmons at the American Museum of Natural History, and Nathan Upham at Arizona State University, as well as all of the collections professionals involved in curating and digitizing the specimens included in our data.
National Science Foundation (NSF) Program: COVID-19 Research