Biodiversity Information Science and Standards: Conference Abstract
Corresponding author: David Peter Shorthouse (davidpshorthouse@gmail.com)
Received: 30 Sep 2020 | Published: 09 Oct 2020
This is an open access article distributed under the terms of the CC0 Public Domain Dedication.
Citation:
Shorthouse DP, Pender J, Rabeler R, Macklin JA (2020) Digitization of US Herbaria - How close did we get to the 2020 goal? Biodiversity Information Science and Standards 4: e59166. https://doi.org/10.3897/biss.4.59166
A discussion session held at a National Science Foundation-sponsored Herbarium Networks Workshop at Michigan State University in September of 2004 resulted in a rallying objective: make all botanical specimen information in United States collections available online by 2020.
Now that it is 2020, it is appropriate to examine progress toward the objective of making all US botanical specimen collection data available online. Our presentation will attempt to answer several questions:
Because we are interested in the success of both the Global Biodiversity Information Facility (GBIF) and Integrated Digitized Biocollections (iDigBio), and because one of these two initiatives is overwhelmingly likely to be the entry point for anyone seeking US-based botanical data, we approached these questions by first crafting a repeatable data download and processing workflow in early July 2020. This yielded 25.6M records of plants, fungi, and Chromista from 216 datasets available through GBIF and 32.8M comparable records from 525 recordsets available through iDigBio. We attempted to align these seemingly discordant sets of records and chose the Darwin Core terms best suited to match the four hierarchical levels of digitization defined in the Minimal Information for Digital Specimens (MIDS) (
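As a rough illustration of how Darwin Core terms might be matched to hierarchical digitization levels, the sketch below assigns each record the highest level for which all required terms are populated. The level numbering (0 to 3) and the specific term-to-level mapping in `MIDS_TERMS` are assumptions made for this example only; they are not the mapping used in the analysis described here, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical mapping of Darwin Core terms to four hierarchical levels.
# Illustrative only: the terms actually chosen for each level are not
# given in this abstract.
MIDS_TERMS = {
    0: ["institutionCode", "catalogNumber"],
    1: ["scientificName"],
    2: ["recordedBy", "eventDate", "country"],
    3: ["decimalLatitude", "decimalLongitude"],
}


def is_populated(value):
    """True if a field holds something other than None or blank text."""
    return value is not None and str(value).strip() != ""


def mids_level(record):
    """Return the highest level whose terms, and those of every lower
    level, are all populated; return -1 if level 0 is not met."""
    level = -1
    for lvl in sorted(MIDS_TERMS):
        if all(is_populated(record.get(term)) for term in MIDS_TERMS[lvl]):
            level = lvl
        else:
            break  # levels are hierarchical, so stop at the first gap
    return level


def tally_levels(records):
    """Count how many records reach each level."""
    return Counter(mids_level(r) for r in records)
```

Because the levels are hierarchical, a record with coordinates but no scientific name would still be scored at level 0 under this sketch, which is one reason the choice of terms per level matters.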
While analyzing and comparing the datasets, we found several cases where the number of records from an institution seemed much lower than expected. Analysis of record content in GBIF and iDigBio, combined with consultation of regional and taxonomic portals, made it evident that, in addition to some datasets being included in only one of GBIF or iDigBio, a significant number of records in regional and taxonomic portals had not yet been made available through either aggregator.
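A comparison of this kind can be sketched as a per-institution tally across the two aggregators. The example below assumes that records are plain dictionaries and that a normalized Darwin Core `institutionCode` is an adequate key for alignment; in practice, aligning GBIF datasets with iDigBio recordsets likely needs more than this, and the function names are hypothetical.

```python
from collections import Counter


def count_by_institution(records):
    """Tally records per institutionCode, ignoring case and whitespace."""
    counts = Counter()
    for record in records:
        code = str(record.get("institutionCode") or "").strip().upper()
        if code:
            counts[code] += 1
    return counts


def compare_coverage(gbif_records, idigbio_records):
    """Summarize record counts per institution in each aggregator and
    flag institutions that appear in only one of them."""
    gbif = count_by_institution(gbif_records)
    idigbio = count_by_institution(idigbio_records)
    summary = {}
    for code in sorted(set(gbif) | set(idigbio)):
        if gbif[code] and not idigbio[code]:
            only_in = "gbif"
        elif idigbio[code] and not gbif[code]:
            only_in = "idigbio"
        else:
            only_in = None
        summary[code] = {
            "gbif": gbif[code],
            "idigbio": idigbio[code],
            "only_in": only_in,
        }
    return summary
```

Institutions flagged in `only_in`, or showing a large gap between the two counts, would be candidates for a manual check against regional or taxonomic portals.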
Progress on digitization has benefited greatly from the US National Science Foundation's creation of the Advancing Digitization of Biodiversity Collections (ADBC) program, and funding of the 15 Thematic Collection Networks (TCN). The launching of new projects and the ensuing digitization of herbarium collections have led to a multitude of new specimen portals and the enhancement of existing software like Symbiota (
Keywords: digitization, aggregator, MIDS
Presenting author: David Peter Shorthouse
Presented at: TDWG 2020