Biodiversity Information Science and Standards: Conference Abstract
Corresponding author: Nina Filippova (filippova.courlee.nina@gmail.com)
Received: 08 Sep 2021 | Published: 10 Sep 2021
© 2021 Nina Filippova, Dmitry Ageev, Sergey Bolshakov, Olga Vayshlya, Anastasia Vlasenko, Vyacheslav Vlasenko, Sergei Gashkov, Irina Gorbunova, Eugene Davydov, Elena Zvyagina, Nadezhda Kudashova, Maria Tomoshevich, Aleksandra Filippova, Natalia Shabanova, Lidia Yakovchenko, Irina Vorob'eva, Ludmila Kalinina, Ekaterina Palomozhnykh
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Filippova N, Ageev D, Bolshakov S, Vayshlya O, Vlasenko A, Vlasenko V, Gashkov S, Gorbunova I, Davydov EA, Zvyagina E, Kudashova N, Tomoshevich M, Filippova A, Shabanova N, Yakovchenko L, Vorob'eva I, Kalinina L, Palomozhnykh E (2021) The Fungal Literature-based Occurrence Database in Southern West Siberia (Russia). Biodiversity Information Science and Standards 5: e74178. https://doi.org/10.3897/biss.5.74178
This abstract presents an initiative to develop the Fungal Literature-based Occurrence Database for Southern West Siberia (FuSWS), which mobilizes occurrences of fungi from published literature (literature-based occurrences, Darwin Core MaterialCitation). The FuSWS database includes 28 fields describing the species name, publication source, herbarium number (if any), date of sampling or observation, locality information, vegetation, substrate, and others.
The initiative on digitization of literature-based occurrence data started in the northern part of Western Siberia two years ago (
The project is now growing in geographic coverage, collaboration and volume of accumulated data. A working group of about 30 mycologists from 16 organizations, dedicated to the digitization initiative, was created within the Siberian Mycological Society (an informal organization since 2019). The group has compiled the most complete bibliographic list of mycology-related papers for Southern West Siberia, comprising over 800 publications from the last two centuries (the earliest dated 1800). At the time of abstract submission, the database held about 10,000 records from about 100 sources. The dataset is published in GBIF, where it is available for online search of species occurrences and for download (
Screenshot of the GBIF dataset page showing about 10,000 digitized literature-based records of fungi for the regions of Southern West Siberia.
The following protocol describes the digitization workflow in detail:
The bibliography of related publications is compiled using the Zotero bibliographic manager. Only published works (peer-reviewed papers, conference proceedings, PhD theses, monographs or book chapters) are selected. Where possible, the sources are digitized and added to the library as PDF files.
The FuSWS database template is built in Google Sheets, which lets several specialists work simultaneously in a common data format. A simple Microsoft Excel template is also available for offline data entry. The database field structure follows the Darwin Core standard to accommodate the relevant information extracted from the publications.
From the available bibliography of publications related to the region, only works that contain species occurrences are selected for databasing. The main source of occurrences is annotated species lists with exact record localities. Other kinds of species citations are also extracted, provided that they are linked to some geographic location.
All occurrences are georeferenced, either from the coordinates provided in the paper or from the verbatim description of the fieldwork locality. Verbatim descriptions are georeferenced using Yandex or Google map services. Depending on the quality of the georeference provided in the publication, the uncertainty is estimated as follows: 1) a coordinate of a fruiting structure or a plot given in the publication yields an uncertainty of about 3-30 m; 2) a coordinate of the fieldwork locality given in the publication yields an uncertainty of about 500 m to 5 km; 3) a report of species presence in a particular region is assigned the centroid of the area, with an uncertainty radius large enough to include its borders.
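The three uncertainty classes above can be illustrated with a minimal Python sketch. The function names, the class labels and the choice of the upper bound of each stated range are assumptions made for illustration and are not part of the FuSWS protocol. The region centroid is approximated here as the mean of the boundary vertices, which is a simplification that ignores polygon area and the antimeridian.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def region_centroid_and_radius(boundary):
    """Approximate centroid of a region outline and a radius that covers its borders.

    boundary: list of (lat, lon) vertices (hypothetical input format).
    Returns (lat, lon, radius_m), where radius_m is the distance from the
    vertex mean to the farthest vertex.
    """
    lat = sum(p[0] for p in boundary) / len(boundary)
    lon = sum(p[1] for p in boundary) / len(boundary)
    radius = max(haversine_m(lat, lon, p[0], p[1]) for p in boundary)
    return lat, lon, radius


def uncertainty_m(level, boundary=None):
    """Uncertainty in metres for one of three illustrative class labels.

    Assumption: the upper bound of each range quoted in the text is used.
    """
    if level == "specimen":   # coordinate of a fruiting structure or plot
        return 30.0
    if level == "locality":   # coordinate of the fieldwork locality
        return 5000.0
    if level == "region":     # presence reported for a whole region
        if not boundary:
            raise ValueError("a region outline is required for the 'region' class")
        return region_centroid_and_radius(boundary)[2]
    raise ValueError(f"unknown georeference class: {level}")
```

The returned value would typically go into the Darwin Core «coordinateUncertaintyInMeters» field, and the centroid into «decimalLatitude» and «decimalLongitude».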
Locality names reported in Russian are translated into English and written in the «locality» field. The Russian descriptions are preserved in the «verbatimLocality» field for accuracy.
When possible, the «eventDate» is extracted from the annotation data. When this information is absent, the publication date is used instead, with a remark in the «verbatimEventDate» field.
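The date fallback rule can be sketched as a small Python function. The function name, the argument names and the wording of the remark are hypothetical; the protocol only states that the publication date is substituted and that a remark is left in «verbatimEventDate».

```python
def resolve_event_date(annotation_date, publication_year):
    """Return the «eventDate» and «verbatimEventDate» values for one record.

    annotation_date: date string extracted from the species annotation, or None.
    publication_year: year of the source publication.
    The remark text below is an illustrative assumption, not the project's wording.
    """
    if annotation_date:
        # A date was reported with the record: use it as is.
        return {"eventDate": annotation_date, "verbatimEventDate": ""}
    # No date in the annotation: fall back to the publication year and say so.
    return {
        "eventDate": str(publication_year),
        "verbatimEventDate": "date not reported; publication year used",
    }
```

Keeping the remark in a separate field lets data users filter out records whose date reflects the publication rather than the observation.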
Ecological features, habitat and substrate preferences are recorded in the «habitat» field and kept in Russian.
The original scientific names reported in publications are entered in the «originalNameUsage» field. Spelling errors are corrected using the GBIF Species Matching tool. This tool is also used to create additional fields for the taxonomic hierarchy from species to kingdom, to fill in the «taxonRank» field, and to resolve synonyms according to the GBIF Backbone Taxonomy.
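A scripted counterpart of this name-matching step might look like the following sketch. It assumes the public GBIF species match endpoint (`https://api.gbif.org/v1/species/match`) and the JSON keys `matchType`, `scientificName`, `rank`, `status` and the rank names used below; the function names and the `kingdom` hint are illustrative assumptions. The protocol itself refers to the GBIF Species Matching tool, not to this script.

```python
import json
import urllib.parse
import urllib.request

GBIF_MATCH_URL = "https://api.gbif.org/v1/species/match"

# Rank keys copied from the match response; "species" follows the GBIF
# response naming and is not itself a Darwin Core term.
RANK_KEYS = ("kingdom", "phylum", "class", "order", "family", "genus", "species")


def build_match_url(name, kingdom="Fungi"):
    """Build the request URL for one verbatim name (kingdom hint is an assumption)."""
    query = urllib.parse.urlencode({"name": name, "kingdom": kingdom})
    return f"{GBIF_MATCH_URL}?{query}"


def fetch_match(name):
    """Query GBIF for one name and return the decoded JSON response (needs network)."""
    with urllib.request.urlopen(build_match_url(name), timeout=30) as response:
        return json.load(response)


def taxonomy_fields(original_name, match):
    """Combine the verbatim name with the taxonomy found in a match response."""
    record = {"originalNameUsage": original_name}
    if match.get("matchType", "NONE") == "NONE":
        # No match found: keep only the verbatim name for manual review.
        return record
    record["scientificName"] = match.get("scientificName", "")
    record["taxonRank"] = match.get("rank", "").lower()
    record["taxonomicStatus"] = match.get("status", "").lower()
    for key in RANK_KEYS:
        if key in match:
            record[key] = match[key]
    return record
```

A typical call would be `taxonomy_fields(name, fetch_match(name))` for each verbatim name; fuzzy matches are best reviewed by a specialist before the corrected name is accepted.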
To track the digitization process, a worksheet is maintained. Each bibliographic record has a series of fields describing the digitization process and its results: the total number of extracted occurrence records, a general description of occurrence quality, the presence of the observation date, details of georeferencing, and the name of the person responsible for the digitization.
Keywords: materialCitation, fungi, digitization, biodiversity data mobilization, GBIF
Presenting author: Nina Filippova
Presented at: TDWG 2021