Biodiversity Information Science and Standards : Conference Abstract
Filling Gaps in Earthworm Digital Diversity in Northern Eurasia from Russian-language Literature
Maxim Shashkov, Natalya Ivanova, Sergey Ermolov§
‡ Institute of Mathematical Problems of Biology RAS – the Branch of Keldysh Institute of Applied Mathematics of Russian Academy of Sciences, Pushchino, Russia
§ Center for Forest Ecology and Productivity RAS, Moscow, Russia
Open Access


Data availability for certain groups of organisms (ecosystem engineers, invasive or protected species, etc.) is important for monitoring and making predictions in changing environments. One of the most promising approaches to studying the impact of such changes is species distribution modelling, which depends heavily on high-quality occurrence data (Van Eupen et al. 2021). Earthworms (order Crassiclitellata) are a key group of organisms (Lavelle 2014), but their global distribution is underrepresented in digital resources. Dozens of earthworm species, both widespread and endemic, inhabit Northern Eurasia (Perel 1979), but very few data on them are available through global biodiversity repositories (Cameron 2018). There are two main obstacles to data mobilisation. Firstly, studies of earthworm diversity in Northern Eurasia have a long history (since the end of the nineteenth century) and were conducted by several generations of Soviet and Russian researchers. Most of the collected data have been published in "grey literature", now stored only in a few libraries. Until recently, most of these publications remained undigitised, and some are probably irretrievably lost. Secondly, the taxonomic checklists used by Soviet and European researchers differ, and not all species and synonyms are included in the GBIF (Global Biodiversity Information Facility) Backbone Taxonomy. As a result, existing earthworm species distribution models (Phillips 2019) potentially miss a significant amount of data and may underestimate biodiversity and predict distributions inaccurately. To fill this gap, we collected occurrence data from the Russian-language literature (published by Soviet and Russian researchers) and digitised species checklists, keeping the original scientific names.

To find relevant literature, we ran a keyword search for "earthworms" and "Lumbricidae" in the Russian national scientific online library eLibrary and screened the reference lists of monographs by the leading Soviet and Russian soil zoologist Tamara Perel (Vsevolodova-Perel 1997, Perel 1979). This yielded about 1,000 references, of which 330 papers had titles suggesting that they might contain data on earthworm occurrences. Of these, 219 were found as PDF files or printed papers. For dataset compilation, 159 papers were used; the others lacked exact location data or duplicated data contained in other papers. Most of the sources were peer-reviewed articles (Table 1). The reference list is available through Zenodo (Ivanova et al. 2023).

Table 1.

Publication types

Publication type                    Number of papers
Journal articles (peer-reviewed)    135
Monographs                          4
PhD thesis                          1
Proceedings                         5
Conference abstracts                14

The earliest publication we could find, by Wilhelm Michaelsen, dates back to 1899. The most recent dates from 2023. About a third of the sources were written by the systematists Iosif Malevich and Tamara Perel.

Occurrence data were extracted and structured according to the Darwin Core standard (Wieczorek et al. 2012). During the data digitisation process, we tried to include as much primary information as possible. Only one tenth of the literature occurrences contained the geographic coordinates of locations provided by the authors. The remaining occurrences were manually georeferenced using the point-radius method (Wieczorek et al. 2010).
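As a rough illustration of the structuring and georeferencing steps described above, the following Python sketch builds one Darwin Core-style record and derives a point-radius uncertainty for it. The function names, the example locality, its coordinates, the 3 km extent and the 0.01 degree precision are hypothetical and are not taken from the published dataset. The radius calculation is a simplification: it adds only the locality extent and the coordinate-precision error, whereas the full point-radius method accounts for further sources of uncertainty.

```python
import math

# Approximate length of one degree of latitude in metres (assumed constant here).
METERS_PER_DEGREE = 111_320.0


def coordinate_precision_error_m(lat_deg, precision_deg):
    """Approximate error (m) introduced by coordinates rounded to precision_deg."""
    lat_err = precision_deg * METERS_PER_DEGREE
    # A degree of longitude shrinks with the cosine of the latitude.
    lon_err = precision_deg * METERS_PER_DEGREE * math.cos(math.radians(lat_deg))
    return math.hypot(lat_err, lon_err)


def georeference(verbatim_name, verbatim_locality, lat, lon, extent_m,
                 precision_deg=0.01):
    """Return a Darwin Core-style dict for one literature occurrence.

    extent_m is the assumed radius of the named place (hypothetical input).
    """
    radius = extent_m + coordinate_precision_error_m(lat, precision_deg)
    return {
        "scientificName": verbatim_name,        # name as printed in the source
        "verbatimLocality": verbatim_locality,  # locality text as printed
        "decimalLatitude": round(lat, 5),
        "decimalLongitude": round(lon, 5),
        "geodeticDatum": "WGS84",
        "coordinateUncertaintyInMeters": math.ceil(radius),
        "georeferenceProtocol": "point-radius (simplified sketch)",
    }


# Hypothetical example record; the locality and numbers are illustrative only.
record = georeference(
    "Lumbricus terrestris",
    "vicinity of a village (hypothetical locality)",
    lat=55.0, lon=37.0, extent_m=3000,
)
print(record)
```

Keeping the verbatim name and locality next to the derived coordinates lets a later reviewer re-check the georeference against the source paper.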

The resulting occurrence dataset, "Earthworm occurrences from Russian-language literature" (Shashkov et al. 2023), was published through the Global Biodiversity Information Facility portal. It contains 5,304 occurrences of 117 species from 27 countries (Fig. 1).

Figure 1.

Schema of earthworm occurrences. Dataset from Shashkov et al. (2023).

To improve the GBIF Backbone Taxonomy, we digitised two catalogues of earthworm species by Tamara Perel, one published for the USSR (Perel 1979) and one for the Russian Federation (Vsevolodova-Perel 1997). Based on these monographs, three checklist datasets were published through GBIF (Shashkov 2023b, 124 records; Shashkov 2023c, 87 records; Shashkov 2023a, 95 records). We are now working towards including these names in the GBIF Backbone so that all species names can be matched and recorded exactly as they appear in papers published by Soviet and Russian researchers.
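The name-matching problem can be sketched in Python as a lookup of verbatim literature names against a digitised checklist. This is only a minimal illustration, not the matching procedure used by GBIF: the function names and the tiny checklist below are assumptions made for the example, and the entries are not drawn from the published checklist datasets.

```python
def normalise(name):
    """Collapse whitespace and ignore case so trivial variants still match."""
    return " ".join(name.split()).lower()


def match_names(verbatim_names, checklist):
    """Split names into those found in the checklist and those left unmatched.

    checklist maps a name as used in the literature to the name it should
    resolve to. Verbatim names are kept as keys so nothing is overwritten.
    """
    index = {normalise(key): value for key, value in checklist.items()}
    matched, unmatched = {}, []
    for name in verbatim_names:
        resolved = index.get(normalise(name))
        if resolved is None:
            unmatched.append(name)
        else:
            matched[name] = resolved
    return matched, unmatched


# Illustrative checklist only; a real one would come from the digitised catalogues.
checklist = {
    "Allolobophora caliginosa": "Aporrectodea caliginosa",
    "Lumbricus terrestris": "Lumbricus terrestris",
}

matched, unmatched = match_names(
    ["allolobophora  caliginosa", "Lumbricus terrestris", "Unlisted name"],
    checklist,
)
print(matched)
print(unmatched)
```

Names that end up in the unmatched list are the ones that would be lost or misassigned until the corresponding entries exist in the reference taxonomy.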


Keywords

manual data extraction, heritage literature, grey literature, Oligochaeta

Presenting author

Maxim Shashkov

Presented at

TDWG 2023

Funding program

The research was supported by grant No. 23-24-00112 from the Russian Science Foundation.

Grant title

Quantifying the Factors Limiting the Distribution of Earthworms in European Russia: a Model Approach

Hosting institution

Institute of Mathematical Problems of Biology RAS – the Branch of Keldysh Institute of Applied Mathematics of Russian Academy of Sciences

Conflicts of interest

The authors have declared that no competing interests exist.

