Biodiversity Information Science and Standards: Conference Abstract
Corresponding author: Lauren Gillespie (gillespl@cs.stanford.edu)
Received: 06 Sep 2021 | Published: 10 Sep 2021
© 2021 Lauren Gillespie, Megan Ruffley, Moisés Expósito-Alonso
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Gillespie L, Ruffley M, Expósito-Alonso M (2021) An Image is Worth a Thousand Species: Scaling high-resolution plant biodiversity prediction to biome-level using citizen science data and remote sensing imagery. Biodiversity Information Science and Standards 5: e74052. https://doi.org/10.3897/biss.5.74052
Accurately mapping biodiversity at high resolution across ecosystems has historically been a difficult task. One major hurdle is that species abundances in an environment follow a power-law relationship: a few species are relatively abundant, while most are rare. This “commonness of rarity,” confounded with differential detectability of species, can lead to misestimation of where a species lives. To overcome these confounding factors, many biodiversity studies employ species distribution models (SDMs), which predict the full extent of a species’ range by correlating observations of where the species has been found with environmental variables. Most SDMs use bioclimatic environmental variables as the independent (predictor) variables, but these approaches often rely on biased pseudo-absence generation methods and model species using coarse-grained bioclimatic variables with a useful resolution floor of 1 km per pixel.
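The power-law abundance pattern described above can be illustrated with a minimal sketch. It assumes a Zipf-like rank-abundance curve with exponent 1, a community of 2,500 species, and one million individuals. These values, and the function names `rank_abundance` and `share_of_top`, are illustrative assumptions and are not taken from the study.

```python
# Hypothetical illustration of the "commonness of rarity".
# Assumption: expected abundance of the species at rank r is proportional
# to r ** -exponent (a Zipf-like power law). None of the numbers below
# come from the study; they are chosen only to make the pattern visible.


def rank_abundance(n_species: int, total_individuals: int,
                   exponent: float = 1.0) -> list[float]:
    """Expected abundance per species, ordered from most to least common."""
    weights = [rank ** -exponent for rank in range(1, n_species + 1)]
    norm = sum(weights)
    return [total_individuals * w / norm for w in weights]


def share_of_top(abundances: list[float], top_fraction: float) -> float:
    """Fraction of all individuals belonging to the most common species."""
    k = max(1, int(len(abundances) * top_fraction))
    return sum(abundances[:k]) / sum(abundances)


abundances = rank_abundance(n_species=2500, total_individuals=1_000_000)

# Share of individuals held by the most common 10% of species.
top_share = share_of_top(abundances, top_fraction=0.10)

# Number of species expected to have fewer than 100 individuals.
rare_count = sum(1 for a in abundances if a < 100)

print(f"Top 10% of species hold {top_share:.0%} of all individuals")
print(f"{rare_count} of {len(abundances)} species have fewer than 100 individuals")
```

Under these assumptions, a small set of common species accounts for most individuals while more than half of the species are rare, so a sample of observations will tend to under-represent the rare majority.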
Here, we pair iNaturalist citizen science plant observations from the Global Biodiversity Information Facility with RGB-infrared aerial imagery from the National Agriculture Imagery Program to develop a deep convolutional neural network model that can predict the presence of nearly 2,500 plant species across California. We use a state-of-the-art multilabel image recognition model from the computer vision community, paired with a cutting-edge multilabel classification loss, which yields accuracy comparable to or better than that of traditional SDMs, but at a resolution of 250 m (
Keywords:
biodiversity mapping, machine learning, species distribution models
Presenting author:
Lauren Gillespie
Presented at:
TDWG 2021
This material is based upon work supported by the National Science Foundation Graduate Research Fellowship Program under Grant No. DGE-1656518 and the TomKat Center Graduate Fellow for Translational Research.
The author reports no outstanding conflicts of interest.