Biodiversity Information Science and Standards : Conference Abstract
Generating Masks for Image Segmentation in Digitized Herbarium Specimens
Alexander E White, Rebecca B Dikow, Makinnon Baugh§, Abby Jenkins§, Paul B Frandsen§
‡ Data Science Lab, Smithsonian Institution, Washington DC, United States of America
§ Brigham Young University, Provo UT, United States of America
Open Access


Digitized herbarium images contain complex information unrelated to the shape and color of the specimens represented within them. This information can contribute a substantial amount of noise when the image is used as a proxy for the pattern, shape, or color of the specimen. Image segmentation, whereby the specimen material is partitioned from the background (e.g., herbarium sheet, label, color ramp), offers one possible solution, yet training data for image segmentation of herbarium specimens are nonexistent. We present a pipeline for generating training data for image segmentation tasks, along with a novel dataset of highly resolved image masks that segment plant material from background noise. This dataset can be used to train neural networks to segment plant material in herbarium sheets more generally, and our method is applicable to other museum data sources where masking may be useful for quantitative analysis of patterns and shapes.
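To make the idea of an image mask concrete, the following is a minimal sketch of one common way to partition foreground from background: global thresholding with Otsu's method on a grayscale image. It is an illustration only and is not taken from the pipeline presented here. The function names `otsu_threshold` and `make_mask` are hypothetical, and the sketch assumes that plant material is darker than the herbarium sheet, which will not hold for labels, color ramps, or pale specimens.

```python
from typing import List

Image = List[List[int]]  # grayscale pixel values in the range 0..255


def otsu_threshold(image: Image) -> int:
    """Return the threshold t that maximizes between-class variance.

    Pixels with value <= t form the dark class, the rest the light class.
    """
    hist = [0] * 256
    for row in image:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_dark = 0    # number of pixels in the dark class so far
    sum_dark = 0  # sum of pixel values in the dark class so far
    for t in range(256):
        w_dark += hist[t]
        sum_dark += t * hist[t]
        if w_dark == 0:
            continue  # no dark pixels yet, nothing to separate
        w_light = total - w_dark
        if w_light == 0:
            break  # every pixel is already in the dark class
        mean_dark = sum_dark / w_dark
        mean_light = (sum_all - sum_dark) / w_light
        between = w_dark * w_light * (mean_dark - mean_light) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def make_mask(image: Image) -> List[List[int]]:
    """Return a binary mask: 1 for assumed specimen pixels, 0 for background.

    Assumption (hypothetical): the specimen is darker than the sheet.
    """
    t = otsu_threshold(image)
    return [[1 if value <= t else 0 for value in row] for row in image]


if __name__ == "__main__":
    # Toy 4x4 "sheet": bright background with a dark 2x2 "specimen".
    sheet = [
        [250, 250, 250, 250],
        [250, 40, 60, 250],
        [250, 50, 45, 250],
        [250, 250, 250, 250],
    ]
    for mask_row in make_mask(sheet):
        print(mask_row)
```

A threshold of this kind would at best give a rough first-pass mask; real sheets contain labels, stamps, and color ramps that are also darker than the paper, which is part of why curated training masks and learned segmentation models are of interest.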


Keywords

digitized herbarium specimens, segmentation, machine learning, image masks

Presenting author

Paul B. Frandsen

Presented at

Biodiversity_Next 2019