Biodiversity Information Science and Standards
Conference Abstract
Corresponding author: Corinna Gries (cgries@wisc.edu)
Received: 28 Sep 2020 | Published: 30 Sep 2020
© 2020 Corinna Gries, Stace Beaulieu, Renée Brown, Gastil Gastil-Buhl, Sarah Elmendorf, Hsun-Yi Hsieh, Li Kui, Greg Maurer, John Porter
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Gries C, Beaulieu S, Brown RF, Gastil-Buhl G, Elmendorf S, Hsieh H-Y, Kui L, Maurer G, Porter JH (2020) Change in Pictures: Creating best practices in archiving ecological imagery for reuse. Biodiversity Information Science and Standards 4: e59082. https://doi.org/10.3897/biss.4.59082
The research data repository of the Environmental Data Initiative (EDI) builds on over 30 years of data curation research and experience in the National Science Foundation-funded US Long-Term Ecological Research (LTER) Network. It provides mature functionality and well-established workflows, and now publishes all ‘long-tail’ environmental data. High-quality scientific metadata are enforced through automatic checks against community-developed rules and the Ecological Metadata Language (EML) standard. Although the EDI repository is far along in making its data findable, accessible, interoperable, and reusable (FAIR), representatives from EDI and the LTER are developing best practices for the edge cases in environmental data publishing. One of these is the vast amount of imagery taken in the context of ecological research, ranging from wildlife camera traps to plankton imaging systems to aerial photography. Many images are used in biodiversity research for community analyses (e.g., individual counts, species cover, biovolume, productivity), while others are taken to study animal behavior and landscape-level change.
Some examples from the LTER Network include: using photos of a heron colony to measure provisioning rates for chicks (
It has been standard practice to publish numerical data extracted from images in EDI; however, the supporting imagery generally has not been made publicly available. Our goal in developing best practices for documenting and archiving these images is to make them discoverable and reusable. Our examples demonstrate several issues. The research questions, and hence the image subjects, vary. Images frequently come in logical sets of time series. Such sets can be large, and only some images may be contributed to a dedicated, specialized repository. Finally, these images are taken in a larger monitoring context where many other environmental data are collected at the same time and location.
Currently, a typical approach to publishing image data in EDI is a package containing compressed (ZIP or tar) files with the images, a directory manifest with additional image-specific metadata, and a package-level EML metadata file. Images in the compressed archive may be organized within directories with filenames corresponding to treatments, locations, time periods, individuals, or other grouping attributes. The directory manifest table additionally has a column for each of these attributes. Package-level metadata include standard coverage elements (e.g., date, time, location) and sampling methods. Archiving logical ‘sets’ of images in this way reduces the effort of providing metadata for each image when most information would be repeated, but at the expense of not making every image individually searchable. This limitation may be overcome if the provided manifest contains standard metadata that would allow searching and automatic integration with other images.
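As an illustration only, the packaging approach described above might be sketched in Python as follows. This is not EDI tooling. The directory layout (one directory level per grouping attribute), the manifest column names (`path`, `size_bytes`, `sha256`), the list of image suffixes, and the function names are all assumptions made for this sketch, and the package-level EML file is not generated here.

```python
import csv
import hashlib
import zipfile
from pathlib import Path

# Assumed set of image file extensions; adjust to the imagery at hand.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def build_manifest(image_root, level_names):
    """Return one manifest row (dict) per image found under image_root.

    level_names names the directory levels below image_root, e.g.
    ("site", "treatment") for image_root/<site>/<treatment>/<file>.
    """
    root = Path(image_root)
    rows = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        rel = path.relative_to(root)
        dirs = rel.parts[:-1]  # directory components, without the file name
        if len(dirs) != len(level_names):
            raise ValueError(f"unexpected directory depth for {rel}")
        row = {"path": rel.as_posix()}
        row.update(zip(level_names, dirs))  # one column per grouping attribute
        row["size_bytes"] = path.stat().st_size
        row["sha256"] = hashlib.sha256(path.read_bytes()).hexdigest()
        rows.append(row)
    return rows


def write_package(image_root, level_names, manifest_path, archive_path):
    """Write the manifest CSV and a ZIP archive of the listed images."""
    rows = build_manifest(image_root, level_names)
    fieldnames = ["path", *level_names, "size_bytes", "sha256"]
    with open(manifest_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for row in rows:
            # Keep the directory structure inside the archive so that
            # manifest paths resolve after extraction.
            zf.write(Path(image_root) / row["path"], arcname=row["path"])
    return rows
```

A call such as `write_package("images", ("site", "treatment"), "manifest.csv", "images.zip")` would then produce a manifest with one row per image, in which the grouping attributes encoded in the directory names appear as searchable columns. Standard per-image metadata, where available, could be added as further columns.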
data repository, metadata, ecological data
Corinna Gries
TDWG 2020
LTER
Authors have all contributed equally to writing this abstract.