Collections Management and High-Throughput Digitization using Distributed Cyberinfrastructure Resources

Gary Motz; Alexander Zimmerman; Kimberly Cook; Alyssa Bancroft

doi:10.3897/biss.2.25643

Biodiversity Information Science and Standards: Conference Abstract

Gary J Motz^‡,§, Alexander N Zimmerman^‡, Kimberly J Cook^‡, Alyssa M Bancroft^§

‡ Indiana University, Bloomington, IN, United States of America

§ Indiana Geological and Water Survey, Bloomington, IN, United States of America

Corresponding author: Gary J Motz (garymotz@indiana.edu)

Received: 09 Apr 2018 | Published: 05 Jul 2018

This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Citation: Motz G, Zimmerman A, Cook K, Bancroft A (2018) Collections Management and High-Throughput Digitization using Distributed Cyberinfrastructure Resources. Biodiversity Information Science and Standards 2: e25643. https://doi.org/10.3897/biss.2.25643

Abstract

Collections digitization increasingly relies on computational and data management resources that occasionally exceed the capacity of natural history collections and of their managers and curators. The digitization of many tens of thousands of micropaleontological specimen slides by the Indiana University Paleontology Collection (IUPC), presented here, has been a concerted effort that adheres to recommended practices in collections management for both physical and digital collections resources. This presentation highlights the contributions of distributed cyberinfrastructure from the National Science Foundation-supported Extreme Science and Engineering Discovery Environment (XSEDE) for web hosting of collections management system resources and distributed processing of millions of digital images and metadata records of specimens from our collections.

The Indiana University Center for Biological Research Collections currently hosts its instance of the Specify collections management system (CMS) on a virtual server on Jetstream, the cloud service for on-demand computational resources provisioned through XSEDE. This arrangement allows the CMS to be hosted flexibly in the cloud, with additional services provisioned on an as-needed basis for generating and integrating digitized collections objects in both web-friendly and digital preservation contexts. On-demand computing resources can be used to process digital images: automated file I/O, scripted renaming of files to conform to file naming conventions, derivative generation, and backup to our local tape archive for digital disaster preparedness and long-term storage.
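As an illustration of the scripted renaming and backup preparation described above, the following is a minimal Python sketch, not the workflow actually used by the collection. The naming convention (collection code, zero-padded catalog number, view number), the function names, and the idea of recording SHA-256 checksums so that a later copy to an archive can be verified are all assumptions introduced here for the example.

```python
import hashlib
import re
from pathlib import Path

# Hypothetical naming convention (an assumption, not the collection's real scheme):
#   <COLLECTION>_<catalog number zero-padded to 7 digits>_<view>.<lowercase ext>
CATALOG_RE = re.compile(r"(\d+)")
IMAGE_SUFFIXES = {".tif", ".tiff", ".jpg", ".jpeg"}


def conventional_name(original: str, collection: str = "IUPC", view: str = "01") -> str:
    """Build a convention-compliant name from the first run of digits in the file stem."""
    path = Path(original)
    match = CATALOG_RE.search(path.stem)
    if match is None:
        raise ValueError(f"no catalog number found in {original!r}")
    catalog_number = int(match.group(1))
    return f"{collection}_{catalog_number:07d}_{view}{path.suffix.lower()}"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so that very large images need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def rename_and_manifest(directory: Path, collection: str = "IUPC") -> dict[str, str]:
    """Rename every image file in `directory` and return {new_name: sha256}."""
    manifest = {}
    # sorted() materializes the listing before any file is renamed.
    for source in sorted(directory.iterdir()):
        if source.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # leave non-image files untouched
        target = source.with_name(conventional_name(source.name, collection))
        if target != source:
            # Refuse to overwrite a different file that already has the target name.
            if target.exists() and not target.samefile(source):
                raise FileExistsError(target)
            source.rename(target)
        manifest[target.name] = sha256_of(target)
    return manifest
```

Under these assumptions, `conventional_name("slide 1234 scan.TIF")` returns `"IUPC_0001234_01.tif"`. The manifest returned by `rename_and_manifest` could be written alongside the images so that checksums can be recomputed and compared after the files are copied to long-term storage.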

Here, we will present our strategies for facilitating reproducible workflows for general collections digitization of the IUPC nomenclatorial types and figured specimens, in addition to the gigapixel-resolution photographs of our large collection of microfossils taken with our GIGAmacro system (e.g., a slide of conodonts). We aim to demonstrate the flexibility and nimbleness of cloud computing resources for replicating this and other workflows to enhance the findability, accessibility, interoperability, and reproducibility of the data and metadata contained within our collections.

Keywords

macrophotography, cyberinfrastructure, collections management, microfossil

Presenting author

Gary J Motz

Presented at

SPNHC+TDWG 2018 Conference

Acknowledgements

Funding program

This project has been funded by grants to G. Motz et al. from the Museums for America program (MA-30-16-0458-16) of the Institute of Museum and Library Services and the Advancing Digitization of Biodiversity Collections program of the National Science Foundation (NSF DBI 1702289).

This research was supported in part by Lilly Endowment, Inc., through its support for the Indiana University Pervasive Technology Institute, and in part by the Indiana METACyt Initiative. The Indiana METACyt Initiative at IU was also supported in part by Lilly Endowment, Inc.

This work was supported in part by Shared University Research grants from IBM, Inc., to Indiana University.
