Biodiversity Information Science and Standards: Conference Abstract
Corresponding author: Gary J Motz (garymotz@indiana.edu)
Received: 09 Apr 2018 | Published: 05 Jul 2018
© 2018 Gary Motz, Alexander Zimmerman, Kimberly Cook, Alyssa Bancroft
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation: Motz G, Zimmerman A, Cook K, Bancroft A (2018) Collections Management and High-Throughput Digitization using Distributed Cyberinfrastructure Resources. Biodiversity Information Science and Standards 2: e25643. https://doi.org/10.3897/biss.2.25643
Collections digitization increasingly relies on computational and data management resources that occasionally exceed the capacity of natural history collections and of their managers and curators. The digitization of many tens of thousands of micropaleontological specimen slides by the Indiana University Paleontology Collection (IUPC), presented here, has been a concerted effort that follows recommended practices for the many facets of managing both physical and digital collections resources. This presentation highlights how distributed cyberinfrastructure from the National Science Foundation-supported Extreme Science and Engineering Discovery Environment (XSEDE) contributes to web hosting of collections management system resources and to distributed processing of millions of digital images and metadata records of specimens from our collections.
The Indiana University Center for Biological Research Collections currently hosts its instance of the Specify collections management system (CMS) on a virtual server on Jetstream, the cloud service for on-demand computational resources provisioned through XSEDE. This web service lets the CMS be hosted flexibly in the cloud, with additional services provisioned as needed for generating and integrating digitized collections objects in both web-friendly and digital preservation contexts. On-demand computing resources can be used to manipulate digital images: automated file I/O, scripted renaming of files to conform to file naming conventions, derivative generation, and backup to our local tape archive for digital disaster preparedness and long-term storage.
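As an illustration of the scripted renaming step, the following is a minimal sketch only, not the workflow used by the collection. The naming convention (`IUPC_<seven-digit catalog number>_<view>.<ext>`), the use of `IUPC` as a filename prefix, the default view label, and the function names `conventional_name` and `plan_renames` are all assumptions made for this example. The sketch only plans the renames and does not touch the file system.

```python
import re
from pathlib import Path

# Hypothetical raw-name pattern: optional non-digit prefix, a catalog number,
# optional separators, and an optional alphabetic view label (e.g. "Dorsal").
RAW_NAME = re.compile(r"\D*?(?P<catalog>\d+)[\s_-]*(?P<view>[A-Za-z]*)")


def conventional_name(path: Path, prefix: str = "IUPC") -> str:
    """Return a file name following the assumed convention for one image."""
    match = RAW_NAME.fullmatch(path.stem)
    if match is None:
        raise ValueError(f"cannot parse catalog number from {path.name!r}")
    catalog = int(match.group("catalog"))
    # Fall back to an assumed default label when no view is given.
    view = (match.group("view") or "overview").lower()
    return f"{prefix}_{catalog:07d}_{view}{path.suffix.lower()}"


def plan_renames(paths, prefix: str = "IUPC"):
    """Build (source, destination) pairs without renaming anything.

    Raises ValueError if two sources would map to the same destination.
    """
    plan = []
    seen = set()
    for src in map(Path, paths):
        dst = src.with_name(conventional_name(src, prefix))
        if dst in seen:
            raise ValueError(f"name collision: {dst}")
        seen.add(dst)
        plan.append((src, dst))
    return plan


if __name__ == "__main__":
    for src, dst in plan_renames(["raw/IMG 12345-Dorsal.TIF", "raw/slide_987.jpg"]):
        print(f"{src} -> {dst}")
```

Separating planning from execution allows the proposed names to be reviewed, and collisions caught, before any file is changed. Once the plan is accepted, each pair could be applied with `src.rename(dst)`.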
Here, we present our strategies for reproducible workflows covering general digitization of the IUPC nomenclatorial types and figured specimens, as well as gigapixel-resolution photography of our large collection of microfossils with our GIGAmacro system (e.g., a slide of conodonts). We aim to demonstrate the flexibility and nimbleness of cloud computing resources for replicating this and other workflows to enhance the findability, accessibility, interoperability, and reproducibility of the data and metadata contained within our collections.
Keywords: macrophotography, cyberinfrastructure, collections management, microfossil
Presenting author: Gary J Motz
Presented at: SPNHC+TDWG 2018 Conference
This project has been funded by grants to G. Motz et al. from the Museums for America program (MA-30-16-0458-16) of the Institute of Museum and Library Services and the Advancing Digitization of Biodiversity Collections program of the National Science Foundation (NSF DBI 1702289).
This research was supported in part by Lilly Endowment, Inc., through its support for the Indiana University Pervasive Technology Institute, and in part by the Indiana METACyt Initiative. The Indiana METACyt Initiative at IU was also supported in part by Lilly Endowment, Inc.
This work was supported in part by Shared University Research grants from IBM, Inc., to Indiana University.
Hosting institution: The Otago Museum and the University of Otago