Biodiversity Information Science and Standards : Conference Abstract
Corresponding author: Abraham Nieva de la Hidalga (nievadelahidalgaa@cardiff.ac.uk)
Received: 12 Jun 2019 | Published: 18 Jun 2019
© 2019 Abraham Nieva de la Hidalga, Nicolas Cazenave, Donat Agosti, Zhengzhe Wu, Mathias Dillen, Lars Nielsen
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation: Nieva de la Hidalga A, Cazenave N, Agosti D, Wu Z, Dillen M, Nielsen L (2019) Use of European Open Science Cloud and National e-Infrastructures for the Long-Term Storage of Digitised Assets from Natural History Collections. Biodiversity Information Science and Standards 3: e37164. https://doi.org/10.3897/biss.3.37164
Digitisation of Natural History Collections (NHC) has evolved from the transcription of specimen catalogues into databases to web portals providing access to data, digital images, and 3D models of specimens. These portals increase global accessibility to specimens and help preserve the physical specimens by reducing their handling. The size of the NHC requires developing high-throughput digitisation workflows, as well as research into novel acquisition systems, image standardisation, curation, preservation, and publishing. Nowadays, herbarium sheet digitisation workflows (and fast digitisation stations) can digitise up to 6,000 specimens per day. Operating those digitisation stations in parallel can increase the digitisation capacity further. The high resolution and sheer volume of the images obtained from these specimens require substantial bandwidth, disk space and tapes for storing the original digitised materials, and computational processing resources for generating derivatives, extracting information, and publishing. While large institutions have dedicated digitisation teams that manage the whole workflow from acquisition to publishing, other institutions cannot dedicate resources to support all digitisation activities, in particular long-term storage. National and European e-infrastructures can provide an alternative solution by supporting different parts of the digitisation workflows. In the context of the Innovation and consolidation for large scale digitisation of natural heritage (
The EUDAT-CINES pilot centred on transferring large digitised herbarium collections from the National Museum of Natural History, France (MNHN) to the storage infrastructure provided by the Centre Informatique National de l’Enseignement Supérieur (
The data models employed in the pilots allow data schemas to be defined according to the types of collection and specimen images stored. For EUDAT-CINES, each deposit was composed of the specimen data and its business metadata (the metadata that the depositing institution, in this case MNHN, considers relevant for the data objects being stored), enhanced by archiving metadata added during the archiving process (institution, licensing, identifiers, project, archiving date, etc.). EUDAT uses ePIC identifiers (
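The layered structure described for EUDAT-CINES (specimen data, business metadata chosen by the depositor, and archiving metadata added at ingest) can be illustrated with a minimal Python sketch. This is not the schema used in the pilot: the class names, field names, and all example values are assumptions made only for illustration, based on the metadata categories listed above.

```python
from dataclasses import dataclass, asdict
from datetime import date
from typing import Dict


@dataclass
class ArchivingMetadata:
    """Metadata added during archiving (hypothetical field names)."""
    institution: str
    licence: str
    identifier: str
    project: str
    archiving_date: date


@dataclass
class ArchivedSpecimen:
    """One archived data object: specimen data plus two metadata layers."""
    specimen_data: Dict[str, str]
    # Business metadata: whatever the depositing institution considers relevant.
    business_metadata: Dict[str, str]
    archiving_metadata: ArchivingMetadata

    def to_record(self) -> dict:
        """Return a plain dict with the date as an ISO 8601 string."""
        record = asdict(self)
        record["archiving_metadata"]["archiving_date"] = (
            self.archiving_metadata.archiving_date.isoformat()
        )
        return record


# Hypothetical example values, not taken from the pilot.
example = ArchivedSpecimen(
    specimen_data={"catalogue_number": "EXAMPLE-0001"},
    business_metadata={"collection": "herbarium"},
    archiving_metadata=ArchivingMetadata(
        institution="MNHN",
        licence="CC BY 4.0",
        identifier="hypothetical-pid/0001",
        project="ICEDIG",
        archiving_date=date(2019, 6, 12),
    ),
)
```

Keeping the three layers separate means the depositor's business metadata can vary by collection type while the archiving metadata stays uniform across deposits.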
The pilot infrastructure design reports describe the features, capacities, functions and costs of each model in three specific contexts that are relevant for the implementation of the Distributed System of Scientific Collections (
European Open Science Cloud, e-infrastructure, long-term storage, federated cloud, digitisation, natural history collections, ICEDIG, EUDAT, Zenodo
Abraham Nieva de la Hidalga
Biodiversity_Next 2019
Horizon 2020 Framework Programme of the European Union
ICEDIG – “Innovation and consolidation for large scale digitisation of natural heritage” H2020-INFRADEV-2016-2017 – Grant Agreement No. 777483
EUDAT, CINES, Zenodo, Plazi, Botanic Gardens Meise, University of Finland