Biodiversity Information Science and Standards : Conference Abstract
Making Heterogeneous Specimen Data ‘FAIR’: Implementing a digital specimen repository
Abraham Nieva de la Hidalga‡, Alex Hardisty§
‡ Cardiff University School of Computer Science and Informatics, Cardiff, United Kingdom
§ Cardiff University, Cardiff, United Kingdom
Open Access

Abstract

A definition of the digital specimen is proposed that encompasses the digital representation(s) of physical specimens from natural science collections. The concept defines a representation (a digital object) that brings together an array of heterogeneous data types, each of which is itself an alternative representation of the physical specimen. The digital specimen (DS) holds references to specimen data from a collection management system, images, 3D models, research articles, DNA sequences, and collector information, among many other data types. The proposal is to create persistent relationships between the DS and other categories of digital objects (e.g. the resource types mentioned above, collections, storage platforms, organisations, databases, and provenance data). Complying with the FAIR data principles (findability, accessibility, interoperability, and reusability), i.e. achieving data ‘FAIRness’, eases the data integration needed for cross-disciplinary linking and for combining data from different domains, making the DS a comprehensive package of information about a specimen.
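The idea of a DS as a hub of typed, persistent links can be illustrated with a minimal sketch. It is not the ICEDIG object model: the class names, relation names (such as "hasImage") and identifiers (such as "example/ds-0001") are hypothetical placeholders chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Link:
    """A typed reference from a digital specimen to another digital object."""
    relation: str   # hypothetical relation name, e.g. "hasImage"
    target_id: str  # identifier of the linked object (placeholder values here)


@dataclass
class DigitalSpecimen:
    """Hypothetical minimal DS: an identifier plus links to heterogeneous data."""
    ds_id: str
    links: List[Link] = field(default_factory=list)

    def link(self, relation: str, target_id: str) -> None:
        """Record a relationship between this DS and another digital object."""
        self.links.append(Link(relation, target_id))

    def targets(self, relation: str) -> List[str]:
        """Return the identifiers linked through one relation, in insertion order."""
        return [l.target_id for l in self.links if l.relation == relation]

    def by_relation(self) -> Dict[str, List[str]]:
        """Group all linked identifiers by relation name."""
        grouped: Dict[str, List[str]] = {}
        for l in self.links:
            grouped.setdefault(l.relation, []).append(l.target_id)
        return grouped


def build_example() -> DigitalSpecimen:
    """Assemble one DS that points at images, a DNA sequence and a collection."""
    ds = DigitalSpecimen("example/ds-0001")
    ds.link("hasImage", "example/img-0001")
    ds.link("hasImage", "example/img-0002")
    ds.link("hasSequence", "example/dna-0001")
    ds.link("heldInCollection", "example/coll-0001")
    return ds
```

In this sketch the DS stores only identifiers, not the linked data, so images, sequences and collection records can remain in their own repositories while the DS acts as the package that ties them together.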

Implementing and providing access to a digital specimen repository (DSR) as a Digital Object Architecture (Sharp 2016) component demonstrates the alignment between the DS concept and the FAIR data principles (Wilkinson et al. 2016, Kahn and Wilensky 2006). The DSR fulfils four roles: data producer, resource manager, data publisher, and collaboration space. As data producer, the DSR allows acquisition and curation (indexing, storage) of DSs that link primary data, models, analyses, and other digital object types. As resource manager, the DSR manages access to distributed platforms, ranging from acquisition networks (digitisation stations, museums, herbaria) to processing services, advanced computational resources, data asset storage systems, and specialised servers. As data publisher, the DSR provides access to data assets from national and transnational data archives. As collaboration space, the DSR supports users in accessing, sharing, and (re)using data assets and derived data products and services. In the collaboration space and data publisher roles, the DSR implements interfaces that expose the DSs to the research community, fulfilling the FAIR findability, accessibility, and reusability principles. In the data producer and resource manager roles, the DSR creates the meaningful and persistent relationships required to link DSs with other types of digital objects, fulfilling the FAIR interoperability principle.

A prototype DSR based on the Cordra digital object repository has been deployed (Corporation for National Research Initiatives (CNRI) 2018, Reilly and Tupelo-Schneck 2010). The advantages of Cordra are rapid deployment, a customisable object model, the creation of relations between digital objects, and application programming interfaces (APIs) for programmatic access. Rapid deployment of the DSR provides a tangible target for discussing the implementation of the DS concept. The customisable object model enables the definition of the DS to be refined and enhanced in response to feedback from colleagues who have accessed the DSR and used its contents. Creating relations between digital objects enables flexible linking to digital objects stored in different repositories. Accessing the DSR programmatically through APIs extends the use of the repository to different platforms (e.g. mobile devices) and enables integration with other repositories and services. As well as supporting an HTTP-oriented API, Cordra implements the Digital Object Interface Protocol (DONA Foundation 2018), allowing the definition of operations that act directly on selected DSs in the repository.
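Programmatic access over an HTTP-oriented API can be sketched as follows. The sketch only builds requests and never sends them. The base URL is a placeholder rather than the address of the prototype, the object type "DigitalSpecimen" and the field "scientificName" are hypothetical, and the "/objects/" path with a "type" query parameter is an assumption about how a Cordra-style REST interface is laid out, which should be checked against the documentation of the deployed Cordra version.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request

# Placeholder address; not the actual DSR prototype.
BASE_URL = "https://dsr.example.org"


def create_request(obj_type: str, content: dict) -> Request:
    """Build (but do not send) a request to create an object of the given type.

    Assumes an endpoint of the form POST /objects/?type=<type> with a JSON body.
    """
    url = f"{BASE_URL}/objects/?{urlencode({'type': obj_type})}"
    body = json.dumps(content).encode("utf-8")
    return Request(url, data=body, method="POST",
                   headers={"Content-Type": "application/json"})


def retrieve_request(object_id: str) -> Request:
    """Build (but do not send) a request to fetch one object by identifier.

    Assumes an endpoint of the form GET /objects/<id>; the "/" inside a
    prefix/suffix style identifier is left unescaped.
    """
    url = f"{BASE_URL}/objects/{quote(object_id, safe='/')}"
    return Request(url, method="GET")
```

A client such as a mobile application would pass these requests to urllib.request.urlopen (with whatever authentication the repository requires) and parse the JSON response. Operations defined through the Digital Object Interface Protocol are not covered by this sketch.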

The DSR prototype has been demonstrated by providing access to the repository's administrative interface and to a custom interface designed to facilitate access by different user groups, such as collection curators, researchers, teachers, and students. The client interface has been designed to demonstrate a subset of the functionalities derived from user stories, which describe software features from the end-user perspective. Demonstrating the DSR capabilities as proposed will inform the refinement of the DS model design and provide early feedback on the software features needed.

Keywords

digital specimen repository, digital specimen, natural history collection, digitisation, FAIR

Presenting author

Abraham Nieva de la Hidalga

Presented at

Biodiversity_Next 2019

Funding program

Horizon 2020 Framework Programme of the European Union

Grant title

ICEDIG – “Innovation and consolidation for large scale digitisation of natural heritage” H2020-INFRADEV-2016-2017 – Grant Agreement No. 777483

Hosting institution

Cardiff University
