Biodiversity Information Science and Standards : Conference Abstract
Corresponding author: Corinna Gries (cgries@wisc.edu)
Received: 10 Jun 2019 | Published: 18 Jun 2019
© 2019 Corinna Gries, Mark Servilla, Margaret O'Brien, Kristin Vanderbilt, Colin Smith, Duane Costa, Susanne Grossman-Clarke
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation: Gries C, Servilla M, O'Brien M, Vanderbilt K, Smith C, Costa D, Grossman-Clarke S (2019) Achieving FAIR Data Principles at the Environmental Data Initiative, the US-LTER Data Repository. Biodiversity Information Science and Standards 3: e37047. https://doi.org/10.3897/biss.3.37047
The Environmental Data Initiative (EDI) is a continuation and expansion of the original United States Long-Term Ecological Research Program (US-LTER) data repository, which went into production in 2013. Building on decades of data management experience in LTER, EDI is addressing the challenge of publishing a diverse corpus of research data (
The FAIR principles serve as benchmarks for EDI’s operation and management. The data we curate are Findable because they reside in an open repository, carry unique and persistent digital object identifiers (DOIs), and have standard metadata indexed as a searchable resource. They are Accessible through industry-standard protocols and are, in most cases, under an open-access license (access control is available if required). They are Interoperable because data are archived in commonly used file formats, and both metadata and data are machine-readable and accessible. They are Reusable because rich, high-quality science metadata, with automated congruence and completeness checking, render data fit for reuse in multiple contexts and environments, and easily generated data provenance documents their lineage.
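To make the idea of an automated congruence check concrete, the following is a minimal sketch in Python. It is not EDI's actual checker: the function name `check_congruence`, the plain list of declared column names, and the CSV input are all assumptions made for illustration. It only compares the columns declared in the metadata with the header and row widths of a data table.

```python
import csv
import io


def check_congruence(declared_columns, csv_text):
    """Compare metadata-declared columns with a CSV table (hypothetical helper).

    Returns a list of human-readable problems; an empty list means the
    metadata and the data table agree on column names, order, and row width.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None:
        return ["data table is empty"]

    problems = []
    if header != declared_columns:
        missing = [c for c in declared_columns if c not in header]
        extra = [c for c in header if c not in declared_columns]
        if missing:
            problems.append(f"declared in metadata but absent from data: {missing}")
        if extra:
            problems.append(f"present in data but not declared in metadata: {extra}")
        if not missing and not extra:
            problems.append("columns differ in order or multiplicity")

    # Every data row should have as many fields as the header.
    for row_no, row in enumerate(reader, start=1):
        if len(row) != len(header):
            problems.append(
                f"data row {row_no}: expected {len(header)} fields, found {len(row)}"
            )
    return problems
```

A real check would also compare declared data types, units, and date formats against the values; this sketch stops at the structural level.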
The success of this approach is demonstrated by the number and the spatial and temporal extent of recent re-analyses and synthesis efforts using these data. Although formal data citations are not yet common practice, a Google Scholar search reveals over 400 journal articles crediting data re-use through an EDI DOI. However, despite improved data availability, researchers still report that the largest time investment in synthesis projects is discovering, cleaning and combining primary datasets until all data are completely understood and converted to a similar format. Starting with long-term biodiversity observation data, EDI is addressing this issue by implementing a pre-harmonization of thematically similar datasets. Positioned between the data author’s specific data format and larger biodiversity data stores or synthesis projects, this approach allows uniform access without the loss of ancillary information. This pre-harmonization step may be accomplished by data managers because the dataset still contains all original information, without any aggregation or science-question-specific decisions for data omission or cleaning. The data are still distributed into distinct datasets, allowing for asynchronous updating of long-term observations. The addition of specific and standardized metadata makes them easily discoverable.
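The pre-harmonization idea can be sketched as follows. This is an illustrative assumption, not EDI's implementation: the function `harmonize`, the four core fields (`site`, `date`, `taxon`, `value`), and the dataset identifiers are hypothetical. The sketch maps each source table to a common long format, carries every remaining column along as ancillary key-value records, and performs no aggregation, omission, or cleaning.

```python
def harmonize(rows, column_map, dataset_id):
    """Map one dataset's rows (a list of dicts) to a common long format.

    column_map names the source column that holds each core field. All other
    columns are kept as ancillary key-value records, so no original
    information is dropped. Names here are hypothetical.
    """
    core_fields = ("site", "date", "taxon", "value")
    core_sources = {column_map[field] for field in core_fields}

    observations, ancillary = [], []
    for i, row in enumerate(rows):
        obs_id = f"{dataset_id}:{i}"
        obs = {"observation_id": obs_id, "dataset_id": dataset_id}
        for field in core_fields:
            obs[field] = row[column_map[field]]  # values copied unchanged
        observations.append(obs)

        # Anything that is not a core field is preserved, not discarded.
        for col, val in row.items():
            if col not in core_sources:
                ancillary.append(
                    {"observation_id": obs_id, "variable": col, "value": val}
                )
    return observations, ancillary


# Two made-up source tables with different column names.
plants = [{"plot": "P1", "sample_date": "2018-06-01",
           "species": "Poa pratensis", "cover": "12", "observer": "kv"}]
birds = [{"station": "S7", "day": "2018-06-02",
          "taxon_name": "Turdus migratorius", "n": "4"}]

plant_obs, plant_anc = harmonize(
    plants,
    {"site": "plot", "date": "sample_date", "taxon": "species", "value": "cover"},
    "dataset_a",
)
bird_obs, bird_anc = harmonize(
    birds,
    {"site": "station", "date": "day", "taxon": "taxon_name", "value": "n"},
    "dataset_b",
)
```

Because each source dataset is converted on its own, a dataset can be re-harmonized whenever new long-term observations are added, without touching the others.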
Keywords: Long-Term Ecological Research, LTER, data repository, FAIR Data, environmental data, long-term data, data management
Presenting author: Corinna Gries
Presented at: Biodiversity_Next 2019
This work has been supported by National Science Foundation grants DBI-1629233 and DBI-1565103.
US National Science Foundation, Advances in Biological Infrastructure
Environmental Data Initiative
University of Wisconsin Madison, University of New Mexico, University of California Santa Barbara
The authors take on different roles in this project but are contributing to its success equally.
Conflicts of interest: None