Biodiversity Information Science and Standards : Conference Abstract
Corresponding author: Jennifer Hammock (hammockj@si.edu)
Received: 21 Jul 2019 | Published: 20 Aug 2019
© 2019 Jennifer Hammock, Katja Schulz
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation: Hammock J, Schulz KS (2019) Trait Data Integration from the Perspective of a Data Aggregator. Biodiversity Information Science and Standards 3: e38411. https://doi.org/10.3897/biss.3.38411
The Encyclopedia of Life currently hosts ~8M attribute records for ~400k taxa (March 2019, not including geographic categories, Fig.
Taxonomic coverage of trait categories in eol.org, March, 2019.
To support the aggregation and integration of trait information, data sets should be well structured, properly annotated and free of licensing or contractual restrictions so that they are ‘findable, accessible, interoperable, and reusable’ for both humans and machines (FAIR principles;
Global-scale biodiversity data resources should resolve into a graph that links taxa, specimens, occurrences, attributes, localities, and ecological interactions, as well as human agents, publications, and institutions. Taxonomic data and trait data will be two key categories for ensuring rich connectivity in this graph. Existing data hubs can support the graph if they share identifiers, create mappings between their identifiers, or both, using standards and sharing practices developed by the biodiversity data community. Versioned archives of the combined graph could be published at intervals to appropriate open data repositories, and open source tools and training could be provided so that researchers can access the combined graph of biodiversity knowledge from all sources. Achieving this will require good communication among data hubs: we will need to share information about preferred vocabularies and identifier management practices, and collaborate on identifier mappings.
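As a rough illustration of how identifier mappings could let records from separate hubs resolve into one graph, the Python sketch below merges two small edge lists. Everything in it is an assumption made for illustration: the hub names, the identifiers, the predicates, and the helpers `canonical` and `build_graph` are hypothetical and do not describe how eol.org or any other hub actually stores or exchanges data.

```python
from collections import defaultdict

# Hypothetical trait records from one hub: (taxon id, attribute, value).
HUB_A_TRAITS = [
    ("hubA:taxon/1", "body mass", "4.2 kg"),
    ("hubA:taxon/2", "habitat", "marine"),
]

# Hypothetical interaction records from a second hub with its own taxon ids.
HUB_B_INTERACTIONS = [
    ("hubB:t-77", "eats", "hubB:t-90"),
]

# Hypothetical shared mapping: hub B identifier -> hub A identifier.
ID_MAPPING = {
    "hubB:t-77": "hubA:taxon/1",
    "hubB:t-90": "hubA:taxon/2",
}


def canonical(identifier, mapping):
    """Return the mapped identifier, or the input unchanged if unmapped."""
    return mapping.get(identifier, identifier)


def build_graph(edge_lists, mapping):
    """Merge edge lists into one adjacency structure keyed by canonical id."""
    graph = defaultdict(set)
    for edges in edge_lists:
        for subject, predicate, obj in edges:
            graph[canonical(subject, mapping)].add(
                (predicate, canonical(obj, mapping))
            )
    return dict(graph)


graph = build_graph([HUB_A_TRAITS, HUB_B_INTERACTIONS], ID_MAPPING)
for node, edges in sorted(graph.items()):
    print(node, sorted(edges))
```

In this sketch the trait record and the interaction record for the same taxon end up on a single node only because the mapping exists; without it, the two hubs would contribute disconnected subgraphs, which is the connectivity problem that shared identifiers and collaborative mappings are meant to address.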
Keywords: traits, data integration, graph data, identifiers
Presenting author: Jennifer Hammock
Presented at: Biodiversity_Next 2019
Smithsonian, National Museum of Natural History