Proceedings of TDWG: Conference Abstract
Corresponding author: John Deck (jdeck88@gmail.com)
Received: 11 Aug 2017 | Published: 11 Aug 2017
© 2017 John Deck, Brian Stucky, Ramona Walls, Rodney Ewing, Melissa Genazzio, Henry Loescher, Robert Guralnick
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation: Deck J, Stucky B, Walls R, Ewing R, Genazzio M, Loescher H, Guralnick R (2017) A High-throughput Data Ingest Pipeline for Semantic Data-stores. Proceedings of TDWG 1: e20208. https://doi.org/10.3897/tdwgproceedings.1.20208
Ontologies offer multiple benefits for biodiversity data processing and analysis, including precisely defined vocabularies, robust pathways for data integration, and support for automated machine reasoning. However, they have yet to be widely deployed for these purposes. Reasons include the specialized skills and coordination needed to map terms to source data, the computational expense of data processing and machine reasoning, and a scarcity of tools for working with ontologies and RDF triples. In this presentation we will discuss a data processing pipeline (available at https://github.com/biocodellc/ppo-data-pipeline) that simplifies complex implementation tasks, offers tools for data ingest, triplifying, and reasoning, and makes datasets available for indexing.
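To make the "triplifying" step concrete, the following is a minimal sketch of how one tabular record might be turned into RDF triples in N-Triples syntax. It is an illustration only and does not reproduce the pipeline's own code: the base IRI, the `PROPERTY_MAP` column mapping, and the `triplify` function are all hypothetical names chosen for this example, and values are emitted as plain string literals rather than ontology-typed terms.

```python
from urllib.parse import quote

# Hypothetical base IRI and column-to-property mapping (assumptions for
# illustration; a real deployment would map columns to ontology terms).
BASE = "http://example.org/"
PROPERTY_MAP = {
    "species": BASE + "prop/species",
    "year": BASE + "prop/year",
}


def triplify(record, id_field="id"):
    """Turn one tabular record (a dict) into a list of N-Triples lines."""
    # Percent-encode the identifier so it is safe inside an IRI.
    subject = "<" + BASE + "record/" + quote(str(record[id_field]), safe="") + ">"
    lines = []
    for column, prop in PROPERTY_MAP.items():
        value = record.get(column)
        if value is None or value == "":
            continue  # skip missing values instead of emitting empty literals
        # Escape backslashes, double quotes, and line breaks in the literal.
        literal = (
            str(value)
            .replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
            .replace("\r", "\\r")
        )
        lines.append(f'{subject} <{prop}> "{literal}" .')
    return lines


triples = triplify({"id": "obs 1", "species": "Acer rubrum", "year": 2016})
for line in triples:
    print(line)
```

Each output line is one subject, predicate, object statement. A triple store or reasoner could then load such statements alongside the ontology, which is the role the ingest and reasoning stages play in the pipeline described above.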
Ontology, Pipeline, Workflow, Data Integration
John Deck
TDWG 2017