Biodiversity Information Science and Standards
Conference Abstract
Corresponding author: Samira Babalou (samira.babalou@uni-jena.de)
Received: 23 Aug 2022 | Published: 23 Aug 2022
© 2022 Samira Babalou, David Schellenberger Costa, Helge Bruelheide, Jens Kattge, Christine Römermann, Christian Wirth, Birgitta König-Ries
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Babalou S, Schellenberger Costa D, Bruelheide H, Kattge J, Römermann C, Wirth C, König-Ries B (2022) iKNOW: A platform for knowledge graph construction for biodiversity. Biodiversity Information Science and Standards 6: e93867. https://doi.org/10.3897/biss.6.93867
A growing number of biodiversity datasets containing observational and experimental data are being collected and produced by different projects. To answer the fundamental questions of biodiversity research, these data need to be integrated for joint analyses. To date, however, they too often remain isolated in silos.
Both in academia and industry, Knowledge Graphs (KGs) are widely regarded as a promising approach to overcome issues of data silos and lack of common understanding of data (
We argue that this is at least partially due to the high effort and expertise required in developing and managing such KGs. Therefore, in our ongoing project, iKNOW (
For now, we focus on tabular datasets resulting from biodiversity observation or sampling events or experiments. Given such a dataset, iKNOW will support its transformation into (subject, predicate, object) triples in the RDF standard (Resource Description Framework). Every uploaded dataset will be considered a subgraph of the main KG in iKNOW. If required, the data can be cleaned first. Next, the entities and the relationships among them need to be extracted. For that, a user will be able to select one of the existing semi-automatic tools available on our platform (e.g., JenTab (
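As a rough illustration of the table-to-triples step described above, the following minimal sketch turns each row of a small table into (subject, predicate, object) triples and serializes them as N-Triples. It is not iKNOW's implementation: the namespace `http://example.org/iknow/`, the function names, the sample columns, and the choice to treat every cell as a plain string literal are all assumptions made for this example. Entity linking with tools such as JenTab is not covered.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical namespace chosen for this sketch; not iKNOW's real one.
BASE = "http://example.org/iknow/"


def rows_to_triples(csv_text, id_column):
    """Turn each table row into (subject, predicate, object) triples."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = BASE + "record/" + quote(row[id_column])
        for column, value in row.items():
            # The id column becomes the subject; empty or surplus cells
            # produce no triple.
            if column is None or column == id_column or not value:
                continue
            predicate = BASE + "property/" + quote(column)
            triples.append((subject, predicate, value))
    return triples


def to_ntriples(triples):
    """Serialize triples as N-Triples lines with plain string literals."""
    lines = []
    for s, p, o in triples:
        literal = (
            o.replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
            .replace("\r", "\\r")
        )
        lines.append(f'<{s}> <{p}> "{literal}" .')
    return "\n".join(lines)


# Made-up sample data for illustration only.
SAMPLE = "plot_id,species,height_cm\nP1,Quercus robur,120\nP2,Fagus sylvatica,\n"

if __name__ == "__main__":
    print(to_ntriples(rows_to_triples(SAMPLE, "plot_id")))
```

In a real pipeline, column names and cell values would more likely be mapped to terms from existing vocabularies and to typed literals rather than to ad hoc URIs and strings.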
iKNOW will include a wide range of functionalities for creating, accessing, querying, visualizing, updating, reproducing, and tracking the provenance of KGs. Reproducibility of the KG creation process is essential to strengthening open science practices in the biodiversity domain. Thus, at every step, our platform will save all information about the user-selected tools, including their parameters and settings, along with the initial dataset and intermediate results. With this record, users can redo previous steps. It also enables us to track the provenance of the created KG.
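One way to picture the record-keeping described above is a log that stores, for every step, the tool name, its parameters, and fingerprints of the input and output, so that the steps can be replayed from the initial dataset. The sketch below is only an illustration under stated assumptions: the `Workflow` class, the `fingerprint` helper, and the `drop_empty` cleaning tool are hypothetical names invented here, and the data are assumed to be JSON-serializable.

```python
import hashlib
import json


def fingerprint(data):
    """Stable SHA-256 digest of any JSON-serializable value."""
    encoded = json.dumps(data, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


class Workflow:
    """Runs steps on a dataset and logs what is needed to redo them."""

    def __init__(self, dataset):
        self.initial = dataset  # initial dataset is kept unchanged
        self.current = dataset
        self.log = []

    def run(self, tool_name, tool, **params):
        """Apply one tool and record its name, parameters, and results."""
        result = tool(self.current, **params)
        self.log.append({
            "tool": tool_name,
            "params": params,
            "input": fingerprint(self.current),
            "output": fingerprint(result),
            "result": result,  # intermediate result kept for inspection
        })
        self.current = result
        return result

    def replay(self, tools):
        """Re-run the logged steps from the initial dataset and verify outputs."""
        data = self.initial
        for entry in self.log:
            data = tools[entry["tool"]](data, **entry["params"])
            if fingerprint(data) != entry["output"]:
                raise ValueError(f"step {entry['tool']!r} is not reproducible")
        return data


def drop_empty(rows, column):
    """Hypothetical cleaning tool: remove rows with an empty value in `column`."""
    return [row for row in rows if row.get(column)]


if __name__ == "__main__":
    workflow = Workflow([{"species": "Quercus robur"}, {"species": ""}])
    workflow.run("drop_empty", drop_empty, column="species")
    print(workflow.replay({"drop_empty": drop_empty}))
```

The sketch assumes that tools return new data rather than modifying their input in place, since replay starts again from the stored initial dataset.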
The iKNOW project is a joint effort by computer scientists and domain experts from the German Centre for Integrative Biodiversity Research (iDiv). As a showcase, we aim to create a KG of plant-related data sources at iDiv. These include, among others: TRY (the plant trait database) (
The resulting KG will serve as a discovery tool for biodiversity data and provide a robust infrastructure for managing biodiversity knowledge. From the biodiversity research perspective, iKNOW will contribute to creating a dataset that follows the Linked Open Data principles by interlinking it with cross-domain and domain-specific KGs. From the computer science perspective, iKNOW will contribute to developing tools for the dynamic, low-effort creation of reproducible knowledge graphs.
Keywords: biodiversity informatics, semantic web, knowledge graph platforms
Presenting author: Samira Babalou
Presented at: TDWG 2022