Biodiversity Information Science and Standards
Conference Abstract
Corresponding author: Lyubomir Penev (l.penev@pensoft.net)
Received: 08 Aug 2023 | Published: 09 Aug 2023
© 2023 Lyubomir Penev, Georgi Zhelezov, Mariya Dimitrova, Iva Boyadzhieva, Teodor Georgiev
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Penev L, Zhelezov G, Dimitrova M, Boyadzhieva I, Georgiev T (2023) OpenBiodiv for Users: Applications and Approaches to Explore a Biodiversity Knowledge Graph. Biodiversity Information Science and Standards 7: e110724. https://doi.org/10.3897/biss.7.110724
OpenBiodiv is a biodiversity database (a knowledge graph based on the Resource Description Framework, RDF) that contains information extracted from the scientific literature. It provides access to an ecosystem of tools and services, including a Linked Open Dataset, an ontology (OpenBiodiv-O) and a website.
Using the available data, OpenBiodiv discovers links between various biodiversity data types (e.g., taxon names, treatments, specimens, sequences, people and institutions) in order to answer a user’s questions about specific taxa, scientific articles, materials examined and other topics.
Full-text XML content from journals on the ARPHA Publishing Platform, together with treatments extracted by Plazi’s TreatmentBank (stored in the Biodiversity Literature Repository at Zenodo), is converted into Linked Open Data. The database is updated and indexed daily using a workflow based on the Apache Kafka event-streaming platform. The workflow was developed during the European Union-funded Biodiversity Community Integrated Knowledge Library (BiCIKL) project.
Each semantic statement (e.g., authors, articles, treatments, taxonomic names, localities) has its own globally unique, persistent and resolvable identifier (GUPRI).
There are four ways a user can explore the data on OpenBiodiv:
General search
The search engine is accessible from the OpenBiodiv homepage. The user types in a key term (e.g., a taxonomic name, authority or article title), and the system retrieves information about it. The Elasticsearch index makes the search tolerant of misspellings. The system can also determine the semantic type of the searched entity.
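The abstract does not describe how the Elasticsearch index is queried. As a minimal sketch of how misspelling tolerance is commonly obtained, the request body below uses the standard Elasticsearch `match` query with `fuzziness` set to `AUTO`. The field name `label` and the example search term are assumptions made for illustration, not details of the OpenBiodiv index.

```python
import json


def build_fuzzy_query(term: str, field: str = "label") -> dict:
    """Build an Elasticsearch query body that tolerates small misspellings.

    `field` is a hypothetical field name; the real OpenBiodiv index mapping
    is not described in the abstract.
    """
    return {
        "query": {
            "match": {
                field: {
                    "query": term,
                    # "AUTO" derives the allowed edit distance from the term length.
                    "fuzziness": "AUTO",
                }
            }
        }
    }


if __name__ == "__main__":
    # An illustrative misspelling of the genus name "Harmonia".
    print(json.dumps(build_fuzzy_query("Harmonnia"), indent=2))
```

The resulting dictionary would be sent as the JSON body of a search request against whichever index holds the entity labels.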
Application Programming Interface (API)
OpenBiodiv can be accessed programmatically through a RESTful API, which is documented on Swagger. The construction and functionality of the API follow the recommendations elaborated by the Technical Research Infrastructures forum of the BiCIKL project.
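Programmatic access of this kind usually amounts to a GET request that returns JSON. The sketch below shows that pattern with the Python standard library only. The base URL, the `/search` route and the `q` parameter are placeholders assumed for illustration. The actual routes and parameters are those listed in the Swagger documentation.

```python
import json
import urllib.parse
import urllib.request

# Placeholder base URL: an assumption, not the real OpenBiodiv API address.
BASE_URL = "https://api.example.org"


def build_search_url(term: str, base_url: str = BASE_URL) -> str:
    """Return a GET URL for a hypothetical `/search` route with a `q` parameter."""
    query = urllib.parse.urlencode({"q": term})
    return f"{base_url}/search?{query}"


def fetch_json(url: str, timeout: float = 10.0):
    """Send a GET request and decode the JSON response body."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only prints the URL; call fetch_json(url) once the real route is known.
    print(build_search_url("Harmonia axyridis"))
```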
User applications based on a query algorithm
This function can be applied to any data class. The method uses the relationships between an element type (e.g., taxon name) and the type of section in which it can be found.
An application example is Literature exploration, designed to answer the question: Give me information about X mentioned within article section type Y. The results show the number of mentions of the entity (e.g., taxon name) in the section(s) of interest (e.g., Title, Abstract, Treatment). A click navigates the user to the place in the article that mentions the item (Fig.
Back-linking from the OpenBiodiv Literature exploration result page to the respective entity in the original article, provided through the persistent identifiers in the article full-text XML (after
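The question "information about X mentioned within article section type Y" can be illustrated on a small in-memory dataset. The sketch below is not the OpenBiodiv implementation, which runs against the knowledge graph. The `Mention` record, its fields and the sample data are hypothetical and serve only to show how mentions of an entity are counted per section type.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Mention(NamedTuple):
    entity: str        # e.g. a taxon name
    section_type: str  # e.g. "Title", "Abstract", "Treatment"
    article_id: str    # hypothetical article identifier


def count_mentions(mentions: Iterable[Mention], entity: str,
                   section_types: Iterable[str]) -> Counter:
    """Count how often `entity` is mentioned in each requested section type."""
    wanted = set(section_types)
    return Counter(
        m.section_type
        for m in mentions
        if m.entity == entity and m.section_type in wanted
    )


if __name__ == "__main__":
    # Made-up records for illustration only.
    sample = [
        Mention("Harmonia axyridis", "Title", "article-1"),
        Mention("Harmonia axyridis", "Abstract", "article-1"),
        Mention("Harmonia axyridis", "Abstract", "article-2"),
        Mention("Coccinella septempunctata", "Abstract", "article-2"),
    ]
    counts = count_mentions(sample, "Harmonia axyridis",
                            ["Title", "Abstract", "Treatment"])
    print(dict(counts))
```

Section types with no mentions are simply absent from the result, and a `Counter` returns 0 when such a key is looked up.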
SPARQL queries in a thematic context
OpenBiodiv provides a SPARQL endpoint through the Ontotext GraphDB solution.
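A SPARQL endpoint can be queried with a plain HTTP request, as defined by the SPARQL 1.1 Protocol. The sketch below assumes a placeholder endpoint URL and uses a generic query that counts resources per `rdf:type`, so it does not rely on any specific OpenBiodiv-O class or property names. It builds the request, runs it and flattens the standard SPARQL JSON result format.

```python
import json
import urllib.parse
import urllib.request

# Placeholder endpoint: an assumption, not the real OpenBiodiv SPARQL address.
ENDPOINT = "https://sparql.example.org/repositories/demo"

# Generic query using only rdf:type, so it does not depend on OpenBiodiv-O terms.
QUERY = """
SELECT ?type (COUNT(?s) AS ?n)
WHERE { ?s a ?type }
GROUP BY ?type
ORDER BY DESC(?n)
LIMIT 10
"""


def build_request(query: str, endpoint: str = ENDPOINT) -> urllib.request.Request:
    """Build a SPARQL 1.1 Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def rows(results: dict) -> list:
    """Flatten a SPARQL JSON result document into a list of plain dicts."""
    return [
        {name: cell["value"] for name, cell in binding.items()}
        for binding in results["results"]["bindings"]
    ]


def run(query: str, endpoint: str = ENDPOINT) -> list:
    """Execute the query and return the flattened rows."""
    with urllib.request.urlopen(build_request(query, endpoint), timeout=30) as resp:
        return rows(json.load(resp))
```

Calling `run(QUERY, endpoint=...)` with the real endpoint address would return one dictionary per result row, with the keys `type` and `n`.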
Keywords:
biodiversity informatics, knowledge graph, SPARQL, RDF
Presenting author:
Teodor Georgiev
Presented at:
TDWG 2023
The BiCIKL project receives funding from the European Union's Horizon 2020 Research and Innovation Action under grant agreement No 101007492.
BiCIKL - Biodiversity Community Integrated Knowledge Library