Biodiversity Information Science and Standards: Conference Abstract
Corresponding author: Patrick Ruch (patrick.ruch@sib.swiss)
Received: 25 Aug 2023 | Published: 28 Aug 2023
© 2023 Emilie Pasche, Donat Agosti, Lyubomir Penev, Quentin Groom, Alexandre Flament, Julien Gobeill, Patrick Ruch
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Pasche E, Agosti D, Penev L, Groom Q, Flament A, Gobeill J, Ruch P (2023) Towards "Biodiversity PMC". Biodiversity Information Science and Standards 7: e111647. https://doi.org/10.3897/biss.7.111647
The Swiss Institute of Bioinformatics (SIB) Literature services (SIBiLS,
In the course of the BiCIKL project, SIBiLS started indexing a larger set of biodiversity-related content in the broad sense, including environmental sciences and ecology, to build a new literature database called "Biodiversity PMC". In addition to MEDLINE and PubMed Central, SIBiLS now provides a single entry point to half a million taxonomic treatments extracted by Plazi, as well as to a growing set of full-text article XMLs from Pensoft, which were not included in the original PubMed Central. The services can be accessed via a new Graphical User Interface and an OpenAPI. In addition to supporting the usual search operators (using the Apache Lucene syntax), the contents are normalized using a large collection of life sciences terminologies and ontologies. Each instance of a term (or its synonym) is normalized with a unique accession number to support a semantically richer search experience. Of particular interest for the biodiversity communities, SIBiLS contents are normalized using ENVO (Environmental Ontology). Further, taxonomic names are normalized using both the NCBI Taxonomy and the Open Tree of Life, which include names from the Catalogue of Life. The resulting data graph contains 12 billion normalized descriptors and supports access via keyword search, as well as via an original question answering interface, which can help provide new perspectives when navigating the life and health sciences. The data, in JATS (Journal Publishing Tag Set) and BioC formats, are fully available under a CC BY 4.0 licence.
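To illustrate how term normalization and Lucene-style operators might combine, here is a minimal sketch. It is not the SIBiLS implementation: the tiny lexicon, the `NCBITaxon:` prefix convention, the `annotation` field name, and both helper functions are assumptions made for illustration only. The sketch maps synonyms found in a text to a single accession number, then builds a Lucene query string that combines free keywords with that accession.

```python
import re

# Toy lexicon (illustrative assumption): lowercased surface form -> accession.
# 7460 is the NCBI Taxonomy identifier for Apis mellifera.
LEXICON = {
    "apis mellifera": "NCBITaxon:7460",
    "honey bee": "NCBITaxon:7460",
    "western honey bee": "NCBITaxon:7460",
}


def normalize(text, lexicon=LEXICON):
    """Return the sorted, de-duplicated accessions whose surface forms occur in text."""
    lowered = text.lower()
    found = set()
    for form, accession in lexicon.items():
        # Whole-word match so that a form is not matched inside a longer word.
        if re.search(r"\b" + re.escape(form) + r"\b", lowered):
            found.add(accession)
    return sorted(found)


def lucene_query(keywords, accessions, field="annotation"):
    """Build a Lucene query string.

    Keywords are ANDed together (multi-word keywords become quoted phrases);
    accessions are ORed inside a hypothetical field named by `field`.
    """
    parts = ['"{}"'.format(k) if " " in k else k for k in keywords]
    if accessions:
        ids = " OR ".join('"{}"'.format(a) for a in accessions)
        parts.append("{}:({})".format(field, ids))
    return " AND ".join(parts)


if __name__ == "__main__":
    ids = normalize("Honey bee colonies and Apis mellifera foraging")
    print(ids)
    print(lucene_query(["pollination", "climate change"], ids))
```

Because every synonym resolves to the same accession, a query on the accession retrieves documents regardless of which name the authors used, which is the behaviour the normalization step described above is meant to support.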
literature services, information retrieval, named entity recognition, question answering interface
Patrick Ruch
TDWG 2023
This project receives funding from the European Union's Horizon 2020 Research and Innovation Action under grant agreement No 101007492 (BiCIKL).
SIB Swiss Institute of Bioinformatics & HES-SO