#RetroPIDs: The missing link to the foundation of biodiversity knowledge
Nicole Kearney, Colleen Funkhouser, Mike Lichtenberg, Bess Missell, Roderic Page, Joel Richard, Diane Rielinger, Susan Lynch
The Biodiversity Heritage Library (BHL) will soon upload its 60 millionth page of open access biodiversity literature onto the BHL website and the BHL's Internet Archive Collection. The BHL’s massive repository of free knowledge includes content that is available nowhere else online, as well as accessible versions of content that are locked behind paywalls elsewhere. If we are to continue to expand our understanding of life on Earth, we must ensure that the foundation of biodiversity knowledge provided by BHL is discoverable by the tools we rely on to navigate the ever-expanding internet. These tools – search engines and their algorithms – preferentially deliver (and rank) content with good metadata and persistent identifiers (PIDs).

In modern online publishing, PID assignment and linking happen at the point of publication: DOIs (Digital Object Identifiers) for publications, ORCIDs (Open Researcher and Contributor IDs) for people, and RORs (Research Organization Registry IDs) for organisations. The DOI system provided by Crossref (the DOI registration agency for scholarly content) delivers reciprocal citations, enabling convenient clicking from article to article, and citation tracking, enabling authors and institutions to track the impact and reach of their research output. Publications that lack PIDs, which include the vast majority of legacy literature, are hard to find and sit outside the linked network of scholarly research. This makes it nearly impossible to determine whether they are being cited, let alone viewed, mentioned, shared or liked.
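
To make the three identifier types concrete, the following is a minimal Python sketch of how each PID becomes a resolvable link, together with a check of the final character of an ORCID iD (ISO 7064 MOD 11-2). It is an illustration only, not part of BHL's tooling: the function names `pid_url` and `orcid_checksum_ok` are hypothetical, and the DOI and ROR values in the usage lines are placeholders rather than real records.

```python
# Public resolver prefixes for the three PID types named in the text.
RESOLVERS = {
    "doi": "https://doi.org/",
    "orcid": "https://orcid.org/",
    "ror": "https://ror.org/",
}


def pid_url(kind: str, identifier: str) -> str:
    """Return the resolvable URL form of a persistent identifier.

    Hypothetical helper: it only concatenates strings and does not
    check that the identifier is actually registered.
    """
    return RESOLVERS[kind.lower()] + identifier.strip()


def orcid_checksum_ok(orcid: str) -> bool:
    """Check the last character of an ORCID iD using ISO 7064 MOD 11-2."""
    chars = orcid.replace("-", "")
    if len(chars) != 16 or not chars[:15].isdigit():
        return False
    total = 0
    for digit in chars[:15]:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    expected = "X" if result == 10 else str(result)
    return chars[15].upper() == expected


# Placeholder DOI and ROR values, for illustration only.
print(pid_url("doi", "10.5555/12345678"))
print(pid_url("ror", "012345678"))
# 0000-0002-1825-0097 is the sample iD used in ORCID's own documentation.
print(orcid_checksum_ok("0000-0002-1825-0097"))
```

A checksum of this kind catches typing errors in an identifier but says nothing about whether the identifier is registered; confirming that requires resolving it against the relevant registry.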

At TDWG 2020 (Page 2020, Kearney 2020, Richard 2020), 2019 (Page 2019a, Page 2019b, Kearney 2019a, Kearney 2019b) and 2018 (Kearney 2018), we emphasised the need to bring the historic biodiversity literature into the modern linked network of scholarly research. In October 2020, BHL launched a new working group to do exactly this. The BHL Persistent Identifier Working Group (Team #RetroPID) brings together expertise from across BHL’s global community. Over the past year, we have worked tirelessly to make it easier to find, cite, link, share and track the content on BHL, adding article-level metadata to journals and retrospectively assigning DOIs (#RetroPIDs). Most importantly, we have developed the tools and documentation that will enable the entire BHL community to take contributed content from “just” accessible to persistently discoverable.

This paper will detail our efforts to retrofit the historic literature (a square peg) into the modern PID system (a round hole) and will present both the achievements and the challenges of this important work.


Nicole Kearney

Presented at

TDWG 2021