Biodiversity Information Science and Standards: Conference Abstract
The Paleo Data Working Group: A model for developing and sustaining a community of practice
Erica Krimmel‡, Talia Karim§, Holly Little|, Lindsay J. Walker¶, Roger Burkhalter#, Christina Byrd¤, Amanda Millhouse|, Jessica Utrup«
‡ iDigBio, Florida State University, Tallahassee, United States of America
§ University of Colorado, Boulder, United States of America
| Smithsonian National Museum of Natural History, Washington, DC, United States of America
¶ Natural History Museum of Los Angeles, Los Angeles, United States of America
# Sam Noble Museum, University of Oklahoma, Norman, United States of America
¤ Museum of Comparative Zoology, Harvard University, Cambridge, United States of America
« Yale Peabody Museum of Natural History, New Haven, United States of America
Open Access


The Paleo Data Working Group was launched in May 2020 as a driving force for broader conversations about paleontologic data standards. Here, we present an overview of the “community of practice” model used by this group to evaluate and implement data standards such as those stewarded by Biodiversity Information Standards (TDWG). A community of practice is defined by regular and ongoing interaction among individual members who find enough value in participating that the group achieves a self-sustaining level of activity (Wenger 1998, Wenger and Snyder 2000, Wenger et al. 2002). Communities of practice are not a new phenomenon in biodiversity science, and were recommended by the recent United States National Academies report on biological collections (National Academies of Sciences, Engineering, and Medicine 2020) as a way to support workforce training, data-driven discoveries, and transdisciplinary collaboration. Our collective aim to digitize specimens and mobilize the data presents new opportunities to foster communities of practice that are circumscribed not by research agendas but rather by the need for better data management practices to facilitate research.

Paleontology collections professionals in the United States have been meeting to discuss digitization semi-consistently in both virtual and in-person spaces for nearly a decade, largely thanks to support from the iDigBio Paleo Digitization Working Group. The need for a community of practice within this group focused on data management in paleo collections became apparent at the biodiversity_next Conference in October 2019, where we realized that work being done in the biodiversity standards community was not being informed by or filtering back to digitization and data mobilization efforts occurring in the paleo collections community. A virtual workshop focused on georeferencing for paleo in April 2020 was conceived as an initial pathway to bridge these two communities and provided a concrete example of how useful it can be to interweave practical digitization experience with conceptual data standards.

In May 2020, the Paleo Data Working Group began meeting biweekly on Zoom, with discussion topics collaboratively developed, presented, and discussed by members and supplemented with invited speakers when appropriate. Topics centered on implementation of data standards (e.g., Darwin Core) by collections staff, and how standards can evolve to better represent data. An associated Slack channel facilitated continuing conversations asynchronously. Engaging domain experts (e.g., paleo collections staff) in the conceptualization of information throughout the data lifecycle helped to pinpoint issues and gaps within the existing standards and revealed opportunities for increasing accessibility. Additionally, when domain experts gained a better understanding of the information science framework underlying the data standards, they were better able to apply the standards to their own data. This critical step of standards implementation at the collections level has often been slow to follow standards development, except in the few collections that have the funds and/or expertise to do so. Overall, we found the Paleo Data Working Group model of knowledge sharing to be mutually beneficial for standards developers and collections professionals, and it has led to a community of practice where informatics and paleo domain expertise intersect with a low barrier to entry for new members of both groups.

Serving as a loosely organized voice for the needs of the paleo collections community, the Paleo Data Working Group has contributed to several initiatives in the broader biodiversity community. For example, during the 2021 public review of Darwin Core maintenance proposals, the Paleo Data Working Group shared the workload of evaluating and commenting on issues among its members. Not only was this efficient for us, but it was also effective for the TDWG review process, which sought to engage a broad audience while also reaching consensus. The Paleo Data Working Group has also served as a coordinated point of contact for adjacent and intersecting activities related to both data standards (e.g., those led by the TDWG Earth Sciences and Paleobiology Interest Group and the TDWG Collections Description Interest Group) and paleontological research (e.g., those led by the Paleobiology Database and the Integrative Paleobotany Portal project).

Sustaining activities, like those of the Paleo Data Working Group, require consideration and regular attention. Support staff at iDigBio and collections staff focusing on digitization or data projects at their own institutions, as well as a consistent pool of drop-in and occasional participants, have been instrumental in maintaining momentum for the community of practice. Socializing can also help build the personal relationships necessary for maintaining momentum. To this end, the Paleo Data Working Group Slack encourages friendly banter (e.g., the #pets-of-paleo channel), more general collections-related conversations (e.g., the #physical-space channel), and space for those with sub-interests to connect (e.g., the #morphology channel). While the group's focus is on data, individual members find it useful to network on a wide variety of topics, and this usefulness is critical to sustaining the community of practice.

As we look forward to Digital Extended Specimen concepts and exciting developments in cyberinfrastructure for biodiversity data, communities of practice like that exemplified by the Paleo Data Working Group are essential for success. Creating FAIR (Findable, Accessible, Interoperable and Reusable) data requires buy-in from data providers, such as those in the paleo collections community. Even beyond FAIR, considering CARE (Collective Benefit, Authority to Control, Responsibility, and Ethics) data means embracing participation from a broad spectrum of perspectives, including those without informatics experience. Here, we provide insight into one model for creating such buy-in and participation.


Keywords

natural history collection, paleontology, fossil, digitization, data mobilization

Presenting author

Erica Krimmel

Presented at

TDWG 2021

