What’s Missing From All the Portals?
Sharon Grant, Kate Webbink, Janeen Jones, Pete Herbst, Robert Zschernitz, Rusty Russell
‡ Field Museum, Chicago, United States of America
At time of writing, there are over 784 million occurrence records in the Global Biodiversity Information Facility (GBIF) portal, 106 million on the iDigBio site, 68 million in the Atlas of Living Australia, and 20 million in VertNet. The list of biodiversity aggregators and portals that boast occurrence counts in the millions continues to grow. Combined with sites that gather their data from outside the GBIF domain, such as The Paleobiology Database, this is compelling evidence that global digitization is starting to illuminate the black hole of biodiversity data held in collections across the world. The visibility of, and demands on, our collective natural history heritage have never been as high, and that heritage is increasingly in the spotlight with both internal and external audiences. Funding sources have moved away from massive "digitization for the sake of digitization" projects and now demand much more focused proposals. To compete in this arena, collections staff and researchers must collaborate, mine collections for their strengths, and use those strengths to justify efforts. To do this, however, they must have access to information about the non-digitized occurrence-level records in the world’s holdings.

We discuss the potential use of current TDWG standards to capture existing institutional data about undigitized collections, and about collections whose records have been marked as environmentally, culturally, or politically sensitive and so must remain digitally dark, so that portals like GBIF can use these data in a way comparable to existing occurrence records. Can Darwin Core (with its extensions), together with the Natural Collections Descriptions draft standard, be used to describe accessions, inventory-level information, and backlog estimates efficiently and effectively, and so provide even greater visibility of those undigitized occurrences? In addition, can these data also serve as a means to further refine existing digitized records?


