Biodiversity Information Science and Standards (BISS), ISSN 2535-0897, Pensoft Publishers
doi: 10.3897/tdwgproceedings.1.20486
Conference Abstract
Symposium: Biodiversity Data Quality – concepts, methods and tools

Darwin Cloud: Mapping real-world data to Darwin Core

John Wieczorek, Paul J. Morris, James Hanken, David B. Lowery, Bertram Ludäscher, James Macklin, Timothy McPhillips, Robert A. Morris, Qian Zhang

Affiliations
Museum of Vertebrate Zoology, University of California, Berkeley, United States of America
Museum of Comparative Zoology, Harvard University, Cambridge, MA, United States of America
University of Illinois Urbana-Champaign, Champaign, United States of America
Agriculture and Agri-Food Canada, Ottawa, Canada
University of Massachusetts, Boston, Boston, United States of America
Corresponding authors: John Wieczorek (tuco@berkeley.edu), Paul J. Morris (mole@morris.net).
Biodiversity Information Science and Standards 1: e20486 (21 August 2017)
Copyright: John Wieczorek, Paul J. Morris, James Hanken, David B. Lowery, Bertram Ludäscher, James Macklin, Timothy McPhillips, Robert A. Morris, Qian Zhang. This is an open access article distributed under the terms of the Creative Commons Attribution License 4.0 (CC-BY), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Since the ratification of Darwin Core (Wieczorek et al. 2012) as a TDWG standard in 2009, data publishers have struggled with an essential step in publishing and sharing data under that standard: mapping the fields in their working databases to Darwin Core terms. Doing so requires a good understanding of both the data set and Darwin Core. The accumulated knowledge about these mappings constitutes what we call the "Darwin Cloud." We will explore the nature of data mapping challenges and the potential for semi-automated solutions to them. Specifically, we will look at the "Darwinizer" actor, its use in related workflows within the Kurator data quality framework, and the implications for community-managed vocabularies.
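To make the idea of semi-automated mapping concrete, the following is a minimal sketch of a lookup-based approach, not the actual Darwinizer implementation. The table DARWIN_CLOUD, its field-name variants, and the functions normalize and darwinize are hypothetical names invented for illustration; only the Darwin Core term names on the right-hand side are taken from the standard. Field names that match no known variant are returned separately so that a person can review them.

```python
import re

# Hypothetical sample of accumulated mappings: normalized field-name
# variants (invented for illustration) -> Darwin Core term names.
DARWIN_CLOUD = {
    "lat": "decimalLatitude",
    "latitude": "decimalLatitude",
    "long": "decimalLongitude",
    "longitude": "decimalLongitude",
    "sciname": "scientificName",
    "collector": "recordedBy",
    "catno": "catalogNumber",
    "datecollected": "eventDate",
}


def normalize(field_name):
    """Lowercase the name and drop every character that is not a-z or 0-9."""
    return re.sub(r"[^a-z0-9]", "", field_name.lower())


def darwinize(header):
    """Suggest a Darwin Core term for each field name in `header`.

    Returns (mapped, unmapped): a dict of field name -> suggested term,
    and a list of field names that still need a human decision.
    """
    mapped, unmapped = {}, []
    for field in header:
        term = DARWIN_CLOUD.get(normalize(field))
        if term is None:
            unmapped.append(field)
        else:
            mapped[field] = term
    return mapped, unmapped


if __name__ == "__main__":
    mapped, unmapped = darwinize(["Lat", "Long.", "Sci_Name", "Cat. No.", "shelf"])
    print(mapped)
    print(unmapped)
```

The lookup is deliberately exact after normalization: it only suggests a term for variants that have been seen before, which is why the size and curation of the shared mapping table matter more than the matching logic.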
Keywords: Biodiversity Informatics, Darwin Core, Data Quality
Funder: National Science Foundation (http://doi.org/10.13039/100000001)
Conference: TDWG 2017 Annual Conference (TDWG 2017), 1-6 October 2017, Ottawa, Canada
Conference theme: Data Integration in a Big Data Universe: Associating Occurrences with Genes, Phenotypes, and Environments
Presenting author
John Wieczorek
Funding program
NSF DBI 1356438 and 1356751
References
Wieczorek J, Bloom D, Guralnick R, Blum S, Döring M, Giovanni R, Robertson T, Vieglais D (2012) Darwin Core: An Evolving Community-Developed Biodiversity Data Standard. PLoS ONE 7(1): e29715. http://dx.doi.org/10.1371/journal.pone.0029715