Biodiversity Information Science and Standards : Conference Abstract
Corresponding author: Leonor Venceslau (fc50637@alunos.fc.ul.pt)
Received: 14 Jun 2019 | Published: 10 Jul 2019
© 2019 Leonor Venceslau, Luis Lopes
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation: Venceslau L, Lopes LF (2019) Comparison of Automated Georeferencing Tools Using Insect Collection Data. Biodiversity Information Science and Standards 3: e37345. https://doi.org/10.3897/biss.3.37345
Major efforts are being made to digitize natural history collections to make these data available online for retrieval and analysis (
Georeferencing is a time-consuming process that requires manual validation; as a result, a significant part of the natural history collection data available online is not georeferenced. Of the 161 million records of preserved specimens currently available in the Global Biodiversity Information Facility (GBIF), only 86 million (53.4%) include coordinates. It is therefore important to develop and optimize automatic tools that allow fast and accurate georeferencing.
The objective of this work was to test existing automatic georeferencing services and evaluate their potential to accelerate the georeferencing of large collection datasets. Several open-source georeferencing services that provide an application programming interface (API) for batch georeferencing are currently available for this purpose. We evaluated five programs: Google Maps, MapQuest, GeoNames, OpenStreetMap, and GEOLocate. A test dataset of 100 records (reference dataset), which had been previously individually georeferenced following
Of the five programs tested, Google Maps returned the most results (99) and was the most accurate, with 57 results within 1000 m of the reference location and 79 within the uncertainty radius. GEOLocate returned results for 87 locations, of which 47 were within 1000 m of the reference location and 57 were within the uncertainty radius. The other three services each had fewer than 35 results within 1000 m of the reference location and fewer than 50 results within the uncertainty radius. Google Maps and OpenStreetMap had the lowest average distance from the reference location, both around 5500 m. Google Maps has a usage limit of around 40,000 free georeferencing requests per month, beyond which the service is paid, while GEOLocate is free with no usage limit. For large collections, this may be a factor to take into account.
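The comparison metrics reported above (number of results returned, results within 1000 m, results within the uncertainty radius, and average distance from the reference location) could be computed along the following lines. This is only a minimal sketch, not the procedure used in this work: the function names `haversine_m` and `score_service`, the dictionary layout, the use of great-circle (haversine) distance on a spherical Earth, and the sample coordinates are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres (spherical approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def score_service(reference, results, threshold_m=1000.0):
    """Summarise one service's output against a manually georeferenced dataset.

    reference: {record_id: (lat, lon, uncertainty_radius_m)}  (hypothetical layout)
    results:   {record_id: (lat, lon)} for the records the service resolved
    """
    distances = {}
    for rec_id, (lat, lon) in results.items():
        ref_lat, ref_lon, _ = reference[rec_id]
        distances[rec_id] = haversine_m(ref_lat, ref_lon, lat, lon)

    n = len(distances)
    return {
        "returned": n,
        # Assumption: the 1000 m cutoff is treated as inclusive here.
        "within_threshold": sum(d <= threshold_m for d in distances.values()),
        "within_uncertainty": sum(
            d <= reference[rec_id][2] for rec_id, d in distances.items()
        ),
        "mean_distance_m": sum(distances.values()) / n if n else None,
    }


# Hypothetical records, invented for illustration only (not data from this study).
reference = {
    "rec1": (38.7169, -9.1399, 500.0),
    "rec2": (41.1496, -8.6109, 2000.0),
    "rec3": (37.0194, -7.9304, 300.0),
}
results = {
    "rec1": (38.7169, -9.1399),  # identical to the reference point
    "rec2": (41.1596, -8.6109),  # 0.01 degrees of latitude north of the reference
}

summary = score_service(reference, results)
print(summary)
```

Records for which a service returns nothing (here `rec3`) lower the "returned" count but do not enter the distance statistics, so the average distance should be read alongside the number of results rather than on its own.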
In the future, we hope to optimize these methods and test them with larger datasets.
georeferencing, data digitization, natural history collections
Luis Filipe Lopes
Biodiversity_Next 2019
FCT for funds to GHTM – UID/Multi/04413/2013
FCT for funds to CE3C - UID/BIA/00329/2013
Museu Nacional de História Natural e da Ciência, Universidade de Lisboa, Lisboa, Portugal