Biodiversity Information Science and Standards: Conference Abstract
Corresponding author: Michael Dodd (michael.dodd@open.ac.uk)
Received: 07 Sep 2022 | Published: 07 Sep 2022
© 2022 Michael Dodd
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Dodd M (2022) How the Citizen Science Platform iSpot Ensures Data Accuracy During and After Collection. Biodiversity Information Science and Standards 6: e94578. https://doi.org/10.3897/biss.6.94578
The iSpot citizen science platform has been collecting biodiversity data since 2009 and includes ~900,000 observations, about half of which are from the British Isles. Our approach to ensuring data accuracy, especially of species identification, combines metadata uploaded with photographs (time and sometimes location), a reputation system, an active user community, and curation by an experienced ecologist.
Taxon identification/resolution – On iSpot, anyone can enter an observation and anyone can enter an identification (ID), but the ID that becomes ‘likely’ depends on the reputation of the user who provided it, combined with the reputation of other users who agree with it. Users gain reputation by entering IDs that other users with existing reputations agree with and that then become the likely ID. The accuracy of the system has been checked by passing a sample of observations with a likely ID to the United Kingdom (UK) national system of verification, where experts in all groups of organisms check the IDs and other aspects of observation records. The observations in Table
The number of iSpot observations verified by national or regional taxonomic experts on the irecord system in all taxonomic groups.
Verification status | Total |
Accepted | 20495 |
Queried | 109 |
Rejected | 975 |
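The counts in the table can be turned into rates with a few lines of arithmetic. The following is a minimal sketch that uses only the three totals reported above; the variable names are illustrative.

```python
# Counts copied from the verification table above.
counts = {"Accepted": 20495, "Queried": 109, "Rejected": 975}

total = sum(counts.values())

# Share of each verification outcome as a percentage of all verified observations.
shares = {status: 100 * n / total for status, n in counts.items()}

for status, pct in shares.items():
    print(f"{status}: {counts[status]} of {total} ({pct:.2f}%)")
```

On these figures, roughly 95% of the 21,579 verified observations were accepted, about 4.5% were rejected and about 0.5% were queried.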
Observer expertise – In some ways this aspect is less relevant, since the ID is often provided by others, but the observer still has to give an accurate location and time and provide good images. The community gives feedback on these aspects, especially to new users of the system.
Clarity and resolution of digital recordings – There is advice on taking suitable images and the community often provides comments and hints for improvement. This is especially an issue with fungi, which often require an image of the underside of the fruiting body as well as overall shots. Some members of the community provide example observations with 10 or more images of the specimen to illustrate all relevant aspects.
Spatial accuracy – From one point of view, we want to leave wrongly located observations in place to get a measure of the overall accuracy of the dataset. However, the community complains if observations are obviously wrongly positioned and not corrected, and it is not good to pass on wrongly located data to other organisations. There are articles and forum topics asking users to check their observation locations and suggesting how to do this; comments are posted on wrongly located observations asking for them to be corrected. As a last resort, if the user does not provide a corrected location, the curator takes on this responsibility, moving ~0.2% of observations. Errors range from simply missing a minus sign on a coordinate to slips with the mouse or pointer. Some of these can be easily corrected; others require more detective work or are impossible to correct, and such observations may be deleted from the system.
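The missing-minus-sign error mentioned above lends itself to a simple automated check. The sketch below is not iSpot's implementation: it assumes a rough bounding box is available for the place the observer named, and the `BoundingBox` type, the `suggest_sign_fix` function and the example box values are all hypothetical.

```python
from typing import NamedTuple, Optional, Tuple


class BoundingBox(NamedTuple):
    """Rough rectangle (decimal degrees) around the place named by the observer."""
    min_lat: float
    max_lat: float
    min_lon: float
    max_lon: float

    def contains(self, lat: float, lon: float) -> bool:
        return (self.min_lat <= lat <= self.max_lat
                and self.min_lon <= lon <= self.max_lon)


def suggest_sign_fix(lat: float, lon: float,
                     expected: BoundingBox) -> Optional[Tuple[float, float]]:
    """Return corrected coordinates if flipping a sign moves the point into
    the expected box. Returns None if the point is already inside the box
    or if no sign flip explains the error."""
    if expected.contains(lat, lon):
        return None
    for candidate in ((lat, -lon), (-lat, lon), (-lat, -lon)):
        if expected.contains(*candidate):
            return candidate
    return None


# Illustrative box only (roughly west Cornwall); not taken from iSpot.
west_cornwall = BoundingBox(49.9, 50.6, -5.8, -4.6)
print(suggest_sign_fix(50.27, 5.05, west_cornwall))
```

A check like this only proposes a correction; confirming it would still be up to the observer or the curator.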
To assess the accuracy of locations, the first 1007 observations were selected from a buffer of 3 miles off the United Kingdom coastline (currently there are approximately 4500 observations in this buffer). This area was chosen because errors may be easier to spot there and because the observations come from all around the coast and from many different users. Each observation was examined individually, using the location name provided to check whether it was within ~2 miles of the coastline (880), more than 2 miles away (37), or had a location name too ambiguous to tell (90).
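The check described above was done by hand. As a rough illustration of how it could be automated, the sketch below reads the 2-mile criterion as the distance between the recorded coordinates and the place named by the observer, and assumes the place name has already been converted to coordinates by some gazetteer lookup (not shown). The function names and the `None` convention for ambiguous names are hypothetical.

```python
import math
from typing import Optional, Tuple

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in miles between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # min() guards against rounding pushing the value just above 1.
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def classify_location(recorded: Tuple[float, float],
                      named_place: Optional[Tuple[float, float]],
                      threshold_miles: float = 2.0) -> str:
    """Classify an observation as 'within', 'beyond' or 'ambiguous'.

    named_place is None when the location name could not be resolved."""
    if named_place is None:
        return "ambiguous"
    distance = haversine_miles(*recorded, *named_place)
    return "within" if distance <= threshold_miles else "beyond"
```

For the sample reported above, 880 of the 1007 observations (about 87%) fell in the first class, 37 (about 4%) in the second and 90 (about 9%) in the third.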
The observations classed as greater than 2 miles from where they should be were mapped to see if there were any common issues (Fig.
Red dots are from the observation coordinates and the corresponding green dot from the placename provided. In some cases there are multiple observations at the same place.
Verification on irecord has rarely, if ever, raised a query about a location name differing from the coordinates.
Temporal accuracy – The system automatically records many aspects of the observation, including when it was uploaded, but the date of upload is often not the date of observation, so the date of observation is asked for specifically. Dates have not been checked extensively, but judging from phenology, other aspects of the image, and the date of species appearance, it is very rare for the date of observation to be wrongly recorded.
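Simple consistency checks on dates can complement this kind of manual review. The sketch below is an assumption about how such checks might look, not a description of iSpot's code: the function and its parameter names are hypothetical, and it assumes the photograph's capture date can be read from its metadata when present.

```python
from datetime import date
from typing import List, Optional


def date_warnings(observed: date, uploaded: date,
                  photo_taken: Optional[date] = None) -> List[str]:
    """Return human-readable warnings about a stated observation date."""
    warnings = []
    # An observation cannot have been made after it was uploaded.
    if observed > uploaded:
        warnings.append("observation date is after upload date")
    # If the photo carries a capture date, it should match the stated date.
    if photo_taken is not None and photo_taken != observed:
        warnings.append("observation date differs from photo metadata date")
    return warnings


print(date_warnings(date(2022, 5, 3), date(2022, 5, 1)))
```

An upload date later than the observation date is normal and raises no warning; only the reverse is flagged.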
Community involvement in curation – The community is involved in giving identifications and agreements, and in writing comments on observations and in the forum. Members also set up and run projects on particular localities or taxonomic groups, looking back through all the existing observations in these areas or taxa and checking them. The community can usually arrive at the correct identification itself, but for other aspects, such as wrong locations, members first ask the original observer and, if there is no response, ask the curator to move the observation to the correct place, if that can be deduced. It is important for the curator, and ideally the programmers, to be involved with the community so that this is a two-way process.
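The way identifications and agreements combine into a likely ID, described under taxon identification above, can be illustrated with a small sketch. The abstract does not say how reputations are combined, so the simple sum used here is an assumption, and the `Identification` class, the `likely_id` function and the reputation values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Identification:
    taxon: str
    proposer: str
    agreers: List[str] = field(default_factory=list)


def likely_id(ids: List[Identification],
              reputation: Dict[str, float]) -> Optional[str]:
    """Return the taxon whose supporters have the highest combined reputation.

    Assumption: combined reputation is the sum over the proposer and all
    agreeing users. Users without a recorded reputation count as zero."""
    best_taxon, best_score = None, 0.0
    for ident in ids:
        supporters = {ident.proposer, *ident.agreers}  # count each user once
        score = sum(reputation.get(user, 0.0) for user in supporters)
        if score > best_score:
            best_taxon, best_score = ident.taxon, score
    return best_taxon


# Hypothetical users and reputation scores.
reputation = {"novice": 1.0, "regular": 2.0, "expert": 5.0}
ids = [
    Identification("Bombus terrestris", proposer="novice", agreers=["regular"]),
    Identification("Bombus lucorum", proposer="expert"),
]
print(likely_id(ids, reputation))
```

In this example a single high-reputation user outweighs two lower-reputation users, which is the behaviour the reputation system is meant to allow.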
Keywords: biodiversity, online community, social media, reputation system, social learning
Presenting author: Michael Dodd
Presented at: TDWG 2022