Corresponding author: Paul J. Morris
What is a provider (or consumer) of biodiversity data to think when one quality assessment tool asserts that a particular problem exists in their data, while a different tool asserts that this problem is not present? Is there a problem with their data? Is there a problem with one of the tools? The Biodiversity Data Quality Task Group 2 is developing a suite of standardized descriptions of tests (validations, measures, amendments) of biodiversity data, implementations of which would be expected to provide consistent assertions about a particular data set so that input of identical data sets into two different test suite implementations will produce the same results (for some meaning of “the same”).
Development of standard test definitions is a big step in the direction of consistency, but more is needed. Clear and detailed specifications for each test will help. For example, data might have suitable quality for global change analysis if collecting dates have a temporal resolution of one year or less. One implementer's test may check whether the event date has a duration of 365 days or less, another might account for leap days, and a third might test whether the data can be unambiguously binned into single years. For some data, these implementations will produce different assertions about the same record. If the standard test specification states which of these meanings applies, then correct implementations should make identical assertions. To tell, however, whether two implementations of a suite of tests will produce the same result for identical inputs, we need two things: a set of tests (of the tests) and an understanding of what it means for results to be the same. Results of tests of scientific names are expected to change over time, and different authorities will have different opinions about a given set of scientific names. One element of “the same” is therefore an expectation that results will be the same when test implementations are run at the same time and with the same configuration, but not necessarily otherwise.
Consider tests at three levels: First, tests of the internals of a test, separate from the fitness for use framework (
Paul J. Morris
ABI
Collaborative Research: ABI Development: Kurator: A Provenance-enabled Workflow Platform and Toolkit to Curate Biodiversity Data