Biodiversity Information Science and Standards : Conference Abstract
Data Quality – Whose Responsibility is it?
Arthur D. Chapman
‡ Australian Biodiversity Information Services, Ballan, Australia
Open Access

Abstract

The quality of biodiversity data is an ongoing issue. Efforts to improve it go back at least four decades, but data quality has never been given the importance it deserves. For far too long, the push to database more and more data, regardless of its quality, has taken priority. So I pose the question: what is the use of having lots of data if (1) we don’t know what its quality is, and (2) much of it is not fit for use?

When databasing of herbarium and museum collections began in the 1970s, many taxonomists saw the data as useful only for taxonomic purposes. But as more and more data has become digitally available, the range of uses to which the data can be put has grown. It has also become increasingly important that the data held in our herbaria and museums be put to more uses, to justify ongoing support and funding.

But whose responsibility is data quality? To answer that, I turn to a general data quality principle: the difficulty and cost of improving the quality of data increase the further you move from its source. Errors are therefore best prevented or corrected as close to the source as possible, but responsibility for data quality rests with everyone in the data chain:

  • Collectors of the specimens
  • Database designers and builders
  • Data entry operators
  • Data curators and managers
  • Those responsible for exchanging/exporting the data
  • Data aggregators
  • Data publishers
  • Data users

We all have responsibilities.

So, what can we each do to play our part? We need to work together at all levels of the data chain. We need to develop systems whereby feedback on quality, wherever it comes from, can be documented and returned to the source. It is no use continually making corrections to the data down the line if those corrections never get back to the data curators and custodians. It is equally of little use if the information fed back goes nowhere and nothing is done with it.

The TDWG Data Quality Interest Group is working on setting up standards and tools to help make this possible. We have developed a Framework for Data Quality, a set of core tests for data quality, and assertions for feeding information back to custodians and forward to users, and we are beginning a process to deal with vocabularies of value for biodiversity data.
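
To make the idea of a test that produces an assertion concrete, here is a minimal sketch in Python. It is an illustration only and is not taken from the Interest Group's framework or core tests: the `Assertion` class, the test name `latitude_in_range`, the status values, and the function `check_latitude_in_range` are all hypothetical. The field names `occurrenceID` and `decimalLatitude` are Darwin Core terms, assumed here to be the keys of a record.

```python
from dataclasses import dataclass


@dataclass
class Assertion:
    """Hypothetical record of one test result, kept so it can be reported
    back to the data custodian and forward to the data user."""
    record_id: str
    test: str
    status: str   # "COMPLIANT", "NOT_COMPLIANT" or "PREREQUISITES_NOT_MET"
    comment: str


def check_latitude_in_range(record: dict) -> Assertion:
    """Check that decimalLatitude is a number between -90 and 90 inclusive."""
    test = "latitude_in_range"  # hypothetical test name
    record_id = str(record.get("occurrenceID", ""))
    raw = record.get("decimalLatitude")

    # An empty value cannot be tested; report that instead of guessing.
    if raw is None or str(raw).strip() == "":
        return Assertion(record_id, test, "PREREQUISITES_NOT_MET",
                         "decimalLatitude is empty")

    try:
        latitude = float(raw)
    except (TypeError, ValueError):
        return Assertion(record_id, test, "NOT_COMPLIANT",
                         f"decimalLatitude {raw!r} is not a number")

    if -90.0 <= latitude <= 90.0:
        return Assertion(record_id, test, "COMPLIANT",
                         "decimalLatitude is within range")
    return Assertion(record_id, test, "NOT_COMPLIANT",
                     f"decimalLatitude {latitude} is outside -90 to 90")


if __name__ == "__main__":
    # Made-up example records, not real occurrence data.
    records = [
        {"occurrenceID": "rec-1", "decimalLatitude": "-37.6"},
        {"occurrenceID": "rec-2", "decimalLatitude": "137.6"},
        {"occurrenceID": "rec-3", "decimalLatitude": ""},
    ]
    for rec in records:
        print(check_latitude_in_range(rec))
```

The point of the design is that the test does not alter the record. It only documents what was found, so the same assertion can go back to the custodian for correction at the source and forward to users judging fitness for use.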

Keywords

biodiversity, data entry, aggregators, publishers, data users, vocabularies, data quality

Presenting author

Arthur D. Chapman
