Biodiversity Information Science and Standards
Conference Abstract
Corresponding author: José Augusto Salim (joseasalim@usp.br)
Received: 01 Oct 2020 | Published: 02 Oct 2020
© 2020 José Augusto Salim, Antonio Saraiva
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Salim JA, Saraiva AM (2020) A Google Sheet Add-on for Biodiversity Data Standardization and Sharing. Biodiversity Information Science and Standards 4: e59228. https://doi.org/10.3897/biss.4.59228
For biologists and biodiversity data managers who are unfamiliar with the data standardization practices of information science, the complex software used to create standardized datasets can be a barrier to sharing data.
Since the ratification of the Darwin Core Standard (DwC) (
To provide a more "familiar" approach to data sharing using Darwin Core Archives (DwC-A), we introduce a new tool as a Google Sheet Add-on. The Add-on, called Darwin Core Archive Assistant Add-on, can be installed in the user's Google Account from the G Suite Marketplace and used in conjunction with the Google Sheets application.
The Add-on assists the mapping of spreadsheet columns/fields to DwC terms (Fig.
An example of mapping sheet columns/fields to Darwin Core terms using the Darwin Core Archive Assistant Add-on.
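The mapping step described in this caption can be sketched in a few lines of Python. This is only an illustration of the idea, not the Add-on's own code: the column headers, the `column_to_term` dictionary, and the helper names `map_header` and `term_uri` are assumptions made for the example. The Darwin Core term names and the term namespace are taken from the DwC standard.

```python
# Illustrative sketch only: the spreadsheet headers and helper names below
# are hypothetical and are not taken from the Add-on.
DWC_NS = "http://rs.tdwg.org/dwc/terms/"

# Hypothetical spreadsheet headers mapped to Darwin Core term names.
column_to_term = {
    "Species": "scientificName",
    "Date": "eventDate",
    "Lat": "decimalLatitude",
    "Lon": "decimalLongitude",
}


def map_header(header, mapping):
    """Return the DwC term for each column, or None if the column is unmapped."""
    return [mapping.get(column) for column in header]


def term_uri(term):
    """Build the full term URI that a DwC-A descriptor refers to."""
    return DWC_NS + term


header = ["Species", "Date", "Lat", "Lon", "Notes"]
mapped = map_header(header, column_to_term)
unmapped = [col for col, term in zip(header, mapped) if term is None]
```

Here `unmapped` collects the columns (in this example, "Notes") that would still need a term assigned, or would be left out of the archive.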
An example of how to create a star schema using the Darwin Core Archive Assistant Add-on: the row in the left sheet (the core sheet) with CORE_ID 2787 is linked to the two rows of the right sheet (the extension sheet), creating a one-to-many relation between sheets, following the DwC-A nomenclature.
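The one-to-many relation in this caption can be illustrated with a short Python sketch. The row contents are invented for the example (only the CORE_ID value 2787 comes from the caption), and the helpers `group_by_core_id` and `orphan_ids` are hypothetical names, not part of the Add-on.

```python
from collections import defaultdict

# Invented example rows; only the CORE_ID value 2787 comes from the caption.
core_rows = [
    {"CORE_ID": "2787", "scientificName": "Apis mellifera"},
    {"CORE_ID": "2788", "scientificName": "Bombus terrestris"},
]
extension_rows = [
    {"CORE_ID": "2787", "measurementType": "body length", "measurementValue": "12"},
    {"CORE_ID": "2787", "measurementType": "wing length", "measurementValue": "9"},
]


def group_by_core_id(rows):
    """Group extension rows under the core identifier they point to."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["CORE_ID"]].append(row)
    return dict(grouped)


def orphan_ids(core, extension):
    """Return extension CORE_ID values that match no row in the core sheet."""
    core_ids = {row["CORE_ID"] for row in core}
    return sorted({row["CORE_ID"] for row in extension} - core_ids)


linked = group_by_core_id(extension_rows)
orphans = orphan_ids(core_rows, extension_rows)
```

A check like `orphan_ids` is one plausible way to catch extension rows whose identifier does not exist in the core sheet before an archive is built.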
Generating a Darwin Core Archive using the Darwin Core Archive Assistant Add-on: users have to select the "Core Sheet" and the Row Types (Darwin Core classes recognized by GBIF as cores) of each sheet (core and extensions).
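To make the structure of the generated archive concrete, the following Python sketch assembles a minimal DwC-A: one delimited text file per sheet plus a `meta.xml` descriptor that records each file's row type and column-to-term mapping. It is a simplified sketch under stated assumptions, not the Add-on's implementation: the function names, file names, and sample rows are hypothetical, the identifier is assumed to be in the first column, and the optional metadata file is omitted.

```python
import csv
import io
import xml.etree.ElementTree as ET
import zipfile

DWC_TEXT_NS = "http://rs.tdwg.org/dwc/text/"   # namespace of the meta.xml descriptor
DWC_TERMS = "http://rs.tdwg.org/dwc/terms/"    # namespace of Darwin Core terms


def table_element(tag, id_tag, row_type, filename, terms):
    """Describe one data file: <core> uses <id>, <extension> uses <coreid>."""
    element = ET.Element(tag, {
        "rowType": DWC_TERMS + row_type,
        "encoding": "UTF-8",
        "fieldsTerminatedBy": ",",
        "linesTerminatedBy": "\\n",
        "fieldsEnclosedBy": '"',
        "ignoreHeaderLines": "1",
    })
    files = ET.SubElement(element, "files")
    ET.SubElement(files, "location").text = filename
    ET.SubElement(element, id_tag, {"index": "0"})  # identifier assumed in column 0
    for index, term in enumerate(terms, start=1):
        ET.SubElement(element, "field", {"index": str(index), "term": DWC_TERMS + term})
    return element


def to_csv(header, rows):
    """Serialize rows as comma-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def build_archive(core, extensions):
    """Return the bytes of a zip holding the data files and meta.xml."""
    archive = ET.Element("archive", {"xmlns": DWC_TEXT_NS})
    tables = [("core", "id", core)]
    tables += [("extension", "coreid", ext) for ext in extensions]
    output = io.BytesIO()
    with zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as zf:
        for tag, id_tag, table in tables:
            archive.append(table_element(
                tag, id_tag, table["row_type"], table["filename"], table["terms"]))
            zf.writestr(table["filename"],
                        to_csv([id_tag] + table["terms"], table["rows"]))
        zf.writestr("meta.xml", ET.tostring(archive, encoding="unicode"))
    return output.getvalue()


# Hypothetical sheets: an Occurrence core and a MeasurementOrFact extension.
core = {"row_type": "Occurrence", "filename": "occurrence.csv",
        "terms": ["scientificName"], "rows": [["2787", "Apis mellifera"]]}
extension = {"row_type": "MeasurementOrFact", "filename": "measurementorfact.csv",
             "terms": ["measurementType", "measurementValue"],
             "rows": [["2787", "body length", "12"], ["2787", "wing length", "9"]]}
archive_bytes = build_archive(core, [extension])
```

The choice of "Core Sheet" and Row Types mentioned in the caption corresponds here to which table is passed as `core` and to each table's `row_type` value.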
We expect that the Google Sheet Add-on introduced here, in conjunction with the Integrated Publishing Toolkit (IPT), will promote biodiversity data sharing in a standardized format. It requires minimal training and simplifies data sharing from the user's perspective, particularly for users who are unfamiliar with IPT but have historically worked with spreadsheets. Although the DwC-A generated by the Add-on still needs to be published using IPT, the Add-on provides a simpler interface (i.e., a spreadsheet) for mapping data sets to DwC than IPT does. Even though IPT includes many more features than the Darwin Core Archive Assistant Add-on, we expect that the Add-on can be a "starting point" for users unfamiliar with biodiversity informatics before they move on to more advanced data publishing tools. On the other hand, Zenodo integration allows users to share and cite their standardized data sets without publishing them via IPT, which can be useful for users without access to an IPT installation. Additionally, we are working on new features: future releases will include the automatic generation of Globally Unique Identifiers for shared records, support for additional data standards and DwC extensions, and integration with the GBIF and IPT REST APIs.
Keywords: data sharing tool, Darwin Core, Darwin Core Archive, biodiversity informatics, spreadsheet
José Augusto Salim
TDWG 2020