Biodiversity Information Science and Standards : Standards
Corresponding author: Quentin Groom (
Academic editor: Gail Kampmeier
Received: 08 Jul 2019 | Accepted: 28 Sep 2019 | Published: 10 Oct 2019
This is an open access article distributed under the terms of the CC0 Public Domain Dedication.
Citation: Groom Q, Desmet P, Reyserhove L, Adriaens T, Oldoni D, Vanderhoeven S, Baskauf SJ, Chapman A, McGeoch M, Walls R, Wieczorek J, Wilson JR.U, Zermoglio PFF, Simpson A (2019) Improving Darwin Core for research and management of alien species. Biodiversity Information Science and Standards 3: e38084.
To improve the suitability of the Darwin Core standard for the research and management of alien species, the standard needs to express the native status of organisms, how well established they are and how they came to occupy a location. To facilitate this, we propose:
1. To adopt a controlled vocabulary for the existing Darwin Core term dwc:establishmentMeans
2. To elevate the pathway term from the Invasive Species Pathways extension to become a new Darwin Core term dwc:pathway maintained as part of the Darwin Core standard
3. To adopt a new Darwin Core term dwc:degreeOfEstablishment with an associated controlled vocabulary
These changes to the standard will allow users to clearly state whether an occurrence of a species is native to a location or not, how it got there (pathway), and to what extent the species has become a permanent feature of the location. By improving Darwin Core for capturing and sharing these data, we aim to improve the quality of occurrence and checklist data in general and to increase the number of potential uses of these data.
establishment means, invasive species, non-native, biodiversity, data standards, Essential Biodiversity Variables, invasion pathway, invasion stage
To improve the management and reduce the spread of alien species, data are needed on an ongoing basis on the current occurrences of those species, their statuses, how they are spreading and where they originated (
Occurrences of biodiversity, including alien species, are primarily communicated using the Darwin Core (dwc) standard, notably by the Global Biodiversity Information Facility (GBIF). The Darwin Core standard is a collection of terms and definitions that describe taxa and their occurrence in nature (
If alien species monitoring and research are to be made routine and reliable then data collection needs to be standardized and data handling and aggregation must be automated. Therefore, standards and formats need to converge to capture relevant information and simplify this process, or, at least, there should be an overall framework onto which the current multitude of structures and values can be mapped.
Improved data interoperability would accelerate the process of biodiversity monitoring, reduce the time to produce actionable evidence, and also reduce the costs. In addition to monitoring invasive species, similar situations exist in, for example, the assessment of conservation status (
Basic pieces of information are required for risk assessment, horizon scanning, species management and monitoring. In previous work, we identified four species properties that are needed. These are the introduction pathway, the degree of establishment, the species status and the impact mechanism (
Although impact mechanism was identified as important in all these studies, it is not treated here, because it is derived information about many aspects of the organism’s biology and thus not generally included in original occurrence records. Therefore, we focus on the introduction pathway, the degree of establishment and the species status. We also divide species status into two concepts, firstly whether the taxon is present or absent and secondly whether the taxon is native or alien (non-native). It should also be noted that the term "invasive species" is a source of confusion. In the biological sense, it refers to any species that is rapidly extending its range. However, its definition from a political perspective, notably in the Convention on Biological Diversity, restricts the term to those alien species that may have a negative impact (
A recurrent issue when considering alien species data types is their scope. An invasion is ultimately a population-level phenomenon. A species can be classified as introduced to a particular region only if individuals have been brought in and are present outside of their native range. If such individuals reproduce and spread, then the population (or populations) in that locality may be considered “invasive”. This means that from the perspective of a particular country, there might be both alien and native populations of a species present. Furthermore, this issue of scope also pertains to time, as populations can expand, shrink, become extinct and be reintroduced at different periods.
For any given place and period of time, we need basic information to answer at least the following four questions (cf.
Information to answer these questions is frequently collected in species checklists and occurrence observations datasets, and published to GBIF using appropriate standard terms in the Darwin Core (
Many Darwin Core terms were created to describe the details of biological specimens (e.g. dwc:sex). Specimens frequently consist of all or part of a single organism from a single location on a single collection event, sometimes referred to as a gathering. Darwin Core terms usually also perform well when applied to field observations, though the application of certain terms becomes more difficult when field observations and some specimens consist of multiple individuals. In recent years, Darwin Core has become more frequently used for ecological survey data (
Darwin Core provides the essential elements of an observation; however, several extensions have been created to expand the data that can be incorporated (e.g.
In the following section, details of the proposed changes to Darwin Core are explained.
Currently, dwc:establishmentMeans is defined in the Darwin Core documentation as “The process by which the biological individual(s) represented in the Occurrence became established at the location.” (
The vocabulary recommended by GBIF for dwc:establishmentMeans includes the categories and subcategories in Table
A proposed controlled vocabulary for dwc:establishmentMeans based on the vocabularies used by GBIF and the International Union for Conservation of Nature (IUCN) to express whether a species is native or alien. Hierarchical levels are indicated with colons, synonyms are in parentheses. Appropriate URIs will be assigned upon adoption of the controlled vocabulary.
GBIF establishmentMeans | IUCN origin | Proposed human readable label for establishmentMeans | Proposed controlled value string for establishmentMeans |
native (indigenous, reintroduced) | native | native (indigenous) | native |
| reintroduced | native: reintroduced | nativeReintroduced |
introduced (exotic, alien) | introduced | introduced (alien, exotic, non-native, nonindigenous) | introduced |
introduced: naturalised | | | |
introduced: invasive | | | |
introduced: managed (cultivative, captive) | | | |
| assisted colonisation | introduced: assisted colonisation | introducedAssistedColonisation |
| vagrant | vagrant (casual) | vagrant |
uncertain (unknown) | origin uncertain | uncertain (unknown, cryptogenic) | uncertain |
Unlike for many fields in Darwin Core, GBIF encourages conformity in establishmentMeans by flagging records as "distribution invalid" if the value is not in the GBIF vocabulary for this term. GBIF also uses a lookup dictionary to map some unambiguous variant values to values found in the vocabulary (Suppl. material
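The kind of lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the controlled values are the proposed value strings from the establishmentMeans vocabulary table above, the synonym dictionary is a hypothetical one built from the parenthesised synonyms in that table, and the function name `normalise_establishment_means` is invented for this example. It does not reproduce the actual GBIF dictionary.

```python
from typing import Optional

# Proposed controlled value strings for dwc:establishmentMeans
# (taken from the vocabulary table in this paper).
PROPOSED_VALUES = {
    "native",
    "nativeReintroduced",
    "introduced",
    "introducedAssistedColonisation",
    "vagrant",
    "uncertain",
}

# HYPOTHETICAL synonym lookup, built only from the synonyms listed in the
# table. The real GBIF lookup dictionary is larger and is not shown here.
SYNONYMS = {
    "indigenous": "native",
    "reintroduced": "nativeReintroduced",
    "alien": "introduced",
    "exotic": "introduced",
    "non-native": "introduced",
    "nonindigenous": "introduced",
    "assisted colonisation": "introducedAssistedColonisation",
    "casual": "vagrant",
    "unknown": "uncertain",
    "cryptogenic": "uncertain",
    "origin uncertain": "uncertain",
}


def normalise_establishment_means(raw: str) -> Optional[str]:
    """Return a controlled value for `raw`, or None if it cannot be interpreted."""
    key = raw.strip().lower()
    # Case-insensitive match against the controlled value strings themselves.
    by_lower = {value.lower(): value for value in PROPOSED_VALUES}
    if key in by_lower:
        return by_lower[key]
    # Otherwise fall back to the synonym lookup; unknown values give None.
    return SYNONYMS.get(key)
```

A value that returns `None` here would correspond to one that an aggregator might flag rather than silently accept, for example a breeding status entered in the wrong field.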
The term dwc:establishmentMeans is well entrenched in the biodiversity informatics community and is widely used and validated (e.g.
As dwc:establishmentMeans and its vocabulary are frequently used, deprecating it would either result in confusion or be ignored by the community. A more helpful approach is to maintain backward compatibility of the use of dwc:establishmentMeans, while augmenting the vocabulary with additional terms, deprecating redundant terms and providing an additional Darwin Core term to express the degree to which a taxon is established. Preexisting data in GBIF with an establishmentMeans of "naturalised", "invasive" or "managed" could be mapped to the term proposed below, degreeOfEstablishment.
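As a rough illustration of such a migration, the sketch below splits a legacy establishmentMeans value into the two terms. The target degreeOfEstablishment values are assumptions made for this example, since the text only states that a mapping could be made; "managed" is deliberately left unmapped because it could correspond to either "captive" or "cultivated". The function name `migrate_record` is hypothetical.

```python
from typing import Dict, Optional

# ASSUMPTION: these target values are illustrative only. The paper says the
# legacy values could be mapped to degreeOfEstablishment but does not fix
# the exact correspondence.
LEGACY_TO_DEGREE: Dict[str, Optional[str]] = {
    "naturalised": "established",
    "invasive": "invasive",
    "managed": None,  # ambiguous: could be "captive" or "cultivated"
}


def migrate_record(record: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of `record` with a legacy establishmentMeans split in two."""
    migrated = dict(record)
    legacy = record.get("establishmentMeans", "").strip().lower()
    if legacy in LEGACY_TO_DEGREE:
        # The legacy values are subcategories of "introduced" in the GBIF
        # vocabulary, so the parent category is kept in establishmentMeans.
        migrated["establishmentMeans"] = "introduced"
        degree = LEGACY_TO_DEGREE[legacy]
        if degree is not None:
            migrated["degreeOfEstablishment"] = degree
    return migrated
```

Records whose establishmentMeans is already a top-level value such as "native" pass through unchanged, which is what keeps the approach backward compatible.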
“A statement about whether an organism or organisms have been introduced to a given place and time through the direct or indirect activity of modern humans.”
The concept of nativeness is fluid and depends upon the temporal, taxonomic and geographic perspective. We refer to modern humans here to avoid defining nativeness within the definition of dwc:establishmentMeans, but also to acknowledge that these terms refer to comparatively recent biogeographic changes.
The dwc:occurrenceStatus is defined in the Darwin Core standard as “A statement about the presence or absence of a Taxon at a Location” (
This term helps us answer our question as to whether an organism occurs in a defined location and time frame. To express the absence of a dwc:Organism, dwc:occurrenceStatus should only be used where there are defined temporal and spatial boundaries. An assertion of absence has no meaning or use for specimens or point observations where presence is explicit (
Nevertheless, presence and absence are particularly useful when bounded by a time period and location. As absence can never be proven, it can only ever be derived from a reasoned analysis of the evidence, and this has to be bounded. Darwin Core terms suitable for establishing these limits are found under categories Event (e.g. dwc:eventDate) and Location (e.g. dwc:country).
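The requirement that absences be bounded can be expressed as a simple record check. The following is a minimal sketch, assuming records are plain dictionaries keyed by Darwin Core term names; the function name `absence_problems` and the choice of which terms count as boundaries (dwc:eventDate for time, dwc:country or dwc:locality for space) are assumptions for illustration, not rules from the standard.

```python
from typing import Dict, List

# ASSUMPTION: which terms count as temporal and spatial boundaries is a
# choice made for this sketch; other Event and Location terms could serve.
TEMPORAL_TERMS = ("eventDate",)
SPATIAL_TERMS = ("country", "locality")


def absence_problems(record: Dict[str, str]) -> List[str]:
    """List reasons why an asserted absence is not properly bounded."""
    problems: List[str] = []
    status = record.get("occurrenceStatus", "").strip().lower()
    if status != "absent":
        return problems  # only absences need explicit boundaries here
    if not any(record.get(term, "").strip() for term in TEMPORAL_TERMS):
        problems.append("absence lacks a temporal boundary (e.g. eventDate)")
    if not any(record.get(term, "").strip() for term in SPATIAL_TERMS):
        problems.append("absence lacks a spatial boundary (e.g. country)")
    return problems
```

The function reports problems rather than rejecting records, so a publisher can decide whether an unbounded absence should be corrected or simply documented.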
dwc:occurrenceStatus is useful because, combined with dwc:establishmentMeans, it allows the user to express whether an organism is native or alien to an area and whether it still exists there. Yet currently, dwc:occurrenceStatus is not universally used on GBIF, or it is mistakenly used to express different types of information, such as the breeding status of birds or the IUCN threat status of the organism. For breeding status, the term dwc:reproductiveCondition is more appropriate, and for threat status the term "threatStatus" is available in the Species Distribution extension ( Darwin Core extensions have been created to provide additional functionality for specific communities and to allow more experimentation with terms outside the formal governance of the standard (
Pathways are the means by which invasive species surmount the biogeographic barriers to dispersal and are introduced into new places. Some of these pathways are literal pathways to introduction, such as waterways and bridges, while others are figurative pathways, such as agricultural and trading practices. It is also worth noting that multiple alien and native species are dispersed through individual pathways, though this is less evident in the case of native species where individuals arrive at a destination where their species is already present. Even if a species has already established, policies to eliminate its pathway stop other species from using the same route to introduction. Therefore, improved information on introduction pathway informs policy on trade, agriculture and environmental management (
The species introduction term "pathway" is only available through the Invasive Species Pathways extension to Darwin Core ( However, we argue that this knowledge is so fundamental to biodiversity information that it needs to be part of the Darwin Core standard, classified under the class Occurrence, as a term dwc:pathway. It should also be added to the Species Distribution extension so that it can be used in taxon-based checklists. Pathway information is not only relevant to alien species, but to any taxon, native or alien.
“The process by which an Organism came to be in a given place at a given time.”
A summary of the pathways categorisation scheme reproduced with permission from
In the current and proposed vocabulary for dwc:establishmentMeans there is the explicit recognition that the occurrence of an organism can be either temporary or established. A bird may be blown off course and occur fleetingly in an area, or a seedling may germinate in an unsuitable place only to be killed a few weeks later by the conditions in that habitat, such as frost or drought. Likewise there are those organisms that are so well established that they reproduce and increase in range. Between these two extremes are different degrees of establishment. In this middle ground there are those organisms that persist in a location with no reproduction, others that reproduce, but do not have a significant population increase, and others that might reach high local densities but do not spread. There are, in essence, different routes to commonness (
Currently, Darwin Core lacks an independent term to express degree of establishment. The closest term is "invasiveness" from the Invasive Species Distribution extension, but it has a limited vocabulary and, because it is restricted to invasive species, is of limited use. The vocabulary consists of four terms (invasive, notInvasive, uncertain and unspecified) and was created by the IUCN Species Survival Commission Invasive Species Specialist Group (
In the case of introduced organisms,
Proposed controlled vocabulary for dwc:degreeOfEstablishment adapted from
Important Note: The definition of an invasive species by the Convention on Biological Diversity (and others) is restricted to those species that may cause economic or environmental harm or adversely affect human health. We use the term invasive here in the broader biological sense of the word.
category |
definition |
Proposed label and controlled value string |
A |
Not transported beyond limits of native range |
native |
B1 |
Individuals in captivity or quarantine (i.e. individuals provided with conditions suitable for them, but explicit measures of containment are in place) |
captive |
B2 |
Individuals in cultivation (i.e. individuals provided with conditions suitable for them, but explicit measures to prevent dispersal are limited at best) |
cultivated |
B3 |
Individuals directly released into novel environment |
released |
C0 |
Individuals released outside of captivity or cultivation in a location, but incapable of surviving for a significant period |
failing |
C1 |
Individuals surviving outside of captivity or cultivation in a location, no reproduction |
casual |
C2 |
Individuals surviving outside of captivity or cultivation in a location, reproduction is occurring, but population not self-sustaining |
reproducing |
C3 |
Individuals surviving outside of captivity or cultivation in a location, reproduction occurring, and population self-sustaining |
established |
D1 |
Self-sustaining population outside of captivity or cultivation, with individuals surviving a significant distance from the original point of introduction |
colonising |
D2 |
Self-sustaining population outside of captivity or cultivation, with individuals surviving and reproducing a significant distance from the original point of introduction |
invasive |
E |
Fully invasive species, with individuals dispersing, surviving and reproducing at multiple sites across a greater or lesser spectrum of habitats and extent of occurrence |
widespreadInvasive |
It is recognised that the scheme of
The degreeOfEstablishment term and its suggested vocabulary are proposed to be added to the Darwin Core standard, classified under the class Occurrence.
“The degree to which an Organism survives, reproduces, and expands its range at the given place and time.”
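To make the ordering of the proposed vocabulary concrete, the sketch below encodes the category codes and labels from the table above and walks through categories C0 to D2 as a sequence of yes/no questions. The `PopulationEvidence` fields and the function `degree_outside_captivity` are hypothetical names introduced for this example; deciding what counts as a "significant" period or distance is left to the data provider, and categories A, B1 to B3 and E are not decided by the function.

```python
from dataclasses import dataclass

# Category codes and proposed controlled value strings, as in the table above.
DEGREE_BY_CATEGORY = {
    "A": "native",
    "B1": "captive",
    "B2": "cultivated",
    "B3": "released",
    "C0": "failing",
    "C1": "casual",
    "C2": "reproducing",
    "C3": "established",
    "D1": "colonising",
    "D2": "invasive",
    "E": "widespreadInvasive",
}


@dataclass
class PopulationEvidence:
    """HYPOTHETICAL summary of what is known about a population outside
    captivity or cultivation at a given place and time."""
    survives: bool
    reproduces: bool
    self_sustaining: bool
    survives_far_from_introduction: bool
    reproduces_far_from_introduction: bool


def degree_outside_captivity(evidence: PopulationEvidence) -> str:
    """Pick a value from categories C0 to D2 by stepping through the scheme."""
    if not evidence.survives:
        return DEGREE_BY_CATEGORY["C0"]
    if not evidence.reproduces:
        return DEGREE_BY_CATEGORY["C1"]
    if not evidence.self_sustaining:
        return DEGREE_BY_CATEGORY["C2"]
    if not evidence.survives_far_from_introduction:
        return DEGREE_BY_CATEGORY["C3"]
    if not evidence.reproduces_far_from_introduction:
        return DEGREE_BY_CATEGORY["D1"]
    return DEGREE_BY_CATEGORY["D2"]
```

Each question is only asked once the previous one has been answered positively, which mirrors the way the categories build on one another from survival, through reproduction, to spread.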
These proposed changes to Darwin Core have been tested on, and informed by, real data. Below are three examples where we have used these terms and vocabularies in datasets published to GBIF. A zoological example has also been published by
The Manual of the Alien Plants of Belgium is a regularly updated checklist of all of the non-indigenous plants that have been found in Belgium, including those that have subsequently become extinct and those that only casually occur there (
Example data from the Manual of Alien Plants of Belgium (
Taxon |
M/I |
FR |
fl |
br |
wa |
D/N |
V/I |
Sambucus canadensis L. |
D |
1972 |
2017 |
X |
Cas. |
Hort. |
Verbesina alternifolia (L.) Britton |
D |
1984 |
N? |
X |
Nat.? |
Hort. |
Hornungia procumbens (L.) Hayek |
A |
<1850 |
<1850 |
? |
? |
? |
Cas. |
? |
Bothriochloa ischaemum (L.) Keng |
A |
1813 |
1916 |
X |
X |
Ext. |
Wool, Ore |
Each entry for the Manual of Alien Plants of Belgium describes the existence of one non-native taxon in Belgium. It gives information on the species introduction status over a period of time, from the first year that it was recorded to the present day. It also gives regional information within Belgium. To convert this into a Darwin Core Archive checklist, a taxon file is created with one record for each entry in the checklist (Table
The relevant Darwin Core Archive taxon core created from the Manual of Alien Plants of Belgium data in Table
taxonID |
scientificName |
alien-plants-belgium:taxon:03206f4a769c6649658ab96839e8a016 |
Sambucus canadensis L. |
alien-plants-belgium:taxon:318b79c7d62889c229128c57e61973c7 |
Verbesina alternifolia (L.) Britton |
alien-plants-belgium:taxon:b27d5b74783b9add7bd6747773e91fab |
Hornungia procumbens (L.) Hayek |
alien-plants-belgium:taxon:fe1d6bc47b13c9123410610d893a17cb |
Bothriochloa ischaemum (L.) Keng |
The relevant Darwin Core distribution extension fields created from the Manual of Alien Plants of Belgium in Table
taxonID |
locality |
occurrenceStatus |
establishmentMeans |
eventDate |
pathway |
degreeOfEstablishment |
alien-plants-belgium:taxon:03206f4a769c6649658ab96839e8a016 |
Flemish Region |
present |
introduced |
1972/2017 |
horticulture |
casual |
alien-plants-belgium:taxon:03206f4a769c6649658ab96839e8a016 |
Belgium |
present |
introduced |
1972/2017 |
horticulture |
casual |
alien-plants-belgium:taxon:318b79c7d62889c229128c57e61973c7 |
Flemish Region |
present |
introduced |
1984/2018 |
horticulture |
established |
alien-plants-belgium:taxon:318b79c7d62889c229128c57e61973c7 |
Belgium |
present |
introduced |
1984/2018 |
horticulture |
established |
alien-plants-belgium:taxon:b27d5b74783b9add7bd6747773e91fab |
Flemish Region |
doubtful |
introduced |
casual |
alien-plants-belgium:taxon:b27d5b74783b9add7bd6747773e91fab |
Walloon Region |
doubtful |
introduced |
casual |
alien-plants-belgium:taxon:b27d5b74783b9add7bd6747773e91fab |
Brussels-Capital Region |
doubtful |
introduced |
casual |
alien-plants-belgium:taxon:b27d5b74783b9add7bd6747773e91fab |
Belgium |
doubtful |
introduced |
casual |
alien-plants-belgium:taxon:fe1d6bc47b13c9123410610d893a17cb |
Flemish Region |
present |
introduced |
alien-plants-belgium:taxon:fe1d6bc47b13c9123410610d893a17cb |
Walloon Region |
present |
introduced |
alien-plants-belgium:taxon:fe1d6bc47b13c9123410610d893a17cb |
Belgium |
present |
introduced |
1813/1916 |
contaminantOnAnimals | containerBulk |
alien-plants-belgium:taxon:fe1d6bc47b13c9123410610d893a17cb |
Flemish Region |
absent |
introduced |
alien-plants-belgium:taxon:fe1d6bc47b13c9123410610d893a17cb |
Walloon Region |
absent |
introduced |
alien-plants-belgium:taxon:fe1d6bc47b13c9123410610d893a17cb |
Belgium |
absent |
1916/2018 |
In the Manual, the first and last observation date refer to Belgium as a whole, but there is no information on the first and last observations for Flanders, Wallonia and Brussels. We had the option of either supplying no temporal boundaries for these entries or providing the same dates as for Belgium as a whole. We concluded that it was better not to provide dates for Flanders, Wallonia and Brussels, rather than give misleading information.
The Catalogue of the Rust Fungi of Belgium is a static checklist published in print in 2009 (
The relevant Darwin Core Archive taxon core and terms created from the Catalogue of the Rust Fungi of Belgium in Fig.
taxonID |
scientificName |
uredinales-belgium-checklist:taxon:e82e5bb9f24dc198819ebfc25068ae51 |
Frommeëlla mexicana (Mains) J.W. McCain & J.F. Hennen |
uredinales-belgium-checklist:taxon:8b039e480746ec727316c1ad56ed8759 |
Uromyces croci Pass. |
uredinales-belgium-checklist:taxon:437376fa8fa57a92cfb2ab61d4b093f1 |
Duchesnea indica (Jacks.) Focke |
uredinales-belgium-checklist:taxon:8867819f38b85d4669981ee9e32c9851 |
Crocus biflorus Mill. |
uredinales-belgium-checklist:taxon:df3c9aaaf6c930d84f6a4073a6a01e7b |
Puccinia argentata (Schultz) G. Winter |
uredinales-belgium-checklist:taxon:0c7f30a0959d9f5fcb53e63454e9957a |
Adoxa moschatellina L. |
The relevant Darwin Core distribution extension and terms created from the Catalogue of the Rust Fungi of Belgium in Fig.
taxonID |
locality |
occurrenceStatus |
establishmentMeans |
eventDate |
uredinales-belgium-checklist:taxon:e82e5bb9f24dc198819ebfc25068ae51 |
Belgium |
present |
introduced |
2007-06-08/2007-06-12 |
uredinales-belgium-checklist:taxon:8b039e480746ec727316c1ad56ed8759 |
Belgium |
doubtful |
introduced |
1876/1876 |
uredinales-belgium-checklist:taxon:df3c9aaaf6c930d84f6a4073a6a01e7b |
Belgium |
present |
native |
1898-08/1995-04-30 |
The relevant Darwin Core Archive resourceRelationship extension terms were created from the Catalogue of the Rust Fungi of Belgium in Fig.
resourceID |
relatedResourceID |
relationshipOfResource |
uredinales-belgium-checklist:taxon:437376fa8fa57a92cfb2ab61d4b093f1 |
uredinales-belgium-checklist:taxon:e82e5bb9f24dc198819ebfc25068ae51 |
parasite of |
uredinales-belgium-checklist:taxon:8867819f38b85d4669981ee9e32c9851 |
uredinales-belgium-checklist:taxon:8b039e480746ec727316c1ad56ed8759 |
parasite of |
uredinales-belgium-checklist:taxon:0c7f30a0959d9f5fcb53e63454e9957a |
uredinales-belgium-checklist:taxon:df3c9aaaf6c930d84f6a4073a6a01e7b |
parasite of |
These are observations based upon those from
Examples of how the proposed vocabularies could be used with observations of native and alien species. These are single observations taken from survey events of a 1 km² grid square made over several hours on a single day. Full occurrence data, including the dates and coordinates, are available from
occurrenceID | scientificName | basisOfRecord | establishmentMeans | occurrenceStatus | pathway | degreeOfEstablishment |
2cd4p9h.24p5hq | Aesculus hippocastanum L. | HUMAN_OBSERVATION | introduced | present | ornamentalNonHorticulture | cultivated |
2cd4p9h.7bt1vc | Cerastium fontanum Baumg. | HUMAN_OBSERVATION | native | present | native | |
2cd4p9h.7qp79k | Cochlearia danica L. | HUMAN_OBSERVATION | introduced | present | naturalDispersal | invasive |
2cd4p9h.75ycnf | Heracleum mantegazzianum Sommier & Levier | HUMAN_OBSERVATION | introduced | present | horticulture | invasive |
2cd4p9h.7bt1ea | Oxalis acetosella L. | HUMAN_OBSERVATION | native | present | native | |
2cd4p9h.amdvmg | Pinus sylvestris L. | HUMAN_OBSERVATION | native | present | forestry | released |
2cd4p9h.83f16f | Rhododendron ponticum L. | HUMAN_OBSERVATION | introduced | present | horticulture | established |
2cd4p9h.62bx7w | Sanicula europaea L. | HUMAN_OBSERVATION | native | present | native | |
2cd4p9h.b2ncby | Solanum lycopersicum L. | HUMAN_OBSERVATION | vagrant | present | foodContaminant | casual |
We have reviewed the definition and controlled vocabulary of the existing Darwin Core term dwc:establishmentMeans. Though its current definition and vocabulary present some difficulties for use, we feel that it is best to retain it as a term in Darwin Core, but provide a more precise definition and update the vocabulary. This will allow data to be backwardly compatible and to better answer a broader range of questions.
We have also proposed the creation of the term dwc:pathway in Darwin Core rather than using the non-standard term "pathway" from the in-development Invasive Species Pathways extension. This will make the term mainstream and expand its use to taxa beyond invasive species. It will also allow us to better track how humans are altering the distribution of many organisms. Finally, we propose the new term dwc:degreeOfEstablishment to answer the question of how well established a taxon is at a given time and place, and we propose a controlled vocabulary for this term. These proposals are summarized in Table
A summary of proposed Darwin Core changes.
Term |
Proposals for term |
Proposal for vocabulary |
dwc:establishmentMeans |
Retain term and refine definition (table 1) |
Update vocabulary |
dwc:pathway |
Promote pathway term in Invasive Species Pathways extension to the Darwin Core standard, classified under the class Occurrence |
Maintain current recommended vocabulary |
dwc:degreeOfEstablishment |
Add the term to the Darwin Core standard, classified under the class Occurrence |
Adopt a modified vocabulary based on |
To explain how these proposed changes to Darwin Core and its extensions can improve data sharing in the invasive species community, we also presented three use cases where sharing data through GBIF could be simplified by implementing these proposed changes to Darwin Core.
These proposals have emerged from several years of discussion in a number of fora and we are grateful to all those who have taken part. Some of these are mentioned below.
Alien Challenge COST Action European Information System for Alien Species - WG4: Data standardisation and harmonisation: Chuck Bargeron, Ana Cristina Cardoso, Niki Chartosia, Fabio Crocetta, Keith Douce, Anna Gazda, Milka Glavendekic, Alberto Inghilesi, Jana Medvecka, Jan Pergl, Olivera Petrovic-Obradovic, Jodey Peyton, Gareth Richards, Helen Roy, Elena Tricarico & Katharine Turvey.
GBIF - Task Group on Data Fitness for Use in Research on Invasive Alien Species: Shyama Pagad, Varos Petrosyan, Gregory Ruiz & Dmitry Schigel.
Biodiversity Information Standards Meetings (2016–2018): Lee Belbin, Matthew Blissett, Dimitry Brosens, Pier Luigi Buttigieg, Robert Guralnick, Niels Klazenga, Joel Sachs & Aaron Wilton
Funded under the Belgian Science Policies Brain program, contract number BR/165/A1/TrIAS. Quentin Groom also acknowledges the Fonds Wetenschappelijk Onderzoek – Vlaanderen for the travel support it gave. The work of Peter Desmet, Lien Reyserhove and Damiano Oldoni is partially funded by Research Foundation - Flanders (FWO) as part of the Belgian contribution to LifeWatch.
Distinct values for dwc:establishmentMeans and their frequency from observations on the Global Biodiversity Information Facility on 27 February 2017. Taken from GitHub repository of the Darwin Core Questions & Answers Site (
A tab-delimited file mapping values (synonyms; orthographic and language variations) found in Darwin Core dwc:establishmentMeans to a controlled vocabulary.
The Convention on Biological Diversity pathway vocabulary adapted from Harrower et al. 2017. Including proposed simple labels for these terms.