Biodiversity Information Science and Standards: Conference Abstract
Corresponding author: Stanley Blum (stanblum@gmail.com)
Received: 15 Nov 2024 | Published: 15 Nov 2024
© 2024 Stanley Blum
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Blum S (2024) Use Cases Help to Identify Primary Concepts in Biodiversity Information Modeling. Biodiversity Information Science and Standards 8: e141876. https://doi.org/10.3897/biss.8.141876
“Use cases” are “a methodology used in system analysis to identify, clarify and organize system requirements” (
By 2008, the TDWG Technical Architecture Group had begun recommending that TDWG develop its standards in the framework of the semantic web (SW); i.e., RDF. Two efforts have contributed significantly to casting the terms of Darwin Core in RDF.
Darwin-SW included explicit recognition of IndividualOrganism, Occurrence, Event, and Token (i.e., evidence of a dwc:MaterialSample or an observation). The simplest use case, in which an organism is collected or observed once, doesn't require that Organism, Occurrence and MaterialSample/Observation be recognized as separate entities. The relationships are one-to-one-to-one, so the three can be joined into a single entity with only one identifier (e.g., materialSampleID) without loss of information. Note that a specimen or observation implies the existence of an organism and its occurrence in nature. The use cases that require separating Organism and possibly Occurrence from the MaterialSample or Observation are the ones where an organism is sampled or observed, remains in nature, and is subsequently sampled or observed again; i.e., the organism is the target of more than one dwc:Event. These cases require that the organism can be reliably identified as the same organism encountered earlier; e.g., by tag, identifying marking, DNA fingerprint, or precise and fixed location for sessile organisms.
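The difference between the simplest use case and the repeated-encounter use case can be sketched in code. The following is a minimal illustration only and is not part of Darwin-SW or Darwin Core; the names `Encounter`, `events_per_organism`, and `needs_separate_organism`, and all identifiers in the sample data, are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Encounter:
    """One MaterialSample or Observation, linked to an Organism and an Event."""
    evidence_id: str   # hypothetical stand-in for, e.g., a materialSampleID
    organism_id: str
    event_id: str


def events_per_organism(encounters):
    """Map each organism_id to the set of distinct event_ids that targeted it."""
    events = defaultdict(set)
    for enc in encounters:
        events[enc.organism_id].add(enc.event_id)
    return dict(events)


def needs_separate_organism(encounters):
    """Return True if any organism is the target of more than one Event."""
    return any(len(ev) > 1 for ev in events_per_organism(encounters).values())


# Simplest case: every organism is collected or observed exactly once.
simple = [
    Encounter("ms-1", "org-1", "ev-1"),
    Encounter("ms-2", "org-2", "ev-1"),
]

# Repeated case: a re-identifiable organism is observed again in a later event.
repeated = simple + [Encounter("obs-3", "org-1", "ev-2")]

print(needs_separate_organism(simple))    # False
print(needs_separate_organism(repeated))  # True
```

In the first data set each organism maps to exactly one piece of evidence, so a single identifier would suffice; in the second, `org-1` must be tracked separately from its evidence records.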
The BCO (
A question then emerges for original providers who practice only the simplest case: should the provider manufacture dwc:occurrenceID and dwc:organismID even if they aren’t used in the original database? If they are useful to someone outside the local context, should creating redundant identifiers be the responsibility of the provider or the aggregator?
Fig.
Alternative entity relationship diagrams representing the Darwin-SW concepts under discussion here. Primary keys (PK) and foreign keys (FK) are listed in each entity. A) "Occurrence" is represented as the association entity between Organism and Event, and produces a MaterialSample. B) The MaterialSample entity itself establishes or embodies the relationship between Organism and Event.
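As a rough illustration of alternative B, the relationships could be expressed as a relational schema in which the MaterialSample row carries both foreign keys. This is a sketch under assumptions, not the schema from the figure: the table and column names and the sample identifiers are hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3

SCHEMA = """
CREATE TABLE organism (
    organism_id TEXT PRIMARY KEY
);
CREATE TABLE event (
    event_id TEXT PRIMARY KEY
);
-- Alternative B: the MaterialSample row itself links Organism and Event,
-- so no separate Occurrence table is needed.
CREATE TABLE material_sample (
    material_sample_id TEXT PRIMARY KEY,
    organism_id TEXT NOT NULL REFERENCES organism(organism_id),
    event_id    TEXT NOT NULL REFERENCES event(event_id)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript(SCHEMA)

# One organism sampled in two separate events (hypothetical identifiers).
conn.execute("INSERT INTO organism VALUES ('org-1')")
conn.executemany("INSERT INTO event VALUES (?)", [("ev-1",), ("ev-2",)])
conn.executemany(
    "INSERT INTO material_sample VALUES (?, ?, ?)",
    [("ms-1", "org-1", "ev-1"), ("ms-2", "org-1", "ev-2")],
)

# Count the distinct events in which each organism was sampled.
rows = conn.execute(
    "SELECT organism_id, COUNT(DISTINCT event_id) "
    "FROM material_sample GROUP BY organism_id"
).fetchall()
print(rows)  # [('org-1', 2)]
```

Alternative A would add an occurrence table keyed by its own identifier and holding the organism and event foreign keys, with material_sample referencing that table instead.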
The Darwin Core Quick Reference Guide lists 25 properties of the Occurrence class. My contention is that all but a few would be more appropriately assigned to the MaterialSample or Observation. Note that the MaterialSample or Observation represents the Organism at the time of the Event, and can be viewed as the appropriate subject for properties that change over time, e.g., lifeStage and reproductiveCondition. Moreover, others have argued that even permanent features of an Organism are more correctly represented as having been directly assessed in the MaterialSample/Observation. This placement allows for contradictory assessments, but accommodating and resolving contradictions are real parts of scientific research. The alternative placements of properties not assigned to MaterialSample/Observation are:
Under the model represented (in part) by Fig.
Keywords: Darwin Core, occurrence, ontology, Semantic Web
Presenting author: Stanley Blum
Presented at: SPNHC-TDWG 2024