Biodiversity Information Science and Standards (BISS), ISSN 2535-0897, Pensoft Publishers
DOI: 10.3897/biss.4.59074
Conference Abstract
PD01 - Avenues into integration: communicating taxonomic intelligence from sender to recipient
The Automated Taxonomic Concept Reasoner
Atriya Sen (accounts@atriyasen.com), Nico Franz (https://orcid.org/0000-0001-7089-7018), Beckett W. Sterner (https://orcid.org/0000-0001-5219-7616), Nate Upham (https://orcid.org/0000-0001-5412-9342)
Arizona State University, Tempe, United States of America
Corresponding author: Atriya Sen (accounts@atriyasen.com).
Received: 28 September 2020. Published: 30 September 2020. Biodiversity Information Science and Standards 4: e59074.
© Atriya Sen, Nico Franz, Beckett W. Sterner, Nate Upham. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
We present a visual and interactive taxonomic Artificial Intelligence (AI) tool, the Automated Taxonomic Concept Reasoner (ATCR). Its graphical web interface is under development, and the tool will also become available via an Application Programming Interface (API). The tool employs automated reasoning (Beeson 2014) to align multiple taxonomies visually, in a web browser, using user- or expert-provided taxonomic articulations, i.e., Region Connection Calculus (RCC-5) relationships between taxonomic concepts, provided in a specific logical language (Fig. 1). It does this by representing the problem of taxonomic alignment under these constraints as a problem of logical inference, and it performs these inferences computationally using the Microsoft Z3 Satisfiability Modulo Theories (SMT) solver (de Moura and Bjørner 2008). This tool represents further development of utilities for the taxonomic concept approach, which fundamentally addresses the challenge of robust biodiversity data aggregation in light of the multiple conflicting sources (and source classifications) from which primary biodiversity data almost invariably originate. The approach has proven superior to aggregation based only on the syntax and semantics provided by the Darwin Core standard (Franz and Sterner 2018).
Fig. 1 provides an artificial example of such an alignment. Two taxonomies, A and B, are shown. There are five taxonomic concepts: A.One, A.Two, A.Three, B.One and B.Two. A.Two and A.Three are sub-concepts (children) of A.One, and B.Two is a sub-concept (child) of B.One. These parent-child relationships are represented by the direction of the grey arrows. The undirected mustard-coloured lines represent relationships, i.e., the articulations referred to in the previous paragraph. These may be of five kinds: congruent (==), includes (>), included in (<), overlaps (><), and disjoint. These five relationships are known in the AI literature as the Region Connection Calculus-5 (RCC-5) (Randell et al. 1992, Bennett 1994) and, taken exclusively and in conjunction with each other, have certain desirable properties with respect to the representation of spatial relationships. The provided relationship (i.e., the articulation) may also be an arbitrary disjunction of these five fundamental kinds, which allows some degree of logical uncertainty to be represented. Then, under the three assumptions that:
"sibling" concepts are disjoint in their instances,
all instances of a parent concept are instances of at least one of its child concepts, and
every concept has at least one instance,
the SMT-based automated reasoner is able to deduce the relationships represented by the undirected green lines. It is also able to deduce disjunctive relationships where these are logically implied.
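The kind of inference described above can be illustrated with a minimal sketch. It is not the ATCR encoding and does not use Z3: as a stand-in, it enumerates every assignment of non-empty instance sets over a small, bounded universe and collects the RCC-5 relations that survive the three assumptions and the input articulations. The function names (`rcc5`, `consistent`, `infer`), the relation labels, the bound `atoms=3`, and the single input articulation (A.One congruent with B.One) are all hypothetical choices made for illustration; the concept names follow the toy example in the text.

```python
from itertools import product

def rcc5(x: int, y: int) -> str:
    """Classify two non-empty instance sets (bitmasks) into one RCC-5 relation."""
    if x == y:
        return "congruent"
    if x & y == 0:
        return "disjoint"
    if x & y == y:
        return "includes"      # y is a proper subset of x
    if x & y == x:
        return "included_in"   # x is a proper subset of y
    return "overlaps"

# Hypothetical input modelled on the toy taxonomies A and B.
CONCEPTS = ["A.One", "A.Two", "A.Three", "B.One", "B.Two"]
PARENTS = {"A.One": ["A.Two", "A.Three"], "B.One": ["B.Two"]}
# Each articulation maps a concept pair to a disjunction (set) of allowed relations.
ARTICULATIONS = {("A.One", "B.One"): {"congruent"}}

def consistent(world, parents, articulations):
    """Check one candidate assignment against the assumptions and articulations."""
    for parent, children in parents.items():
        union = 0
        for child in children:
            if union & world[child]:
                return False   # sibling concepts must be disjoint
            union |= world[child]
        if union != world[parent]:
            return False       # children lie within, and jointly cover, the parent
    return all(rcc5(world[a], world[b]) in allowed
               for (a, b), allowed in articulations.items())

def infer(a, b, articulations=ARTICULATIONS, atoms=3):
    """Return every RCC-5 relation between a and b found in some consistent world.

    Masks start at 1, so every concept has at least one instance.
    An empty result means no consistent world exists within the bound.
    """
    found = set()
    for masks in product(range(1, 2 ** atoms), repeat=len(CONCEPTS)):
        world = dict(zip(CONCEPTS, masks))
        if consistent(world, PARENTS, articulations):
            found.add(rcc5(world[a], world[b]))
    return found

print(infer("A.Two", "B.Two"))
```

Under these assumed inputs, `infer("A.Two", "B.Two")` yields the single relation `included_in`: B.Two must cover B.One, which is congruent with A.One, and A.Two is a proper part of A.One because its sibling A.Three is non-empty. A result with several relations corresponds to an inferred disjunctive articulation, and an empty result signals an inconsistent input. The bounded enumeration is only practical for toy inputs, which is one reason a solver-based encoding is attractive.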
ATCR is related to Euler/X (Franz et al. 2015), an existing tool for the same kinds of taxonomic alignment problems, which was used, for example, to obtain an alignment of two influential primate classifications (Franz et al. 2016). It differs from Euler/X in that it employs a different logical encoding that enables more efficient and more informative computational reasoning, and also in that it provides a graphical web interface, which Euler/X does not.
Keywords: automated reasoning, artificial intelligence, taxonomic intelligence, computational systematics, bioinformatics, biodiversity informatics
Conference: TDWG 2020 annual conference (TDWG 2020, A Virtual Conference), 2020. TDWG 2020 will be a virtual conference divided into working sessions (Sep 21-25) followed by a second week dedicated to dissemination and sharing (Oct 19-23).
Presenting author
Atriya Sen
Presented at
TDWG 2020
References
Beeson, Michael J. (2014) Larry Wos, Ross Overbeek, Ewing Lusk, and Jim Boyle. Automated reasoning. Introduction and applications. Prentice-Hall, Inc., Englewood Cliffs, N.J., 1984, xiv + 482 pp. 51 (2): 464-465. https://doi.org/10.1017/s0022481200031340
Bennett, Brandon (1994) Spatial Reasoning with Propositional Logics. 51-62. https://doi.org/10.1016/b978-1-4832-1452-8.50102-0
de Moura, Leonardo; Bjørner, Nikolaj (2008) Z3: An Efficient SMT Solver. 337-340. https://doi.org/10.1007/978-3-540-78800-3_24
Franz, Nico M.; Chen, Mingmin; Yu, Shizhuo; Kianmajd, Parisa; Bowers, Shawn; Ludäscher, Bertram (2015) Reasoning over Taxonomic Change: Exploring Alignments for the Perelleschus Use Case. 10 (2). https://doi.org/10.1371/journal.pone.0118247
Franz, Nico M.; Pier, Naomi M.; Reeder, Deeann M.; Chen, Mingmin; Yu, Shizhuo; Kianmajd, Parisa; Bowers, Shawn; Ludäscher, Bertram (2016) Two Influential Primate Classifications Logically Aligned. 65 (4): 561-82. https://doi.org/10.1093/sysbio/syw023
Franz, Nico M.; Sterner, Beckett W. (2018) To increase trust, change the social design behind aggregated biodiversity data. 2018. https://doi.org/10.1093/database/bax100
Randell, D. A.; Cui, Z.; Cohn, A. G. (1992) A spatial logic based on regions and connection. In: Nebel, B.; Swartout, W.; Rich, C. (Eds). Morgan Kaufmann, Los Altos, 165-176.
Figure 1. An example of the Automated Taxonomic Concept Reasoner (ATCR) web interface and functionality. Shown are two input taxonomies (A, B) with three and two entailed concept regions, respectively. Each of these stands for a taxonomic concept as recognized and delimited by the respective source. The grey arrows symbolize given parent-child relationships within each input taxonomy. Green lines show user-specified input RCC-5 articulations. Mustard-coloured lines show logically contingent, reasoner-inferred articulations. The example is logically consistent; if it were not, then no mustard-coloured lines would be visualized.