Biodiversity Information Science and Standards :
Conference Abstract
Corresponding author: Yasin Bakış (yasinbakis@gmail.com)
Received: 16 Sep 2022 | Published: 16 Sep 2022
© 2022 Yasin Bakış, Bahadır Altıntaş, Xiaojun Wang, M. Maruf, Anuj Karpatne, Henry Bart
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Bakış Y, Altıntaş B, Wang X, Maruf M, Karpatne A, Bart H (2022) Extracting Landmark and Trait Information from Segmented Digital Specimen Images Generated by Artificial Neural Networks. Biodiversity Information Science and Standards 6: e94955. https://doi.org/10.3897/biss.6.94955
Over the last three years, we have been successfully developing Artificial Intelligence (AI) models that use neural networks to automatically classify fish species as part of the “Biology Guided Neural Network” (BGNN) project*.
In the current study, we have focused on extracting morphological characters that rely on anatomical features of fish, such as the location of the eye, body length, and area of the head. We developed a schematic workflow describing how we processed the data and extracted this information (Fig. 1).
Workflow for extracting landmark and trait information from segmented images and score calculations. (STDDEV stands for Standard Deviation).
Segmented images, metadata, and species lists were given as input to the workflow. During the cleaning and filtering subroutines, a subset of the data was created by filtering down to the desired segmented images with corresponding metadata. In the validation step, segmented images were checked by comparing the number of specimens in the original image to the separate bounding-boxed specimen images, noting violations in the segmentations, counts of segments, the relative positions of the segments with respect to one another, traces of batch effects, and the sizes and shapes of the segments. Based on these validation criteria, each segmented image was assigned a score from 1 to 5, similar to the ratings in the Adobe XMP Basic namespace.
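The validation step above can be sketched as a simple checklist that maps to a 1–5 rating. This is a hypothetical illustration only; the check names, their granularity, and the equal weighting are assumptions, not the project's actual scoring code.

```python
# Hypothetical sketch of the validation scoring described above.
# Check names and equal weighting are assumptions, not the project's code.
from dataclasses import dataclass


@dataclass
class SegmentationChecks:
    specimen_count_matches: bool   # original image vs. bounding-boxed specimens
    no_segment_violations: bool    # e.g., missing or overlapping segments
    relative_positions_ok: bool    # segments plausibly placed w.r.t. one another
    no_batch_effect: bool          # no systematic artifact traceable to a batch
    size_shape_plausible: bool     # segment size/shape within expected range


def quality_score(checks: SegmentationChecks) -> int:
    """Map validation checks to a 1-5 score (cf. Adobe XMP Basic ratings)."""
    passed = sum([checks.specimen_count_matches,
                  checks.no_segment_violations,
                  checks.relative_positions_ok,
                  checks.no_batch_effect,
                  checks.size_shape_plausible])
    return max(1, passed)  # zero passed checks still yields the minimum score 1
```

With equal weighting, an image passing all five checks receives a 5, while an image failing every check still receives the minimum score of 1.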
The landmarks and traits used in the study were taken from the current literature, while remaining mindful that some of the features may not be extracted successfully by computational means. Using the landmark list, landmarks were extracted by adapting descriptions from the literature to the segments, such as picking the leftmost point on the head as the tip of the snout and the top-left point on the pelvic fin as the base of the pelvic fin. These 2D coordinate vectors are then fine-tuned by adjusting their positions to lie on the outline of the fish, since most of the landmarks are located on the outline. Procrustes analysis* was then applied to the resulting landmark configurations.
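The extraction rules described above (extreme points on a segment, segment centroids, and snapping a point onto the body outline) can be sketched with binary segment masks. A minimal sketch, assuming masks are boolean NumPy arrays of shape (H, W) with `True` inside the segment; the function names are illustrative, not the project's API.

```python
# Minimal sketch of landmark extraction from binary segment masks.
# Masks are boolean arrays of shape (H, W); names are illustrative assumptions.
import numpy as np


def tip_of_snout(head_mask: np.ndarray) -> tuple[int, int]:
    """Leftmost pixel of the head segment, returned as (row, col)."""
    rows, cols = np.nonzero(head_mask)
    i = int(np.argmin(cols))
    return int(rows[i]), int(cols[i])


def eye_center(eye_mask: np.ndarray) -> tuple[float, float]:
    """Centroid of the eye segment (the eye landmark)."""
    rows, cols = np.nonzero(eye_mask)
    return float(rows.mean()), float(cols.mean())


def snap_to_outline(point: tuple[int, int], body_mask: np.ndarray) -> tuple[int, int]:
    """Fine-tune a landmark by moving it to the nearest body-outline pixel."""
    # Outline = foreground pixels with at least one background 4-neighbour.
    padded = np.pad(body_mask, 1)
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                padded[1:-1, :-2] & padded[1:-1, 2:])
    outline = body_mask & ~interior
    rows, cols = np.nonzero(outline)
    d2 = (rows - point[0]) ** 2 + (cols - point[1]) ** 2
    i = int(np.argmin(d2))
    return int(rows[i]), int(cols[i])
```

The same pattern extends to the other literature-derived definitions (e.g., the top-left point of the pelvic-fin mask), each reduced to an extreme point, centroid, or outline-constrained coordinate of a segment mask.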
Our work on extracting features from segmented digital specimen images has shown that the accuracy of traits such as measurements, areas, and angles depends on the accuracy of the landmarks, which in turn is highly dependent on the segmentation of the parts of the specimen. Landmarks located on the outline of the body (the combination of the head and trunk segments of the fish) were found to be more accurate compared to landmarks that represent inner features, such as the mouth and pectoral fin, in some taxonomic groups. However, the eye location is almost always accurate, since it is based on the centroid of the eye segment. In the remainder of this study, we will improve the score calculation for segments, images, landmarks, and traits, and assess the accuracy of the scores by comparing the statistical results obtained from analyses of the landmark and trait data.
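Once landmarks are fixed, the traits mentioned above (measurements, areas, and angles) reduce to elementary plane geometry, which is why landmark accuracy propagates directly into trait accuracy. A brief sketch of the three trait types; the specific landmark pairings are assumptions for illustration.

```python
# Illustrative trait computations from 2D landmarks (row, col) or (x, y).
# Which landmarks feed each trait is an assumption for this example.
import math


def distance(a: tuple[float, float], b: tuple[float, float]) -> float:
    """Linear measurement between two landmarks, e.g. a body-length proxy."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def angle_at(vertex, p, q) -> float:
    """Angle in degrees at `vertex` formed by landmarks p and q."""
    a1 = math.atan2(p[0] - vertex[0], p[1] - vertex[1])
    a2 = math.atan2(q[0] - vertex[0], q[1] - vertex[1])
    d = abs(a1 - a2) % (2 * math.pi)
    return math.degrees(min(d, 2 * math.pi - d))


def polygon_area(points: list[tuple[float, float]]) -> float:
    """Shoelace area of a segment outline, e.g. area of the head."""
    s = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0
```

Because each trait is a deterministic function of landmark coordinates, any positional error in a landmark flows through these formulas unchanged, consistent with the dependence reported above.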
Keywords: machine learning, morphometrics, fish, digitized biocollections
Presenting author: Yasin Bakış
Presented at: TDWG 2022
The BGNN project is funded by the United States National Science Foundation under award number OAC1940322, and the Imageomics project is funded by the United States National Science Foundation under award number OAC2118240.
The original data used in this study were provided by the Great Lakes Invasives Network Project and are protected under the CC BY-NC 3.0 license.
HB led the data harvesting and trait extraction parts of the project; AK led the segmentation part; YB and XW harvested the data, built the cyberinfrastructure, and created the metadata; MM provided the segmented images; YB and BA created the trait extraction workflow and extracted the traits.