Workflow for extracting traits from figures in the literature. Figures are extracted from PDFs using pdf2figures, which yields the figure images and XML files containing their captions. We then extract trait terms and species names for the Bryozoa ontology, which in turn feeds into Phenoscape to build trait presence-absence matrices. The extracted images are passed to the machine-learning programmes DeepBryo and ML-morph, which annotate them automatically while retaining the metadata from the figure caption.
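
The text-mining half of this workflow can be illustrated with a minimal Python sketch. It assumes that trait terms and species names are matched as whole words in caption text, and that a trait counts as present for a species when both appear in the same caption. The vocabularies, function names and example captions are hypothetical placeholders, not the ontology, tools or data used at the meeting, and a 0 here only means "not mentioned", not a confirmed absence.

```python
import re
from collections import defaultdict

# Hypothetical vocabularies for illustration only; a real run would take
# trait terms from an ontology and species names from a taxonomic list.
TRAIT_TERMS = ["avicularium", "ovicell", "spine"]
SPECIES_NAMES = ["Bugula neritina", "Membranipora membranacea"]


def find_terms(caption, vocabulary):
    """Return vocabulary entries found in the caption (case-insensitive, whole words)."""
    found = []
    for term in vocabulary:
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, caption, flags=re.IGNORECASE):
            found.append(term)
    return found


def build_presence_matrix(captions):
    """Map each species to {trait: 0 or 1} from co-occurrence within a caption."""
    matrix = defaultdict(lambda: {trait: 0 for trait in TRAIT_TERMS})
    for caption in captions:
        traits = find_terms(caption, TRAIT_TERMS)
        for species in find_terms(caption, SPECIES_NAMES):
            row = matrix[species]  # creates an all-zero row on first sight
            for trait in traits:
                row[trait] = 1
    return dict(matrix)


if __name__ == "__main__":
    # Made-up captions standing in for text parsed from the caption XML files.
    example_captions = [
        "Fig. 2. Bugula neritina, colony showing ovicell.",
        "Fig. 3. Membranipora membranacea with spine at zooid corner.",
    ]
    for species, row in build_presence_matrix(example_captions).items():
        print(species, row)
```

Exact whole-word matching misses plurals and synonyms (for example "ovicells"), so a fuller version would need lemmatisation or ontology synonym lists before the matrix is passed on.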

 
  Part of: Girón Duque JC, Balk M, Dahdul W, Lapp H, Mikó I, Alhajjar E, Wynd B, Tarasov S, Lawrence C, Khakurel B, Porto A, Yan L, E Fluck I, Porto D, Keating J, Borokini I, Seltmann K, Montanaro G, Mabee P (2024) Meeting Report for the Phenoscape TraitFest 2023 with Comments on Organising Interdisciplinary Meetings. Biodiversity Information Science and Standards 8: e115232. https://doi.org/10.3897/biss.8.115232