All the Clades in the World: Building a Semantically-Rich and Testable Ontology of Phylogenetic Clade Definitions

Author:

Vaidya Gaurav, Zhang Guanyang, Lapp Hilmar, Cellinese Nico

Abstract

Taxonomic names are ambiguous as identifiers of biodiversity data because they refer to a particular concept of a taxon in an expert’s mind (Kennedy et al. 2005). This ambiguity is particularly problematic when attempting to reconcile taxonomic names from disparate sources with clades on a phylogeny. Currently, such reconciliation requires expert interpretation, which is necessarily subjective, difficult to reproduce, and difficult to scale. In contrast, phylogenetic clade definitions are a well-developed method for unambiguously defining the semantics of a clade concept in terms of shared evolutionary ancestry (Queiroz and Gauthier 1990, Queiroz and Gauthier 1994), and these semantics allow locating clades on any phylogeny. Although a few software tools have been created for resolving clade definitions, including for definitions expressed in the Mathematical Markup Language (e.g. Names on Nodes in Keesey 2007) and as lists of GenBank accession numbers (e.g. mor in Hibbett et al. 2005), these are application-specific representations that do not provide formal definitions with well-defined semantics for every component of a clade definition. Being able to create such machine-interpretable definitions would allow computers to store, compare, distribute and resolve semantically-rich clade definitions. To this end, the Phyloreferencing project (http://phyloref.org, Cellinese and Lapp 2015) is working on a specification for encoding phylogenetic clade definitions as ontologies using the Web Ontology Language (OWL in W3C OWL Working Group 2012). Our specification allows the semantics of these definitions, which we call phyloreferences, to be described in terms of shared ancestor and excluded lineage properties. The aim of this effort is to allow any OWL-DL reasoner to resolve phyloreferences on a phylogeny that has itself been translated into a compatible OWL representation.
We have developed a workflow that allows us to curate phyloreferences from phylogenetic clade definitions published in natural language, and to resolve the curated phyloreference against the phylogeny upon which the definition was originally created, allowing us to validate that the phyloreference reflects the authors’ original intent. We have started work on curating dozens of phyloreferences from publications and the clade definition database RegNum (http://phyloregnum.org), which will provide an online catalog of all clade definitions that are part of the Phylonym Volume, to be published together with the PhyloCode (https://www.ohio.edu/phylocode/). We will comprehensively curate these definitions into a reusable and fully computable ontology of phyloreferences. In our presentation, we will provide an overview of phyloreferencing and will describe the model and workflow we use to encode clade definitions in OWL, based on concepts and terms taken from the Comparative Data Analysis Ontology (Prosdocimi et al. 2009), Darwin-SW (Baskauf and Webb 2016) and Darwin Core (Wieczorek et al. 2012). We will demonstrate how phyloreferences can be visualized, resolved and tested on the phylogeny that they were originally described on, and how they resolve on one of the largest synthetic phylogenies available, the Open Tree of Life (Hinchliff et al. 2015). We will conclude with a discussion of the problems we faced in referring to taxonomic units in phylogenies, which is one of the key challenges in enabling better integration of phylogenetic information into biodiversity analyses.
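
The idea of resolving a clade definition through shared ancestry and excluded lineages can be illustrated outside OWL. The following is a minimal Python sketch, not the project's OWL model or software: it assumes a toy phylogeny stored as a child-to-parent map, and every name in it (`PARENT`, `ancestors`, `resolve_minimum_clade`, the taxon labels) is hypothetical. It finds the most recent common ancestor of the included specifiers and rejects the result if an excluded specifier falls inside that clade.

```python
from typing import Dict, Iterable, List, Optional

# Toy phylogeny as a child -> parent map; the root has parent None.
# All node and taxon names are hypothetical, for illustration only.
PARENT: Dict[str, Optional[str]] = {
    "root": None,
    "n1": "root",
    "Outgroup": "root",
    "n2": "n1",
    "C": "n1",
    "A": "n2",
    "B": "n2",
}


def ancestors(node: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the path from `node` up to the root, including `node` itself."""
    path: List[str] = []
    current: Optional[str] = node
    while current is not None:
        path.append(current)
        current = parent[current]
    return path


def resolve_minimum_clade(
    parent: Dict[str, Optional[str]],
    internal: Iterable[str],
    external: Iterable[str] = (),
) -> Optional[str]:
    """Resolve a definition of the form 'the smallest clade containing all
    internal specifiers', returning None if any external (excluded)
    specifier also falls inside that clade."""
    internal = list(internal)
    if not internal:
        raise ValueError("at least one internal specifier is required")

    # Shared ancestry: walk up from the first specifier and stop at the
    # first node that is also an ancestor of every other specifier.
    other_paths = [set(ancestors(s, parent)) for s in internal[1:]]
    mrca = next(
        n for n in ancestors(internal[0], parent)
        if all(n in p for p in other_paths)
    )

    # Excluded lineages: the definition fails to resolve on this tree
    # if an excluded specifier descends from the candidate node.
    for specifier in external:
        if mrca in ancestors(specifier, parent):
            return None
    return mrca


if __name__ == "__main__":
    print(resolve_minimum_clade(PARENT, ["A", "B"]))
    print(resolve_minimum_clade(PARENT, ["A", "C"], ["Outgroup"]))
    print(resolve_minimum_clade(PARENT, ["A", "Outgroup"], ["C"]))
```

In this sketch a definition that cannot be satisfied on the given tree resolves to `None` rather than to a wrong node, which mirrors the testing step described above: a curated definition is checked against the phylogeny it was originally written for.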

Funder

National Science Foundation

Publisher

Pensoft Publishers
