Abstract
Fish are among the most widely distributed organisms in the world. Fish taxonomy is an important part of biodiversity research and is also the basis of fishery resource management. However, morphological characters are often too subtle to identify reliably, and intact specimens are sometimes unavailable, which makes morphological methods laborious and time-consuming to study and apply. DNA barcoding based on a fragment of the cytochrome c oxidase subunit I (COI) gene is a valuable molecular tool for species identification and biodiversity studies. In this paper, a novel deep learning classification approach based on DNA barcodes is proposed. The approach, named the ESK-model, fuses an Elastic Net-Stacked Autoencoder (EN-SAE) with Kernel Density Estimation (KDE). In stage one, the ESK-model preprocesses the original data from COI fragments. In stage two, the EN-SAE is used to learn deep features and obtain an outgroup score for each fish. In stage three, KDE is used to select a threshold based on the outgroup scores and to classify fish from different families. The effectiveness and superiority of the ESK-model have been validated by experiments on three dominant fish families and by comparisons with state-of-the-art methods. These findings confirm that the ESK-model can accurately classify fish from different families based on DNA barcodes.
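As a rough illustration of stage three only, the following sketch shows one way a KDE could turn outgroup scores into a decision threshold. The abstract does not state the selection rule, so everything here is an assumption: the rule (take the score below which 95% of the estimated in-group density mass lies), the Gaussian kernel, Silverman's bandwidth, the function names `kde_threshold` and `classify`, and the score values themselves are all hypothetical and are not taken from the paper.

```python
import math
import statistics


def silverman_bandwidth(samples):
    # Silverman's rule of thumb for a Gaussian kernel: 1.06 * sigma * n^(-1/5).
    return 1.06 * statistics.stdev(samples) * len(samples) ** (-0.2)


def kde_density(samples, bandwidth):
    # Build a Gaussian kernel density estimate and return it as a function of x.
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def kde_threshold(ingroup_scores, mass=0.95, grid_size=500):
    # ASSUMED rule (not from the paper): return the score at which the
    # cumulative KDE mass of the in-group scores first reaches `mass`.
    h = silverman_bandwidth(ingroup_scores)
    density = kde_density(ingroup_scores, h)
    lo = min(ingroup_scores) - 3 * h
    hi = max(ingroup_scores) + 3 * h
    step = (hi - lo) / grid_size
    cumulative = 0.0
    for i in range(grid_size):
        x = lo + (i + 0.5) * step  # midpoint rule for the integral
        cumulative += density(x) * step
        if cumulative >= mass:
            return x
    return hi


def classify(score, threshold):
    # Scores above the threshold are treated as belonging to another family.
    return "outgroup" if score > threshold else "ingroup"


# Hypothetical outgroup scores for fish known to belong to the target family.
ingroup_scores = [0.10, 0.12, 0.09, 0.11, 0.13, 0.10, 0.08, 0.12, 0.11, 0.09]
threshold = kde_threshold(ingroup_scores)
```

With these made-up scores, a new specimen with a much higher score, for example `classify(0.5, threshold)`, would be labeled as an outgroup member, while a typical in-group score such as `0.10` would not. A different quantile or a different rule (for instance, a density minimum between two modes) would change the threshold.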
Publisher
Cold Spring Harbor Laboratory