Affiliation:
1. Key Laboratory of Animal Biodiversity Conservation and Integrated Pest Management (Chinese Academy of Sciences), Institute of Zoology, Chinese Academy of Sciences, Beijing, China
2. Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
3. University of Chinese Academy of Sciences, Beijing, China
4. Cangzhou Normal University, Cangzhou, Hebei Province, China
5. Northeast Asia Biodiversity Research Centre, Northeast Forestry University, Harbin, China
Abstract
Digitized natural history collections are vital resources for ecological and evolutionary research. Retrieving specimens by their morphological features makes it possible to find similar specimens in these collections quickly, which helps make full use of the collections and meets the needs of related research. Achieving this, however, requires effective techniques for feature extraction and representation.
We developed the phenotype encoding network (PENet), a deep learning-based model that uses hashing to automatically extract discriminative features and encode them as hash codes.
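The idea of retrieving images through hash codes can be illustrated with a minimal sketch. It assumes that real-valued feature vectors are already available (here toy values, not PENet outputs), binarizes them by sign thresholding, and ranks database entries by Hamming distance. The function names `to_hash_codes` and `hamming_rank` are hypothetical, and the thresholding rule is an assumption rather than the encoding scheme used by PENet.

```python
import numpy as np


def to_hash_codes(features):
    """Binarize real-valued features into 0/1 codes (assumed rule: value > 0)."""
    return (np.asarray(features) > 0).astype(np.uint8)


def hamming_rank(query_code, database_codes):
    """Rank database entries by Hamming distance to the query code."""
    # Count the bit positions where each database code differs from the query.
    distances = np.count_nonzero(database_codes != query_code, axis=1)
    # A stable sort keeps the original order among equally distant entries.
    order = np.argsort(distances, kind="stable")
    return order, distances[order]


# Toy feature vectors standing in for network outputs (illustrative only).
features = np.array([
    [0.9, -0.2, 0.4, -0.7],
    [0.8, -0.1, 0.5, -0.6],
    [-0.5, 0.3, -0.9, 0.2],
])
codes = to_hash_codes(features)

query = to_hash_codes([0.7, -0.3, 0.1, 0.6])
order, dists = hamming_rank(query, codes)
print(order, dists)
```

Because the codes are short binary vectors, comparing a query against a large collection reduces to counting differing bits, which is what makes hash-based retrieval efficient.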
We evaluated PENet on six data sets, including a newly constructed beetle data set (6566 images) that covers over 60% of the genera within the six subfamilies of Scarabaeidae. PENet showed high performance in feature extraction and image retrieval, allowing users to input an image of a specimen and efficiently retrieve specimens with similar morphology. Two visualization methods, t-SNE and Grad-CAM, were used to evaluate the representation ability of the hash codes. Additionally, a phenetic distance tree was constructed for the beetle data set from the hash codes generated by PENet. The results indicated that the hash codes reveal, to a certain extent, the phenetic distances and relationships among categories.
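How a distance tree might be derived from hash codes can be sketched as follows. The abstract does not state how distances were computed or which tree-building method was used, so this example makes two explicit assumptions: category distance is taken as the mean per-bit difference between category centroids, and the tree is built by average-linkage clustering (UPGMA) with SciPy. The function name `category_distance_matrix` and the toy codes and labels are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import squareform


def category_distance_matrix(codes, labels):
    """Distance between categories from per-image binary codes.

    Assumption: each category is summarized by its centroid (the per-bit
    frequency of 1s), and distance is the mean absolute per-bit difference.
    """
    codes = np.asarray(codes, dtype=float)
    labels = np.asarray(labels)
    categories = sorted(set(labels.tolist()))
    centroids = np.stack([codes[labels == c].mean(axis=0) for c in categories])
    diff = np.abs(centroids[:, None, :] - centroids[None, :, :])
    return categories, diff.mean(axis=2)


# Toy hash codes for three hypothetical categories (illustrative only).
codes = [
    [1, 0, 1, 0], [1, 0, 1, 1],   # category "A"
    [1, 0, 0, 0], [1, 0, 0, 1],   # category "B"
    [0, 1, 0, 1], [0, 1, 0, 1],   # category "C"
]
labels = ["A", "A", "B", "B", "C", "C"]

categories, dist = category_distance_matrix(codes, labels)

# Average linkage (UPGMA) on the condensed distance matrix; each row of Z
# records one merge: the two cluster indices, their distance, and the size.
Z = linkage(squareform(dist), method="average")
print(categories)
print(Z)
```

In this toy case the two categories with the most similar codes merge first, and the third joins at a larger distance, which is the kind of structure a phenetic tree is meant to expose.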
PENet provides an automatic and efficient method for extracting and representing discriminative morphological features. The generated hash codes serve as a low-dimensional carrier of these features, enabling efficient specimen retrieval. Moreover, the distance information carried by the hash codes suggests potential applications in systematics that deserve further exploration.
Funder
China Postdoctoral Science Foundation
National Natural Science Foundation of China
Subject
Ecological Modeling; Ecology, Evolution, Behavior and Systematics