Affiliation:
1. College of Information Technology, Shanghai Ocean University, Shanghai 201306, China
2. College of Information Technology, Shanghai Jian Qiao University, Shanghai 201306, China
Abstract
Chinese marine domain texts are non-standardized and have limited annotation resources, and coral reef ecosystem-related texts in particular contain complex entities such as long and nested entities. Existing Named Entity Recognition (NER) methods often fail to capture the deep semantic features of such texts, leading to inefficiencies and inaccuracies. This study introduces a deep learning model that integrates Bidirectional Encoder Representations from Transformers (BERT), Bidirectional Gated Recurrent Units (BiGRU), and Conditional Random Fields (CRF), enhanced by an attention mechanism, to improve the recognition of complex entity structures. The model uses BERT to produce context-dependent character vectors, BiGRU to extract global semantic features, an attention mechanism to focus on key information, and a CRF layer to produce the optimal label sequence. We constructed a specialized coral reef ecosystem corpus and evaluated the model on it through a series of experiments. The model achieved an F1 score of 86.54%, significantly outperforming existing methods. The contributions of this research are threefold: (1) We designed an efficient named entity recognition framework for marine domain texts that improves the recognition of long and nested entities. (2) By introducing the attention mechanism, we enhanced the model’s ability to recognize complex entity structures in coral reef ecosystem texts. (3) This work offers new tools and perspectives for constructing and studying marine domain knowledge graphs, laying a foundation for future research. These advancements further the development of marine domain text analysis technology and provide a useful reference for related research fields.
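The final step of the pipeline, in which a CRF turns per-character scores into a label sequence, can be illustrated with a minimal Viterbi decoding sketch. This is not the authors' implementation: the BIO tag set, the emission scores (standing in for the output of the BERT, BiGRU, and attention layers), and the single penalized transition are all assumptions chosen for illustration, and the function name `viterbi_decode` is hypothetical.

```python
from typing import Dict, List, Tuple


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    labels: List[str],
) -> Tuple[List[str], float]:
    """Return the highest-scoring label path and its score.

    emissions[i][y] is the score of label y at character i.
    transitions[(p, y)] is the score of moving from label p to label y;
    pairs that are not listed default to 0.0.
    """
    # best[i][y]: best score of any path that ends in label y at position i.
    best = [{y: emissions[0][y] for y in labels}]
    back = []  # back[i][y]: best previous label for y at position i + 1
    for emit in emissions[1:]:
        scores, pointers = {}, {}
        for y in labels:
            prev = max(
                labels,
                key=lambda p: best[-1][p] + transitions.get((p, y), 0.0),
            )
            scores[y] = best[-1][prev] + transitions.get((prev, y), 0.0) + emit[y]
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)

    # Follow the back-pointers from the best final label.
    last = max(labels, key=lambda y: best[-1][y])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[-1][last]


# Assumed BIO tag set and made-up scores for a three-character input.
labels = ["O", "B", "I"]
emissions = [
    {"O": 1.0, "B": 0.8, "I": 0.1},
    {"O": 0.2, "B": 0.1, "I": 1.0},
    {"O": 0.9, "B": 0.1, "I": 0.3},
]
# Assumed constraint: an entity cannot continue (I) directly after O.
transitions = {("O", "I"): -1e4}

path, score = viterbi_decode(emissions, transitions, labels)
print(path)  # ['B', 'I', 'O']
```

Picking the top-scoring label for each character independently would give O, I, O here, which is not a valid BIO sequence. Decoding over the whole sequence with transition scores yields B, I, O instead, which is the kind of label-consistency benefit a CRF output layer is meant to provide.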
Funder
National Natural Science Foundation of China, Youth Science Foundation Project
Shanghai Science and Technology Commission, Local University Capacity Building Project