Fd-CasBGRel: A Joint Entity–Relationship Extraction Model for Aquatic Disease Domains

Author:

Ye Hongbao (1,2,3), Lv Lijian (1,2), Zhou Chengquan (2,3), Sun Dawei (2,3)

Affiliation:

1. College of Mathematics and Computer Science, Zhejiang A&F University, 666 Wusu Street, Hangzhou 311300, China

2. Agricultural Equipment Research Institute, Zhejiang Academy of Agricultural Sciences, 298 Desheng Middle Road, Hangzhou 310021, China

3. Key Laboratory of Agricultural Equipment in Southeast Hilly and Mountainous Areas of the Ministry of Agriculture and Rural Affairs (Ministry-Province Joint Construction), 298 Desheng Middle Road, Hangzhou 310021, China

Abstract

Entity–relationship extraction plays a pivotal role in the construction of domain knowledge graphs. In the aquatic disease domain, however, it is a formidable task: overlapping relationships, highly specialized data, limited feature fusion, and imbalanced data samples all significantly weaken extraction performance. To tackle these challenges, this study compiles a text corpus from published books and aquatic disease websites, builds datasets from it, and proposes the Fd-CasBGRel model, tailored specifically to the aquatic disease domain. The model uses the CasRel cascading binary tagging framework to address relationship overlap; applies task fine-tuning for better performance on aquatic disease data; trains on specialized aquatic disease corpora to improve adaptability; and integrates the BRC feature fusion module (which combines self-attention, BiLSTM, relative position encoding, and conditional layer normalization) to exploit entity position and context for enhanced fusion. It also replaces the traditional cross-entropy loss function with the GHM loss function to mitigate category imbalance. In the experiments, Fd-CasBGRel reached an F1 score of 84.71% on the aquatic disease dataset, significantly outperforming several benchmark models. The model thus addresses the low triple-extraction performance caused by high data specialization, insufficient feature integration, and data imbalance. On the overlapping-relationship subset it achieved its highest F1 score, 86.52%, demonstrating a robust capability for extracting overlapping relations. Comparative experiments on the publicly available WebNLG dataset, where the model obtained the best performance metrics among the compared models, indicate that it also generalizes well.
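The abstract does not give the GHM loss formula, so the following is only a minimal sketch of how a GHM-C-style loss might reweight binary tagging decisions, not the authors' implementation. The function name `ghm_c_loss`, the default of 10 bins, and the normalization by the number of non-empty bins are all assumptions made for illustration. The idea is that each example is weighted inversely to the number of examples sharing its gradient-norm bin, so the many easy negatives typical of cascade binary tagging do not dominate the loss.

```python
import math


def ghm_c_loss(probs, labels, bins=10):
    """GHM-C-style loss for binary tags (illustrative pure-Python sketch).

    probs  : predicted probabilities in (0, 1), one per tag decision
    labels : gold labels, each 0 or 1
    bins   : number of equal-width gradient-norm bins (assumed value)
    """
    n = len(probs)
    if n == 0:
        return 0.0
    # For sigmoid cross-entropy, the gradient norm w.r.t. the logit is |p - y|.
    grad_norms = [abs(p - y) for p, y in zip(probs, labels)]
    # Assign each example to a bin; a norm of exactly 1.0 goes in the last bin.
    bin_ids = [min(int(g * bins), bins - 1) for g in grad_norms]
    counts = [0] * bins
    for b in bin_ids:
        counts[b] += 1
    non_empty = sum(1 for c in counts if c > 0)
    eps = 1e-12  # keeps log() away from 0
    total = 0.0
    for p, y, b in zip(probs, labels, bin_ids):
        p = min(max(p, eps), 1.0 - eps)
        ce = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        # Crowded bins (many similar examples) get a smaller per-example weight.
        weight = n / (counts[b] * non_empty)
        total += weight * ce
    return total / n
```

With this weighting, the loss reduces to the average over non-empty bins of each bin's mean cross-entropy. When all examples fall into one bin it equals plain mean cross-entropy; when a few hard examples sit apart from many easy ones, the hard examples carry more of the total than they would under unweighted cross-entropy.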

Funder

Key R&D Program of Zhejiang

Agricultural Technology Cooperation Program in Zhejiang Province of China

Publisher

MDPI AG

