Author:
Yu Helong,Wang Chenxi,Xue Mingxuan
Abstract
IntroductionThe diversity of edible fungus species and the extent of mycological knowledge pose significant challenges to the research, cultivation, and popularization of edible fungus. To tackle this challenge, there is an urgent need for a rapid and accurate method of acquiring relevant information. The emergence of question and answer (Q&A) systems has the potential to solve this problem. Named entity recognition (NER) provides the basis for building an intelligent Q&A system for edible fungus. In the field of edible fungus, there is a lack of a publicly available Chinese corpus suitable for use in NER, and conventional methods struggle to capture long-distance dependencies in the NER process.MethodsThis paper describes the establishment of a Chinese corpus in the field of edible fungus and introduces an NER method for edible fungus information based on XLNet and conditional random fields (CRFs). Our approach combines an iterated dilated convolutional neural network (IDCNN) with a CRF. First, leveraging the XLNet model as the foundation, an IDCNN layer is introduced. This layer addresses the limited capacity to capture features across utterances by extending the receptive field of the convolutional kernel. The output of the IDCNN layer is input to the CRF layer, which mitigates any labeling logic errors, resulting in the globally optimal labels for the NER task relating to edible fungus.ResultsExperimental results show that the precision achieved by the proposed model reaches 0.971, with a recall of 0.986 and an F1-score of 0.979.DiscussionThe proposed model outperforms existing approaches in terms of these evaluation metrics, effectively recognizing entities related to edible fungus information and offering methodological support for the construction of knowledge graphs.