Intelligent Information extraction algorithm of Agricultural text based on Machine Learning method

Author:

Wang Zihao, Cheng Zilong, Guan Chenzheng, Han Haoxiang

Abstract

Internet question-and-answer platforms for agricultural technology currently rely on manual answers alone, so responses are slow and answer quality is hard to guarantee. To enable intelligent question answering for agricultural technology and to build an agricultural technology knowledge base, "crop-pest-pesticide" named-entity triples must be extracted from existing question-and-answer data. Research on Chinese named entity recognition in agriculture is scarce, and its accuracy is low. Based on the characteristics of crop, disease-and-pest, and pesticide named entities and on agricultural technology question-and-answer data, a method based on the conditional random field (CRF) is proposed for recognizing these three kinds of named entities. The data set is formatted and segmented into words automatically, and each segmented token is then tagged automatically with five features: whether it contains a specific defining word, whether it contains a specific radical, whether it is a quantifier, whether it is a specific left or right defining word, and its part of speech. A CRF model trained on the tagged data classifies each token, judging whether it belongs to a crop, pest, or pesticide named entity and identifying its position within a compound named entity. This achieves recognition of the three kinds of named entities and automatic construction of the associated triples. Experiments on feature combinations and context window size improved the recognition accuracy of the method and reduced model training time. The recognition accuracy for crop, pest, and pesticide named entities is 97.72%, 87.63%, and 98.05%, respectively, which is significantly higher than that of existing methods.
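The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the word lists, the tiny radical table, the BIO label names (`B-CROP`, `I-PEST`, and so on), and the helper functions `token_features`, `decode_entities`, and `build_triple` are all hypothetical, chosen only to show how per-token features with a context window and position-in-entity labels might fit together. The CRF training step itself is omitted; the labels in the example are written by hand where a trained model's predictions would go.

```python
from typing import Dict, List, Optional, Tuple

# All lexicons below are hypothetical placeholders, not the paper's word lists.
DEFINING_CHARS = {"病", "虫", "剂", "清"}   # assumed "defining" characters
RADICAL_OF = {"蚜": "虫", "螨": "虫", "瘟": "疒", "菜": "艹"}  # toy char -> radical table
TARGET_RADICALS = {"虫", "疒", "艹"}
QUANTIFIERS = {"克", "毫升", "亩", "倍"}
LEFT_CUES = {"防治", "用"}                  # assumed left defining words
RIGHT_CUES = {"防治", "怎么办"}             # assumed right defining words


def token_features(tokens: List[str], pos: List[str], i: int,
                   window: int = 1) -> Dict[str, object]:
    """Build the feature dict for token i, with words and POS tags in the window."""
    tok = tokens[i]
    feats: Dict[str, object] = {
        "has_defining_char": any(ch in DEFINING_CHARS for ch in tok),
        "has_target_radical": any(RADICAL_OF.get(ch) in TARGET_RADICALS for ch in tok),
        "is_quantifier": tok in QUANTIFIERS,
        "left_cue": i > 0 and tokens[i - 1] in LEFT_CUES,
        "right_cue": i + 1 < len(tokens) and tokens[i + 1] in RIGHT_CUES,
    }
    for off in range(-window, window + 1):
        j = i + off
        inside = 0 <= j < len(tokens)
        feats[f"word[{off}]"] = tokens[j] if inside else "<PAD>"
        feats[f"pos[{off}]"] = pos[j] if inside else "<PAD>"
    return feats


def decode_entities(tokens: List[str], labels: List[str]) -> List[Tuple[str, str]]:
    """Merge BIO-labelled tokens into (entity_type, text) pairs."""
    entities: List[Tuple[str, str]] = []
    current: List[str] = []
    kind: Optional[str] = None
    for tok, lab in zip(tokens, labels):
        if lab.startswith("B-"):
            if current:
                entities.append((kind, "".join(current)))
            current, kind = [tok], lab[2:]
        elif lab.startswith("I-") and kind == lab[2:]:
            current.append(tok)          # continue a compound entity
        else:
            if current:
                entities.append((kind, "".join(current)))
            current, kind = [], None
    if current:
        entities.append((kind, "".join(current)))
    return entities


def build_triple(entities: List[Tuple[str, str]]) -> Optional[Tuple[str, str, str]]:
    """Pair the first crop, pest, and pesticide found; None if any is missing."""
    first: Dict[str, str] = {}
    for ent_kind, text in entities:
        first.setdefault(ent_kind, text)
    if all(k in first for k in ("CROP", "PEST", "PESTICIDE")):
        return first["CROP"], first["PEST"], first["PESTICIDE"]
    return None


# Hand-labelled example; a trained CRF would predict `labels` from the features.
tokens = ["黄瓜", "霜霉", "病", "用", "百菌清"]
pos = ["n", "n", "n", "p", "n"]
labels = ["B-CROP", "B-PEST", "I-PEST", "O", "B-PESTICIDE"]

features = [token_features(tokens, pos, i, window=1) for i in range(len(tokens))]
triple = build_triple(decode_entities(tokens, labels))
```

Widening `window` adds more neighbouring words and part-of-speech tags to each token's feature dict, which is the kind of trade-off between accuracy and training time that the abstract reports tuning experimentally. Taking the first entity of each type is a simplification made here for brevity; the abstract does not say how entities are associated into a triple.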

Publisher

IOP Publishing

Subject

General Physics and Astronomy


Cited by 1 article.

1. Design of Intelligent Recognition English Translation System Based on Machine Learning Algorithm;2022 International Conference on Education, Network and Information Technology (ICENIT);2022-09
