Automatic Recognition and Extraction of English Verb Types Based on Index Line Clustering

Author:

Zhao Hui (1, 2), Jin Kexin (3), Wang Jing (3)

Affiliation:

1. Office of Development and Planning, Chengdu Vocational & Technical College of Industry, Chengdu 610000, Sichuan, China

2. School of Rail Transit, Chengdu Vocational & Technical College of Industry, Chengdu 610000, Sichuan, China

3. Office of Educational Administration, Chengdu Vocational & Technical College of Industry, Chengdu 610000, Sichuan, China

Abstract

Languages are not uniform: speakers of different languages use certain words more or less often, or with distinct meanings. In both linguistics and natural language processing (NLP), classifications that group together verbs sharing similar syntactic and semantic features are of great interest. NLP technology is developing rapidly, yet index lines are still interpreted manually. This manual approach is slow. In the era of big data, corpora have grown rapidly, and a corpus of hundreds of millions of words is now common. The quantity of text generated every day keeps increasing, and the word index for a search word can run to tens of thousands of lines, so analyzing index lines manually is very difficult. Automatic lexical knowledge acquisition is essential for a variety of NLP tasks. Knowledge about verbs is particularly critical, because verbs are the major source of relational information in a sentence. This study therefore attempts to identify and extract English verbs automatically by index line clustering. Each index line can be regarded as a microtext, and clustering these microtexts automatically realizes the automatic identification and extraction of English verb forms. The study first covers the clustering algorithms, including the C-means clustering algorithm and the fuzzy C-means clustering algorithm, then describes in detail the automatic recognition and extraction process for English verbs based on index line clustering, creates a verification set, and completes the index line clustering of English verbs. Finally, the effect of the index line clustering algorithm is analyzed from two aspects: automatic recognition of English verb types and recall rate. Selected verbs are also analyzed to determine their types and the probability of each type.
The experimental results show that the average recognition rate of English verbs under manual classification is 91.01%, and the average accuracy of automatic recognition and extraction of English verb patterns based on index line clustering is 95.99%.
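The abstract names fuzzy C-means clustering of index lines but gives no implementation details. The following is a minimal sketch of the textbook fuzzy C-means update rules applied to a handful of index lines, not the authors' actual pipeline. The example lines, the two-feature `featurize` function, the pronoun list, and the parameter values (`m`, `iters`, `tol`, `seed`) are all assumptions chosen for illustration.

```python
import random

# Hypothetical index lines for one search verb ("gave"); not from the paper.
LINES = [
    "she gave him a book",
    "they gave us the keys",
    "he gave her some advice",
    "she gave a book to him",
    "they gave the keys to us",
    "he gave some advice to her",
]
PRONOUNS = {"me", "you", "him", "her", "us", "them", "it"}

def featurize(line, verb):
    """Toy features (an assumption): [pronoun right after verb, 'to' after verb]."""
    tokens = line.split()
    rest = tokens[tokens.index(verb) + 1:]
    return [float(bool(rest) and rest[0] in PRONOUNS), float("to" in rest)]

def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard fuzzy C-means; returns (centers, memberships)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by membership ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total
                 for d in range(dim)]
            )
        # Membership update from relative Euclidean distances to the centers.
        new_u = []
        for p in points:
            # Distances are clamped away from zero to avoid division by zero.
            dist = [
                max(sum((a - b) ** 2 for a, b in zip(p, ctr)) ** 0.5, 1e-12)
                for ctr in centers
            ]
            new_u.append(
                [1.0 / sum((dist[j] / dist[k]) ** (2.0 / (m - 1.0))
                           for k in range(c))
                 for j in range(c)]
            )
        shift = max(abs(new_u[i][j] - u[i][j])
                    for i in range(n) for j in range(c))
        u = new_u
        if shift < tol:
            break
    return centers, u

vectors = [featurize(line, "gave") for line in LINES]
centers, memberships = fuzzy_c_means(vectors, c=2)
# Hard label per line: the cluster with the highest membership degree.
labels = [row.index(max(row)) for row in memberships]
for line, label in zip(LINES, labels):
    print(label, line)
```

In this toy setting the double-object lines and the prepositional lines fall into separate clusters, and each row of `memberships` can be read as the degree to which a line belongs to each candidate pattern. Which cluster receives which index depends on the random initialization.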

Publisher

Hindawi Limited

Subject

Computer Networks and Communications, Computer Science Applications

References: 32 articles.
