Structural Models of English Terms of Automated Processing of Scientific and Technical Texts Corpora

Authors

Butenko Iuliia I., Nikolaeva Natalia S., Kartseva Elena Yu.

Abstract

This article examines structural models of English multi-component terms from the subject area Welding types as a basis for annotating corpora of scientific and technical texts. It outlines the place of such corpora in corpus linguistics and the prospects for further research based on them. The research is relevant because of the need to create a corpus of scientific and technical texts in general, and tools for the automatic annotation of terms in particular. The authors argue that the main problem in creating such a corpus is the automatic annotation of terminological word combinations. The current state of the terminology system of the Welding types subject area is analyzed, and the formal structure of its elements is considered. The results of an analysis of two-, three-, and four-component English terminological word combinations from this subject area are presented together with their structural models, each illustrated with examples. The most productive models are highlighted. The most productive model, a nucleus element combined with a noun or an adjective functioning as a preposed attribute, is found in two-component word combinations; analysis of more complex formations shows that the same pattern of a left-hand attribute attached to the term nucleus is present in them as well, demonstrating generic features. The need to enumerate all possible structural models of terminological combinations in the Welding types subject area is substantiated.
The novelty of the study lies in the formation of a database of structural models of terminological combinations, which serves as the basis for an add-on database on term structure intended to improve the quality of automatic annotation of corpora of scientific and technical texts and the processing of candidate terms in corpus studies.
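As an illustration of how a small inventory of structural models could drive term annotation, the following minimal Python sketch matches part-of-speech patterns against a tagged sentence. It is not the authors' implementation: the model names (A+N, N+N, and so on), the tag labels, the function name find_term_candidates, and the hand-tagged sample sentence are all assumptions made for this example. Each model places the attributes to the left and the nucleus noun last.

```python
from typing import Dict, List, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag)

# Hypothetical inventory of structural models (not taken from the article).
# Each pattern lists left-hand attributes first and the nucleus noun last.
MODELS: Dict[str, Tuple[str, ...]] = {
    "A+N": ("ADJ", "NOUN"),
    "N+N": ("NOUN", "NOUN"),
    "A+N+N": ("ADJ", "NOUN", "NOUN"),
    "N+N+N": ("NOUN", "NOUN", "NOUN"),
}


def find_term_candidates(
    tokens: List[Token], models: Dict[str, Tuple[str, ...]] = MODELS
) -> List[Tuple[str, str]]:
    """Return (model name, phrase) pairs for every window matching a model."""
    candidates = []
    for start in range(len(tokens)):
        for name, pattern in models.items():
            window = tokens[start:start + len(pattern)]
            if len(window) != len(pattern):
                continue  # not enough tokens left for this pattern
            if all(tag == expected for (_, tag), expected in zip(window, pattern)):
                candidates.append((name, " ".join(word for word, _ in window)))
    return candidates


# Hand-tagged sample sentence; the tags are assigned manually for illustration.
SAMPLE: List[Token] = [
    ("gas", "NOUN"), ("tungsten", "NOUN"), ("arc", "NOUN"), ("welding", "NOUN"),
    ("uses", "VERB"), ("an", "DET"), ("inert", "ADJ"), ("gas", "NOUN"),
]

if __name__ == "__main__":
    for model, phrase in find_term_candidates(SAMPLE):
        print(model, phrase)
```

The sketch reports every matching window, including overlapping ones, so the output is a list of candidate terms only. Selecting the actual terms among the candidates would need further filtering, which is outside the scope of this example.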

Publisher

Peoples' Friendship University of Russia

Subject

Linguistics and Language, Language and Linguistics

