Automatic generation of a large dictionary with concreteness/abstractness ratings based on a small human dictionary

Author:

Ivanov Vladimir (1), Solovyev Valery (2)

Affiliation:

1. Faculty of Computer Science and Software Engineering, Innopolis University, st. Universitetskaya, 1, Innopolis, Republic of Tatarstan, Russian Federation

2. Linguistic research and education center, Research laboratory ‘Intellectual technologies of text management’, Kazan Federal University, 2, Kazan, the Republic of Tatarstan, Russian Federation

Abstract

Concrete/abstract words are used in a growing number of psychological and neurophysiological studies. For a few languages, large dictionaries have been created manually, which is a very time-consuming and costly process. To generate large, high-quality dictionaries of concrete/abstract words automatically, one needs to extrapolate the expert assessments obtained on smaller samples. The research question that arises is how small such samples can be while still allowing a good enough extrapolation. In this paper, we present a method for automatically ranking words by concreteness and propose an approach that significantly decreases the amount of expert assessment required. The method has been evaluated on a large test set for English. The quality of the constructed dictionaries is comparable to that of the expert ones. The correlation between predicted and expert ratings is higher than that of state-of-the-art methods.

Publisher

IOS Press

Subject

Artificial Intelligence, General Engineering, Statistics and Probability

Cited by 2 articles.

1. Automatic Construction of Sememe Knowledge Bases From Machine Readable Dictionaries;IEEE/ACM Transactions on Audio, Speech, and Language Processing;2024

2. Concreteness ratings for 36,000 Estonian words;Behavior Research Methods;2023-12-21
