An automatic approach to identify word sense changes in text media across timescales

Author:

MITRA SUNNY, MITRA RITWIK, MAITY SUMAN KALYAN, RIEDL MARTIN, BIEMANN CHRIS, GOYAL PAWAN, MUKHERJEE ANIMESH

Abstract

In this paper, we propose an unsupervised and automated method to identify noun sense changes based on rigorous analysis of time-varying text data available in the form of millions of digitized books and millions of tweets posted per day. We construct distributional-thesauri-based networks from data at different time points and cluster each of them separately to obtain word-centric sense clusters corresponding to the different time points. Subsequently, we propose a split/join based approach that compares the sense clusters at two different time points to determine whether a new sense has undergone ‘birth’. The approach also helps us determine whether an older sense has ‘split’ into more than one sense, whether a newer sense has formed from the ‘join’ of older senses, or whether a particular sense has undergone ‘death’. We use this completely unsupervised approach (a) within the Google books data to identify word sense differences within a single medium, and (b) across Google books and Twitter data to identify differences in word sense distribution across different media. We conduct a thorough evaluation of the proposed methodology both manually and through comparison with WordNet.
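The split/join comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not state the actual matching criteria, so everything below is an assumption made for illustration: the function name `classify_sense_changes`, the overlap fraction as the similarity measure, the threshold of 0.3, and the toy clusters for the word "mouse". It is not the authors' implementation.

```python
from typing import Dict, List, Set


def classify_sense_changes(
    old: List[Set[str]],
    new: List[Set[str]],
    overlap: float = 0.3,  # hypothetical threshold, not taken from the paper
) -> Dict[str, list]:
    """Compare the sense clusters of one word at two time points.

    Each cluster is a set of neighbouring words. Cluster a is linked to
    cluster b when at least `overlap` of a's words also occur in b.
    """

    def frac(a: Set[str], b: Set[str]) -> float:
        # Fraction of a's words that are also found in b.
        return len(a & b) / len(a) if a else 0.0

    events: Dict[str, list] = {"birth": [], "death": [], "split": [], "join": []}

    # A new cluster with no old source is a birth; several sources is a join.
    for j, n in enumerate(new):
        sources = [i for i, o in enumerate(old) if frac(n, o) >= overlap]
        if not sources:
            events["birth"].append(j)
        elif len(sources) > 1:
            events["join"].append((sources, j))

    # An old cluster with no new target is a death; several targets is a split.
    for i, o in enumerate(old):
        targets = [j for j, n in enumerate(new) if frac(o, n) >= overlap]
        if not targets:
            events["death"].append(i)
        elif len(targets) > 1:
            events["split"].append((i, targets))

    return events


# Toy clusters (invented for illustration) for the word "mouse".
old_clusters = [{"rat", "rodent", "squirrel", "rabbit"}]
new_clusters = [
    {"rat", "rodent", "squirrel", "hamster"},
    {"keyboard", "joystick", "cursor", "trackball"},
]
events = classify_sense_changes(old_clusters, new_clusters)
print(events)
```

In this toy input the second new cluster shares no words with the old one, so it is reported as a birth, while the first new cluster is matched to the single old cluster and produces no event.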

Publisher

Cambridge University Press (CUP)

Subject

Artificial Intelligence, Linguistics and Language, Language and Linguistics, Software

References: 40 articles.

1. Aging in Language Dynamics

2. Fellbaum C. (ed.) 1998. WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press.

3. Tahmasebi N., Risse T., and Dietze S. 2011. Towards automatic language evolution tracking: a study on word sense tracking. In Proceedings of EvoDyn, vol. 784, Bonn, Germany.

Cited by 18 articles.

1. An embedded diachronic sense change model with a case study from ancient Greek;Computational Statistics & Data Analysis;2024-11

2. Semantic micro-dynamics as a reflex of occurrence frequency: a semantic networks approach;Cognitive Linguistics;2023-08-01

3. Temporal word embedding with predictive capability;Knowledge and Information Systems;2023-07-13

4. Text Comparison Based on Semantic Similarity;2023 3rd International Conference on Intelligent Technologies (CONIT);2023-06-23

5. LL(O)D and NLP perspectives on semantic change for humanities research;Semantic Web;2022-09-26
