Short Text Analysis Based on Dual Semantic Extension and Deep Hashing in Microblog

Author:

Cui Wanqiu1, Du Junping1, Wang Dawei2, Yuan Xunpu1, Kou Feifei1, Zhou Liyan1, Zhou Nan1

Affiliation:

1. Beijing University of Posts and Telecommunications, Beijing, China

2. Renmin University of China, Beijing, China

Abstract

Short text analysis is challenging because short texts are sparse and carry limited semantic information. Semantic extension approaches learn the meaning of a short text by introducing external knowledge. However, because short text descriptions in microblogs are highly random, traditional extension methods cannot accurately mine semantics that fit the microblog theme. We therefore use the prominent, refined hashtag information in microblogs, together with complex social relationships, to provide implicit guidance for the semantic extension of short text. Specifically, we design a deep hash model based on social and conceptual semantic extension, which consists of dual semantic extension and deep hashing representation. In the extension method, the short text is first conceptualized so that a hashtag graph can be constructed in the conceptual space. Associated hashtags are then generated by a correlation calculation that integrates social relationships and concepts, and these hashtags are used to extend the short text. In the deep hash model, we use a semantic hashing model to encode the abundant semantic features into a compact and meaningful binary encoding. Finally, extensive experiments demonstrate that our method learns and represents short texts well by using more meaningful semantic signals, and that it effectively enhances and guides the semantic analysis and understanding of short text in microblogs.
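As a rough illustration of the two stages named in the abstract, the sketch below ranks candidate hashtags by mixing a concept-space similarity with a social relevance score, then turns a feature vector into a binary code. It is not the authors' model. The function names, the weighting parameter `alpha`, the toy vectors, and the social scores are all hypothetical, and sign-of-projection hashing is swapped in as a simple stand-in for the learned deep semantic hashing model.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def related_hashtags(text_vec, hashtag_concepts, social_score, alpha=0.5, k=2):
    """Return the top-k hashtags for a short text.

    Assumption: relevance is a linear mix of concept similarity and a
    precomputed social score in [0, 1]; the paper's actual correlation
    calculation is not given in the abstract.
    """
    scored = []
    for tag, concept_vec in hashtag_concepts.items():
        score = alpha * cosine(text_vec, concept_vec)
        score += (1 - alpha) * social_score.get(tag, 0.0)
        scored.append((score, tag))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tag for _, tag in scored[:k]]


def binary_code(vec, planes):
    """Hash a vector to bits: one bit per hyperplane, 1 if the projection is >= 0.

    Stand-in for a learned hashing network, not the model from the paper.
    """
    return [1 if sum(w * x for w, x in zip(plane, vec)) >= 0 else 0
            for plane in planes]


def hamming(a, b):
    """Number of positions where two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    # Toy concept vectors and social scores (made up for illustration).
    concepts = {"#nba": [1.0, 0.0, 0.0],
                "#sports": [0.8, 0.6, 0.0],
                "#food": [0.0, 1.0, 0.0]}
    social = {"#nba": 0.2, "#sports": 0.9, "#food": 0.1}
    print(related_hashtags([1.0, 0.0, 0.0], concepts, social))

    rng = random.Random(0)
    planes = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(8)]
    print(binary_code([1.0, 0.0, 0.0], planes))
```

Short binary codes of this kind can be compared with `hamming`, which is what makes hashing-based representations cheap to search; a learned model would replace the random hyperplanes with trained parameters.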

Funder

the National Natural Science Foundation of China

the BUPT Excellent Ph.D. Students Foundation

Publisher

Association for Computing Machinery (ACM)

Subject

Artificial Intelligence, Theoretical Computer Science

References: 46 articles.

Cited by 6 articles.

1. Multi-View Scholar Clustering With Dynamic Interest Tracking;IEEE Transactions on Knowledge and Data Engineering;2023-09-01

2. Corporate Risk Information Disclosure Based on Semantic Analysis Methods;Mobile Information Systems;2022-04-12

3. BATS: A Spectral Biclustering Approach to Single Document Topic Modeling and Segmentation;ACM Transactions on Intelligent Systems and Technology;2021-10-31

4. GTAE: Graph Transformer–Based Auto-Encoders for Linguistic-Constrained Text Style Transfer;ACM Transactions on Intelligent Systems and Technology;2021-06-11

5. Channel retrieval: finding relevant broadcasters on Telegram;Social Network Analysis and Mining;2020-03-30
