An augmented semantic search tool for multilingual news analytics

Authors:

Harikumar Sandhya (1), Sathyajit Rohit (1), Karumudi Gnana Venkata Naga Sai Kalyan (1)

Affiliation:

1. Department of Computer Science and Engineering, Amrita Vishwa Vidyapeetham, Amritapuri, India

Abstract

News feeds generate a colossal amount of data, with important information buried in the detail. State-of-the-art methods are still in their infancy when it comes to providing a generic, publicly available solution for skimming the important information in news from various sources and for searching it with specific keywords in different languages. This paper focuses on designing a tool that extracts semantic details from news articles published through various internet sources in various languages. The semantic information is stored in a DBMS for ease of organizing and retrieving the data. Further, a querying facility that searches entire articles by keyword or by date is proposed to present the content in concise form. News articles in English and in two Indian languages, Hindi and Malayalam, are considered for experimentation. The proposed strategy consists of two main components: generative model creation and a query engine. The generative model extracts important entities and keywords, along with their relevance to the article and to other similar articles, using Latent Dirichlet Allocation (LDA) and Named Entity Recognition (NER). The query engine facilitates on-the-fly retrieval of semantic content from the database based on a user keyword. The search engine, together with database indexing, reduces database access time and so speeds up retrieval. Experimental results show that the proposed method is effective in terms of the quality of the information and the time taken for information retrieval.
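The query-engine idea in the abstract (extracted keywords stored in a DBMS, indexed, and retrieved by keyword or by date) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the abstract does not name the DBMS or the schema, so SQLite, the table `article_keywords`, its columns, and the sample rows are all hypothetical, and the relevance scores stand in for whatever the LDA/NER stage would produce.

```python
import sqlite3

# Hypothetical sample of extracted keywords: (article_id, published, lang, keyword, relevance).
# In the described tool these rows would come from the LDA/NER stage; here they are made up.
SAMPLE_ROWS = [
    (1, "2019-03-01", "en", "election", 0.82),
    (1, "2019-03-01", "en", "parliament", 0.41),
    (2, "2019-03-01", "hi", "चुनाव", 0.77),
    (3, "2019-03-02", "ml", "തിരഞ്ഞെടുപ്പ്", 0.69),
    (4, "2019-03-02", "en", "election", 0.35),
]


def build_store(rows):
    """Create an in-memory SQLite store with indexes on keyword and date."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE article_keywords ("
        " article_id INTEGER, published TEXT, lang TEXT,"
        " keyword TEXT, relevance REAL)"
    )
    conn.executemany("INSERT INTO article_keywords VALUES (?, ?, ?, ?, ?)", rows)
    # Indexes let keyword and date lookups avoid a full table scan.
    conn.execute("CREATE INDEX idx_keyword ON article_keywords (keyword)")
    conn.execute("CREATE INDEX idx_published ON article_keywords (published)")
    conn.commit()
    return conn


def search_keyword(conn, keyword):
    """Return (article_id, relevance) pairs for a keyword, most relevant first."""
    cur = conn.execute(
        "SELECT article_id, relevance FROM article_keywords"
        " WHERE keyword = ? ORDER BY relevance DESC, article_id",
        (keyword,),
    )
    return cur.fetchall()


def search_date(conn, day):
    """Return the ids of articles published on the given ISO date."""
    cur = conn.execute(
        "SELECT DISTINCT article_id FROM article_keywords"
        " WHERE published = ? ORDER BY article_id",
        (day,),
    )
    return [row[0] for row in cur.fetchall()]


if __name__ == "__main__":
    conn = build_store(SAMPLE_ROWS)
    print(search_keyword(conn, "election"))
    print(search_date(conn, "2019-03-01"))
```

Because keywords are stored as Unicode text, the same exact-match lookup works for English, Hindi, and Malayalam terms. Cross-language matching (finding Hindi articles from an English keyword) would need an extra mapping step that this sketch does not attempt.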

Publisher

IOS Press

Subject

Artificial Intelligence, General Engineering, Statistics and Probability
