Graph-Based Extractive Text Summarization Sentence Scoring Scheme for Big Data Applications

Published:2023-08-22 Issue:9 Volume:14 Page:472
ISSN:2078-2489
Container-title:Information
language:en
Short-container-title:Information

Author:

Verma Jai Prakash¹, Bhargav Shir¹, Bhavsar Madhuri¹, Bhattacharya Pronaya²,³, Bostani Ali⁴, Chowdhury Subrata⁵, Webber Julian⁶, Mehbodniya Abolfazl⁶

Affiliation:

1. Department of Computer Science and Engineering, Institute of Technology, Nirma University, Ahmedabad 382481, Gujarat, India

2. Department of Computer Science and Engineering, Amity School of Engineering and Technology, Amity University, Kolkata 700135, West Bengal, India

3. Research and Innovation Cell, Amity University, Kolkata 700135, West Bengal, India

4. College of Engineering and Applied Sciences, American University of Kuwait, Salmiya 20002, Kuwait

5. Department of Computer Science and Engineering, Sreenivasa Institute of Technology and Management Studies, Chittoor 517127, Andhra Pradesh, India

6. Department of Electronics and Communication Engineering, Kuwait College of Science and Technology (KCST), 7th Ring Road, Kuwait City 13133, Kuwait

Abstract

Recent advances in big data and natural language processing (NLP) have created a need for efficient text mining (TM) schemes that can interpret and analyze voluminous textual data. Text summarization (TS) is an essential component of recommendation engines. Although abstractive techniques are prevalent in TS, a shift towards graph-based extractive TS (ETS) schemes is becoming apparent. These models, although simpler and less resource-intensive, are key in assessing reviews and feedback on products or services. Nonetheless, current methodologies have not fully resolved concerns surrounding complexity, adaptability, and computational demands. We therefore propose a scheme, GETS, that uses a graph-based model to establish connections among words and sentences through statistical procedures. The scheme includes a post-processing stage with graph-based sentence clustering. Built on the Apache Spark framework, the scheme is designed for parallel execution, making it adaptable to real-world applications. For evaluation, we selected 500 documents from the WikiHow and Opinosis datasets, categorized them into five classes, and compared results using the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) measures ROUGE-1, ROUGE-2, and ROUGE-L. With the clustered approach, the recall scores are 0.3942, 0.0952, and 0.3436 for ROUGE-1, ROUGE-2, and ROUGE-L, respectively. Compared with existing models such as BERTEXT (with 3-gram, 4-gram) and MATCHSUM, our scheme demonstrates notable improvements, supporting its applicability and effectiveness in real-world scenarios.
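
To make the general idea of graph-based sentence scoring and ROUGE recall concrete, the sketch below ranks sentences with a PageRank-style walk over a cosine-similarity graph and computes ROUGE-1 recall as clipped unigram overlap. It is not the GETS scheme from the paper: the similarity measure, the damping factor, the absence of clustering and Spark parallelism, and all function names (`rank_sentences`, `summarize`, `rouge1_recall`) are assumptions chosen for illustration only.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_sentences(sentences, damping=0.85, iterations=50):
    """Score sentences with a PageRank-style iteration (illustrative only)."""
    n = len(sentences)
    if n == 0:
        return []
    vectors = [Counter(tokenize(s)) for s in sentences]
    # Edge weight between two sentences = cosine similarity; no self-loops.
    weights = [
        [cosine(vectors[i], vectors[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        # Each sentence receives score from its neighbours in proportion to
        # edge weight; isolated sentences keep only the base term.
        scores = [
            (1 - damping) / n
            + damping
            * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(sentences, k=2):
    """Return the k highest-scoring sentences in their original order."""
    scores = rank_sentences(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


def rouge1_recall(candidate, reference):
    """ROUGE-1 recall: clipped unigram overlap divided by reference length."""
    cand, ref = Counter(tokenize(candidate)), Counter(tokenize(reference))
    overlap = sum(min(count, cand[t]) for t, count in ref.items())
    total = sum(ref.values())
    return overlap / total if total else 0.0
```

For example, `summarize(reviews, k=2)` on a list of review sentences returns the two sentences most similar to the rest of the list, and `rouge1_recall(summary_text, reference_text)` reports how many reference unigrams the summary recovers.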

Publisher

MDPI AG

Subject

Information Systems

Link

https://www.mdpi.com/2078-2489/14/9/472/pdf

Reference: 98 articles.

1. An Opinion Mining Approach to Handle Perspectivism and Ambiguity: Moving Toward Neutrosophic Logic;Essameldin;IEEE Access,2022

2. Online Context-Aware Task Assignment in Mobile Crowdsourcing via Adaptive Discretization;Elahi;IEEE Trans. Netw. Sci. Eng.,2023

3. Hassani, H., Beneki, C., Unger, S., Mazinani, M.T., and Yeganegi, M.R. (2020). Text Mining in Big Data Analytics. Big Data Cogn. Comput., 4.

4. A social media analytics perspective for human-oriented smart city planning and management;Miah;J. Assoc. Inf. Sci. Technol.,2022

5. SaTYa: Trusted Bi-LSTM-Based Fake News Classification Scheme for Smart Community;Bhattacharya;IEEE Trans. Comput. Soc. Syst.,2022

Cited by 2 articles.

1. Developing Gujarati Article Summarization Utilizing Improved Page-Rank System;International Journal of Scientific Research in Computer Science, Engineering and Information Technology;2024-03-28

2. Document Summarization Leveraging Modified LexRank Algorithm;Lecture Notes in Networks and Systems;2024