Hybrid unstructured text features for meta-heuristic assisted deep CNN-based hierarchical clustering

Author:

Jyothi Bankapalli 1, Sumalatha L. 2, Eluri Suneetha 1

Affiliation:

1. Computer Science and Engineering, JNTUK Kakinada, Kakinada, Andhra Pradesh, India

2. Computer Science and Engineering, Jawaharlal Nehru Technological University, Hyderabad, Telangana, India

Abstract

Text clustering is an essential step for organizing unstructured text data into a usable format. On its own, however, it does not provide a way to extract information that supports document representation. Retrieving relevant text data has become crucial, yet most data is in an unstructured text format that is difficult to categorize. The main aim of this work is to implement a new text clustering model for unstructured data using classifier approaches. First, unstructured data is collected from standard benchmark datasets covering both English and Telugu. The collected text is then pre-processed. In feature extraction stage 1, the GloVe embedding technique extracts text features from the pre-processed data. In feature extraction stage 2, a Text Convolutional Neural Network (Text CNN) extracts deep text features from the same pre-processed data. The text features from stage 1 and the deep features from stage 2 are then combined and passed to optimal feature selection using Hybrid Sea Lion Grasshopper Optimization (HSLnGO), in which the traditional Sea Lion Optimization (SLnO) is combined with the Grasshopper Optimization Algorithm (GOA). Finally, text clustering is performed by Deep CNN-assisted hierarchical clustering, whose parameters are optimized with HSLnGO to improve clustering performance. Simulation results, evaluated with several quantitative measures, indicate that the framework performs strongly on text classification of unstructured text data compared with other techniques.
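The data flow described in the abstract (fuse two feature sets, keep a selected subset of features, then cluster hierarchically) can be illustrated with a small sketch. This is not the authors' implementation. Everything in it is an assumption made for illustration: the toy vectors stand in for GloVe and Text CNN outputs, the hand-set `mask` stands in for whatever subset HSLnGO would select, and plain average-linkage agglomerative clustering is swapped in for the Deep CNN-assisted hierarchical clustering. The function names `fuse`, `select`, and `agglomerative` are hypothetical.

```python
import math
from itertools import combinations


def fuse(stage1, stage2):
    """Concatenate per-document feature vectors from the two stages."""
    return [a + b for a, b in zip(stage1, stage2)]


def select(features, mask):
    """Keep only the feature columns whose mask entry is 1."""
    return [[x for x, keep in zip(row, mask) if keep] for row in features]


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def agglomerative(points, n_clusters):
    """Average-linkage agglomerative clustering; returns one label per point."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        # Find the pair of clusters with the smallest average pairwise distance.
        for i, j in combinations(range(len(clusters)), 2):
            total = sum(
                euclidean(points[a], points[b])
                for a in clusters[i]
                for b in clusters[j]
            )
            dist = total / (len(clusters[i]) * len(clusters[j]))
            if best is None or dist < best[0]:
                best = (dist, i, j)
        _, i, j = best
        clusters[i] += clusters[j]  # merge j into i (i < j always holds)
        del clusters[j]
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for idx in members:
            labels[idx] = label
    return labels


# Toy stand-ins for the two feature stages (assumed values, four documents).
stage1 = [[0.1, 0.9], [0.2, 0.8], [0.9, 0.1], [0.8, 0.2]]    # "embedding" features
stage2 = [[1.0, 5.0], [1.1, -3.0], [4.0, 2.0], [4.2, -6.0]]  # "deep" features; last column is noise

# Hand-set mask (assumption): drop the noisy last column.
mask = [1, 1, 1, 0]

fused = fuse(stage1, stage2)
selected = select(fused, mask)
labels = agglomerative(selected, n_clusters=2)
print(labels)
```

In this sketch the mask is fixed by hand. In the framework the abstract describes, both the feature subset and the clustering parameters are tuned by HSLnGO, which would replace the hand-set mask with a search over candidate masks scored by a clustering quality measure.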

Publisher

IOS Press

Subject

Artificial Intelligence, Computer Vision and Pattern Recognition, Human-Computer Interaction, Software

