Adaptive encoding-based evolutionary approach for Chinese document clustering

Published: 2022-12-10
ISSN: 2199-4536
Container-title: Complex & Intelligent Systems
Language: en
Short-container-title: Complex Intell. Syst.

Author:

Chen Jun-Xian, Gong Yue-Jiao, Chen Wei-Neng, Xiao Xiaolin

Abstract

Document clustering has long been an important research direction in intelligent systems. Applying it to Chinese documents poses new challenges, because Chinese text cannot be split into words directly on whitespace characters. Moreover, many Chinese document clustering algorithms require the number of clusters to be known in advance, which is rarely the case in real-world applications. To address these problems, we propose a general Chinese document clustering framework in which the main clustering task is carried out by an adaptive encoding-based evolutionary approach. Specifically, an adaptive encoding scheme is proposed to learn the cluster number automatically, and novel crossover and mutation operators are designed to fit this scheme. In addition, a single step of K-means is incorporated to conduct a joint global and local search, enhancing the overall exploitation ability. Experiments on benchmark datasets demonstrate the superiority of the proposed method in both efficiency and clustering precision.
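The idea of an evolutionary search over variable-length encodings, refined by a single K-means step, can be illustrated with a minimal sketch. The abstract does not specify the encoding, the operators, or the objective, so everything below is an assumption made for illustration: a chromosome is taken to be a variable-length list of centroids, `crossover` and `mutate` are hypothetical operators that can change its length, the cost is a within-cluster squared error plus a hypothetical per-cluster `penalty`, and 2-D points stand in for document feature vectors. It is not the authors' algorithm.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]

def kmeans_step(points, centroids):
    """One K-means iteration; centroids that attract no points are dropped."""
    labels = assign(points, centroids)
    updated = []
    for j in range(len(centroids)):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if members:
            updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
    return updated

def fitness(points, centroids, penalty):
    """Assumed cost (lower is better): squared error plus a penalty per cluster."""
    labels = assign(points, centroids)
    sse = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return sse + penalty * len(centroids)

def crossover(a, b, max_k, rng):
    """Hypothetical operator: the child keeps a random subset of both parents'
    centroids, so its length (the cluster number) may differ from either parent."""
    child = [c for c in a + b if rng.random() < 0.5]
    if not child:
        child = [rng.choice(a + b)]
    rng.shuffle(child)
    return child[:max_k]

def mutate(centroids, points, max_k, rng, rate=0.3):
    """Hypothetical operator: occasionally add a centroid at a random point
    or delete one, keeping the length within [1, max_k]."""
    centroids = list(centroids)
    if rng.random() < rate:
        if len(centroids) < max_k and (len(centroids) == 1 or rng.random() < 0.5):
            centroids.append(rng.choice(points))
        elif len(centroids) > 1:
            centroids.pop(rng.randrange(len(centroids)))
    return centroids

def evolve(points, max_k=6, pop_size=20, generations=30, penalty=5.0, seed=0):
    """Return the best centroid list found and the best cost per generation."""
    rng = random.Random(seed)
    cost = lambda c: fitness(points, c, penalty)
    # Each individual starts with a random number of centroids drawn from the data.
    population = [rng.sample(points, rng.randint(1, max_k)) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=cost)
        history.append(cost(population[0]))
        parents = population[: pop_size // 2]
        children = [population[0]]  # elitism: the best individual survives unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = mutate(crossover(a, b, max_k, rng), points, max_k, rng)
            children.append(kmeans_step(points, child))  # single local-search step
        population = children
    best = min(population, key=cost)
    return best, history + [cost(best)]

if __name__ == "__main__":
    data_rng = random.Random(1)
    centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    points = [(cx + data_rng.gauss(0, 0.3), cy + data_rng.gauss(0, 0.3))
              for cx, cy in centers for _ in range(10)]
    best, history = evolve(points)
    print(len(best), "clusters, final cost", round(history[-1], 3))
```

In this sketch the cluster number is never passed in; it is simply the length of the best chromosome, bounded above by the assumed `max_k`. The per-cluster penalty is one simple way to keep the search from favoring ever more clusters; a validity index could be substituted without changing the rest of the loop.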

Funder

2022 Guangdong-Hong Kong-Macao Greater Bay Area Exchange Programs of SCNU

National Natural Science Foundation of China

Guangdong Natural Science Funds for Distinguished Young Scholars

Guangdong Regional Joint Fund for Basic and Applied Research

Fundamental Research Funds for the Central Universities

Publisher

Springer Science and Business Media LLC

Subject

Computational Mathematics, Engineering (miscellaneous), Information Systems, Artificial Intelligence

Link

https://link.springer.com/content/pdf/10.1007/s40747-022-00934-z.pdf