ACADEMIC TEXT CLUSTERING USING NATURAL LANGUAGE PROCESSING

Published: 2022-12-16 Page: 41-51
ISSN: 2147-9364
Container-title: Konya Journal of Engineering Sciences
Language: en
Short-container-title: KONJES

Author:

TAŞKIRAN Salimkan Fatma¹, KAYA Ersin¹

Affiliation:

1. KONYA TECHNICAL UNIVERSITY

Abstract

Accessing data is very easy nowadays. However, to use these data efficiently, it is necessary to extract the right information from them. Categorizing the data so that the needed information can be reached quickly provides great convenience. In particular, research in the academic field generally relies on text-based data such as articles, papers, or theses. Natural language processing and machine learning methods are used to extract the needed information from these text-based data. In this study, abstracts of academic papers are clustered. Text data from the abstracts are preprocessed using natural language processing techniques. Vectorized word representations are extracted from the preprocessed data with Word2Vec and BERT word embeddings, and these representations are clustered with four clustering algorithms.
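
The pipeline outlined in the abstract (preprocess, embed, cluster) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' method: the toy two-dimensional vectors in `TOY_EMBEDDINGS` stand in for trained Word2Vec or BERT embeddings, the stopword list and sample abstracts are invented, and k-means is used only as one plausible clustering algorithm, since the abstract does not name the four that were compared.

```python
import re

# Hypothetical 2-d vectors standing in for trained Word2Vec/BERT embeddings.
TOY_EMBEDDINGS = {
    "clustering": (1.0, 0.0),
    "text": (0.9, 0.1),
    "embedding": (0.8, 0.2),
    "protein": (0.0, 1.0),
    "cell": (0.1, 0.9),
    "gene": (0.2, 0.8),
}
# Hypothetical, deliberately tiny stopword list.
STOPWORDS = {"a", "an", "the", "of", "in", "and", "for", "with", "is", "are"}


def preprocess(abstract):
    """Lowercase, keep alphabetic tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z]+", abstract.lower())
    return [t for t in tokens if t not in STOPWORDS]


def embed(tokens):
    """Average the vectors of known tokens; unknown tokens are skipped."""
    vectors = [TOY_EMBEDDINGS[t] for t in tokens if t in TOY_EMBEDDINGS]
    if not vectors:
        return (0.0, 0.0)
    return tuple(sum(v[i] for v in vectors) / len(vectors) for i in range(2))


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, centroids, iterations=10):
    """Plain k-means with caller-supplied initial centroids."""
    labels = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                updated.append(tuple(
                    sum(m[i] for m in members) / len(members)
                    for i in range(len(old))
                ))
            else:
                updated.append(old)  # keep an empty cluster's centroid
        centroids = updated
    return labels


# Invented sample abstracts, not taken from the study's data set.
ABSTRACTS = [
    "Clustering of text with embedding methods",
    "Gene expression in the cell",
    "Text embedding for clustering",
    "A protein and gene study",
]
doc_vectors = [embed(preprocess(a)) for a in ABSTRACTS]
labels = kmeans(doc_vectors, centroids=[doc_vectors[0], doc_vectors[1]])
print(labels)
```

In a realistic setting the toy table would be replaced by vectors from a trained embedding model, and the initial centroids would be chosen by the clustering library rather than fixed by hand.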

Publisher

Konya Muhendislik Bilimleri Dergisi

Subject

General Medicine
