Affiliation:
1. School of Hispanic and Portuguese Studies, Beijing Foreign Studies University, Beijing 100089, China
2. School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 610031, China
Abstract
Spain possesses a vast number of poems. Most have features that mean they present significantly different styles. A superficial reading of these poems may confuse readers due to their complexity. Therefore, it is of vital importance to classify the style of the poems in advance. Currently, poetry classification studies are mostly carried out manually, which creates extremely high requirements for the professional quality of classifiers and consumes a large amount of time. Furthermore, the objectivity of the classification cannot be guaranteed because of the influence of the classifier’s subjectivity. To solve these problems, a Spanish poetry classification framework was designed using artificial intelligence technology, which improves the accuracy, efficiency, and objectivity of classification. First, an artificial-intelligence-driven Spanish poetry classification framework is described in detail, and is illustrated by a framework diagram to clearly represent each step in the process. The framework includes many algorithms and models, such as the Term Frequency–Inverse Document Frequency (TF_IDF), Bagging, Support Vector Machines (SVMs), Adaptive Boosting (AdaBoost), logistic regression (LR), Gradient Boosting Decision Trees (GBDT), LightGBM (LGB), eXtreme Gradient Boosting (XGBoost), and Random Forest (RF). The roles of each algorithm in the framework are clearly defined. Finally, experiments were performed for model selection, comparing the results of these algorithms.The Bagging model stood out for its high accuracy, and the experimental results showed that the proposed framework can help researchers carry out poetry research work more efficiently, accurately, and objectively.
Funder
National Natural Science Foundation of China
Subject
Artificial Intelligence,Computer Science Applications,Information Systems,Management Information Systems
Reference35 articles.
1. Cavnar, W.B., and Trenkle, J.M. (1994, January 11–13). N-gram-based text categorization. Proceedings of the SDAIR-94, 3rd Annual Symposium on Document Analysis and Information Retrieval, Las Vegas, NV, USA.
2. Lewis, D.D. (1992, January 23–26). Feature selection and feature extraction for text categorization. Proceedings of the Speech and Natural Language: Proceedings of a Workshop Held at Harriman, New York, NY, USA.
3. KNN based machine learning approach for text and document mining;Bijalwan;Int. J. Database Theory Appl.,2014
4. Larkey, L.S., and Croft, W.B. (1996, January 18–22). Combining classifiers in text categorization. Proceedings of the 19th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Zurich, Switzerland.
5. Gauging similarity with n-grams: Language-independent categorization of text;Damashek;Science,1995