Comparative Analysis of NLP-Based Models for Company Classification

Published: 2024-01-31 Volume: 15 Issue: 2 Page: 77
ISSN: 2078-2489
Container-title: Information
Language: en
Short-container-title: Information

Author:

Rizinski Maryan¹,², Jankov Andrej², Sankaradas Vignesh¹, Pinsky Eugene¹, Mishkovski Igor², Trajanov Dimitar¹,²

Affiliation:

1. Department of Computer Science, Metropolitan College, Boston University, Boston, MA 02215, USA

2. Faculty of Computer Science and Engineering, Ss. Cyril and Methodius University, 1000 Skopje, North Macedonia

Abstract

The task of company classification is traditionally performed using established standards, such as the Global Industry Classification Standard (GICS). However, these approaches rely heavily on laborious manual efforts by domain experts, resulting in slow, costly, and vendor-specific assignments. Therefore, we investigate recent natural language processing (NLP) advancements to automate the company classification process. In particular, we employ and evaluate various NLP-based models, including zero-shot learning, One-vs-Rest classification, multi-class classifiers, and ChatGPT-aided classification. We conduct a comprehensive comparison among these models to assess their effectiveness in the company classification task. The evaluation uses the Wharton Research Data Services (WRDS) dataset, consisting of textual descriptions of publicly traded companies. Our findings reveal that the RoBERTa and One-vs-Rest classifiers surpass the other methods, achieving F1 scores of 0.81 and 0.80 on the WRDS dataset, respectively. These results demonstrate that deep learning algorithms offer the potential to automate, standardize, and continuously update classification systems in an efficient and cost-effective way. In addition, we introduce several improvements to the multi-class classification techniques: (1) in the zero-shot methodology, we use TF-IDF to enhance sector representation, yielding improved accuracy in comparison to standard zero-shot classifiers; (2) we use ChatGPT for dataset generation, revealing potential in scenarios where datasets of company descriptions are lacking; and (3) we employ K-Fold to reduce noise in the WRDS dataset, followed by experiments to assess the impact of noise reduction on the company classification results.
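As a rough illustration of improvement (1), the sketch below shows one way TF-IDF could be used to enrich sector labels before they are passed to a zero-shot classifier. It is not the authors' implementation: the abstract does not specify the procedure, so the function names (`top_tfidf_terms`, `enrich_labels`), the toy descriptions, the label format, and the choice of the textbook weighting tf × log(N/df) are all assumptions made for illustration. Each sector's descriptions are pooled into one document, so terms that occur in every sector receive a weight of zero.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep alphabetic runs only (a deliberately simple tokenizer).
    return re.findall(r"[a-z]+", text.lower())


def top_tfidf_terms(sector_docs, k=3):
    """Return the k highest-scoring TF-IDF terms per sector.

    sector_docs maps a sector name to a list of company descriptions.
    All descriptions of a sector are pooled into a single document.
    """
    sector_tokens = {
        sector: [tok for doc in docs for tok in tokenize(doc)]
        for sector, docs in sector_docs.items()
    }
    n_sectors = len(sector_tokens)

    # Document frequency: in how many sectors does each term occur?
    df = Counter()
    for tokens in sector_tokens.values():
        df.update(set(tokens))

    result = {}
    for sector, tokens in sector_tokens.items():
        tf = Counter(tokens)
        total = len(tokens)
        scores = {
            term: (count / total) * math.log(n_sectors / df[term])
            for term, count in tf.items()
        }
        # Sort by descending score, breaking ties alphabetically.
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        result[sector] = ranked[:k]
    return result


def enrich_labels(sector_docs, k=3):
    # Hypothetical label format: "Sector (term1, term2, ...)".
    terms = top_tfidf_terms(sector_docs, k)
    return {sector: f"{sector} ({', '.join(ts)})" for sector, ts in terms.items()}


# Toy data, invented for this example (not taken from the WRDS dataset).
toy_docs = {
    "Energy": [
        "explores and produces oil and gas",
        "operates oil pipelines and gas terminals",
    ],
    "Financials": [
        "provides banking and insurance services",
        "offers banking loans and credit cards",
    ],
}
enriched = enrich_labels(toy_docs, k=2)
```

The enriched strings could then replace the bare sector names as candidate labels in a zero-shot classifier, giving the model more sector-specific vocabulary to match against each company description.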

Publisher

MDPI AG

Link

https://www.mdpi.com/2078-2489/15/2/77/pdf


Cited by 1 article.

1. Artificial Intelligence-Driven Corporate Finance: Enhancing Efficiency and Decision-Making Through Machine Learning, Natural Language Processing, and Robotic Process Automation in Corporate Governance and Sustainability;SSRN Electronic Journal;2024