Authors:
Ohata Elene F., Mattos César Lincoln C., Rêgo Paulo Antonio L.
Abstract
Text classification is a core component of many applications. Modern machine learning operations (MLOps) strategies address the challenges of deploying and maintaining models in production environments. In this work, we describe and experiment with a pipeline for monitoring and updating a text classification tool deployed in a major information technology company. The proposed approach is fully automatic and also supports visual inspection of its operations through dashboards. The solution is thoroughly evaluated in two experimental scenarios: a static one, which focuses on the Natural Language Processing (NLP) and Machine Learning (ML) stages used to build the text classifier; and a dynamic one, in which the pipeline performs automatic model updates. The results are promising and indicate the validity of the implemented methodology.
Publisher
Sociedade Brasileira de Computação - SBC