Abstract
This study introduces an approach to improving Interactive Voice Response (IVR) systems for mobile banking by integrating emotion analysis with a fusion of specialized datasets. Using the RAVDESS, CREMA-D, TESS, and SAVEE datasets, this research draws on a diverse array of emotional speech and song samples to analyze customer sentiment in call center interactions. Together, these datasets provide a multi-modal emotional context that enriches the IVR experience.
The cornerstone of our methodology is Mel-Frequency Cepstral Coefficient (MFCC) extraction. The MFCCs, extracted from audio inputs, form a 2D array whose time and cepstral-coefficient axes give it a structure that closely resembles an image. This format is well suited to Convolutional Neural Networks (CNNs), which excel at interpreting such 'image-like' data for emotion recognition, thereby enhancing the system's responsiveness to emotional cues.
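For readers unfamiliar with MFCCs, the 2D array described above can be written out as follows. This is a sketch of one common textbook formulation, not necessarily the exact pipeline used in this study; the symbols f, m, E_k(t), K, N, and T are notation introduced here for illustration and do not appear in the original.

```latex
% Mel scale (one widely used convention): frequency f in Hz mapped to mels
m(f) = 2595 \, \log_{10}\!\left(1 + \frac{f}{700}\right)

% Cepstral coefficient n of frame t, from the K mel filterbank energies E_k(t)
% (a type-II discrete cosine transform of the log energies)
c_n(t) = \sum_{k=1}^{K} \log E_k(t) \,
         \cos\!\left[\frac{\pi n}{K}\left(k - \tfrac{1}{2}\right)\right],
\qquad n = 0, \dots, N-1

% Stacking the N coefficients over T frames gives the image-like CNN input
C = \bigl[\, c_n(t) \,\bigr]_{n = 0, \dots, N-1;\; t = 1, \dots, T}
\in \mathbb{R}^{N \times T}
```

Under this notation, each column of C is one short-time frame and each row is one cepstral coefficient, which is what allows a CNN to treat C like a single-channel image.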
The proposed system's architecture is designed to modify dialogue flows dynamically, informed by the emotional tone of customer interactions. This not only improves customer engagement but also ensures a seamless handover to human operators when the situation calls for a personal touch, balancing automated efficiency with human empathy.
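The emotion-driven dialogue adaptation described above can be illustrated with a minimal sketch. The abstract does not specify the routing logic, so everything below is an assumption made for illustration: the label set, the thresholds, the action names, and the `EmotionEstimate` and `next_action` identifiers are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical label set and thresholds, chosen only for illustration;
# the study does not state which values or rules it uses.
NEGATIVE_EMOTIONS = {"angry", "fearful", "sad", "disgust"}
HANDOVER_CONFIDENCE = 0.8  # minimum classifier confidence to count a turn
HANDOVER_STREAK = 2        # consecutive confident negative turns before handover


@dataclass
class EmotionEstimate:
    """Per-turn output of an (assumed) emotion classifier."""
    label: str
    confidence: float


def next_action(history):
    """Pick the next dialogue step from the per-turn emotion estimates."""
    if not history:
        return "standard_flow"
    if history[-1].label not in NEGATIVE_EMOTIONS:
        return "standard_flow"
    # Count consecutive confident negative turns, newest first.
    streak = 0
    for estimate in reversed(history):
        is_negative = estimate.label in NEGATIVE_EMOTIONS
        if is_negative and estimate.confidence >= HANDOVER_CONFIDENCE:
            streak += 1
        else:
            break
    if streak >= HANDOVER_STREAK:
        return "handover_to_agent"
    return "empathetic_flow"
```

In this sketch a single negative turn only switches the caller to a more empathetic automated flow, while repeated high-confidence negative turns trigger the handover to a human operator. Requiring a streak rather than a single detection is one possible way to limit handovers caused by isolated misclassifications.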
The results of this research demonstrate the potential of emotion-aware IVR systems to anticipate and meet customer needs more effectively, paving the way for a new standard in user-centric banking services.
Publisher
Orclever Science and Research Group