Abstract
Tamil palm leaf manuscripts are invaluable repositories of cultural heritage, housing ancient wisdom that spans medical prescriptions and spiritual hymns. Deciphering the sentiments they convey, however, is complicated by their multimodal (textual and visual) and multilingual (Tamil and Sanskrit) nature. This study presents a Deep Learning-Based Cultural Emotion Analyzer (CEA-MMSA) for multimodal, multilingual sentiment analysis of Tamil and Sanskrit Siddha palm leaf manuscripts. Our approach leverages Vision Transformers (ViTs) for visual sentiment analysis and Gated Recurrent Units (GRUs) with attention mechanisms for textual sentiment analysis, enabling a nuanced understanding of emotional content. The proposed multimodal fusion model improves data interpretation by integrating textual and visual sentiments, addressing the linguistic intricacies of the manuscripts. Empirical results demonstrate the efficacy of the methodology, achieving an accuracy of 97.38%, a precision of 96.87%, a recall of 95.34%, and an F1 score of 95.37%. This advancement not only enriches the study and preservation of these manuscripts but also illuminates the emotional and cultural narratives they encapsulate.
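The abstract does not specify how the ViT and GRU branches are combined. As illustration only, a weighted late fusion of the two branches' per-class sentiment probabilities is one common design; the function names, class labels, and the fixed modality weight below are hypothetical, not the paper's actual implementation:

```python
# Hypothetical late-fusion sketch: each modality model (ViT for images,
# GRU+attention for text) is assumed to output a probability distribution
# over sentiment classes; fusion combines them with a modality weight.

SENTIMENTS = ["negative", "neutral", "positive"]  # illustrative label set

def fuse_sentiments(text_probs, visual_probs, text_weight=0.6):
    """Weighted late fusion of per-modality sentiment distributions.

    text_weight is a fixed illustrative value; in practice it could be
    learned jointly with the two branches.
    """
    assert len(text_probs) == len(visual_probs) == len(SENTIMENTS)
    fused = [text_weight * t + (1.0 - text_weight) * v
             for t, v in zip(text_probs, visual_probs)]
    total = sum(fused)  # renormalise so the result is a distribution
    return [p / total for p in fused]

def predict(text_probs, visual_probs):
    """Return the sentiment label with the highest fused probability."""
    fused = fuse_sentiments(text_probs, visual_probs)
    return SENTIMENTS[max(range(len(fused)), key=fused.__getitem__)]

# Example: the text branch leans positive, the visual branch neutral.
print(predict([0.1, 0.2, 0.7], [0.2, 0.5, 0.3]))  # → positive
```

With the default weight of 0.6, the text branch dominates, so the fused prediction follows the textual sentiment here; an attention-based fusion would instead compute the weight per sample.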