Affiliation:
1. Indiana University, USA
Abstract
This article proposes a multilabel poem topic classification algorithm that uses large language models and auxiliary data to address the lack of diverse metadata in digital poetry libraries. The study examines the potential of context‐dependent language models, specifically bidirectional encoder representations from transformers (BERT), for understanding poetic words, and of auxiliary data, such as author's notes, for supplementing poetry text. The experimental results demonstrate that the BERT‐based model outperforms the traditional support vector machine‐based model across all input types and datasets. We also show that incorporating notes as an additional input improves on the performance of the poem‐only model. Overall, the study suggests that pretrained context‐dependent language models and auxiliary data have the potential to enhance the accessibility of various poems within collections. This research can eventually help promote the discovery of underrepresented poems in digital libraries, even if they lack associated metadata, thus enhancing the understanding and appreciation of the literary form.
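The two ideas in the abstract, supplementing a poem with its author's note and assigning several topics to one poem, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the article: the topic list, the `build_input` and `predict_topics` helpers, the sentence-pair layout used to join poem and note, and the 0.5 threshold. The logits stand in for the scores a fine-tuned classifier would produce.

```python
import math

# Hypothetical topic labels; the article's actual label set is not given here.
TOPICS = ["nature", "love", "death"]


def build_input(poem, note=None):
    """Join a poem and an optional author's note in a BERT-style
    sentence-pair layout (an assumed way of combining the two inputs)."""
    if note:
        return f"[CLS] {poem} [SEP] {note} [SEP]"
    return f"[CLS] {poem} [SEP]"


def sigmoid(x):
    """Map a raw score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_topics(logits, threshold=0.5):
    """Multilabel decision: each topic is accepted or rejected
    independently, so a poem can receive zero, one, or several topics."""
    return [t for t, z in zip(TOPICS, logits) if sigmoid(z) >= threshold]


# Placeholder scores standing in for a model's per-topic outputs.
print(build_input("Leaves fall in silence", "Written after a long winter"))
print(predict_topics([2.0, -1.0, 0.3]))
```

Using an independent sigmoid per topic, rather than a softmax over all topics, is what makes the task multilabel: the scores do not compete, so accepting one topic does not rule out another.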
Subject
Library and Information Sciences, General Computer Science
Cited by
1 article.
1. A Comparative Analysis of Poetry Reading Audio: Singing, Narrating, or Somewhere in Between?;ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP);2024-04-14