On Finetuning Large Language Models

Published: 2023-11-28
Pages: 1-5
ISSN: 1047-1987
Container title: Political Analysis
Language: en
Short container title: Polit. Anal.

Author:

Wang Yu

Abstract

A recent paper by Häffner et al. (2023, Political Analysis 31, 481–499) introduces an interpretable deep learning approach for domain-specific dictionary creation and claims that the dictionary-based approach outperforms finetuned language models in predictive accuracy while retaining interpretability. We show that the dictionary-based approach’s reported superiority over large language models, BERT specifically, arises because most of the parameters in the language models are excluded from finetuning. In this letter, we first discuss the architecture of BERT models, then explain the limitations of finetuning only the top classification layer, and lastly report results in which finetuned language models outperform the newly proposed dictionary-based approach by 27% in terms of $R^2$ and 46% in terms of mean squared error once these parameters are allowed to learn during finetuning. Researchers interested in large language models, text classification, and text regression should find our results useful. Our code and data are publicly available.
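To give a rough sense of scale for the abstract's point that most parameters are excluded when only the top layer is finetuned, the sketch below counts parameters for an encoder plus a linear regression head. The configuration values are assumptions (the standard BERT-base sizes and a hypothetical one-output head), not figures taken from the letter, and the helper names are illustrative only.

```python
# Parameter counts for a BERT-base-style encoder with a linear regression head.
# ASSUMPTION: standard BERT-base sizes; the letter's abstract does not state them.
VOCAB, MAX_POS, TYPE_VOCAB = 30522, 512, 2
HIDDEN, LAYERS, INTERMEDIATE = 768, 12, 3072


def linear(n_in: int, n_out: int) -> int:
    """Weights plus biases of a dense layer."""
    return n_in * n_out + n_out


def layer_norm(n: int) -> int:
    """Scale and shift vectors of a LayerNorm."""
    return 2 * n


def encoder_params() -> int:
    """Embeddings, all transformer layers, and the pooler."""
    embeddings = (VOCAB + MAX_POS + TYPE_VOCAB) * HIDDEN + layer_norm(HIDDEN)
    # Query, key, value, and output projections, then a LayerNorm.
    attention = 4 * linear(HIDDEN, HIDDEN) + layer_norm(HIDDEN)
    # Two dense layers (expand, then project back), then a LayerNorm.
    feed_forward = (
        linear(HIDDEN, INTERMEDIATE)
        + linear(INTERMEDIATE, HIDDEN)
        + layer_norm(HIDDEN)
    )
    pooler = linear(HIDDEN, HIDDEN)
    return embeddings + LAYERS * (attention + feed_forward) + pooler


def head_params(n_outputs: int = 1) -> int:
    """Hypothetical top layer: one dense layer on the pooled output."""
    return linear(HIDDEN, n_outputs)


total = encoder_params() + head_params()
share = head_params() / total
print(f"encoder parameters: {encoder_params():,}")
print(f"head parameters:    {head_params():,}")
print(f"trainable share when only the head is finetuned: {share:.6%}")
```

Under these assumed sizes, training only the head updates a few hundred parameters out of roughly 109 million, so nearly all of the pretrained network stays fixed.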

Publisher

Cambridge University Press (CUP)

Subject

Political Science and International Relations, Sociology and Political Science

References: 15 articles.

1. Zhang, T., Wu, F., Katiyar, A., Weinberger, K. Q., and Artzi, Y. 2021. “Revisiting Few-Sample BERT Fine-Tuning.” ICLR.

2. Parameter-efficient fine-tuning of large-scale pre-trained language models

3. Dodge, J., Ilharco, G., Schwartz, R., Farhadi, A., Hajishirzi, H., and Smith, N. 2020. “Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping.” Preprint, arXiv:2002.06305.

4. Häffner et al. 2023. “Introducing an Interpretable Deep Learning Approach to Domain-Specific Dictionary Creation: A Use Case for Conflict Prediction.” Political Analysis 31: 481–499.

Cited by 4 articles.

1. Large language models for depression prediction;Proceedings of the National Academy of Sciences;2024-07-25

2. HELIOS Approach: Utilizing AI and LLM for Enhanced Homogeneity Identification in Real Estate Market Analysis;Applied Sciences;2024-07-15

3. Bridging prediction and theory: Introducing the Bayesian Partially-Protected Lasso;Electoral Studies;2024-02

4. Enhancing Zero-Shot Crypto Sentiment With Fine-Tuned Language Model and Prompt Engineering;IEEE Access;2024