Fine-tuning Strategies for Classifying Community-Engaged Research Studies Using Transformer-Based Models: Algorithm Development and Improvement Study

Author

Ferrell, Brian J.

Abstract

Background: Community-engaged research (CEnR) involves institutions of higher education collaborating with organizations in their communities to exchange resources and knowledge for the benefit of a community's well-being. While community engagement is a critical aspect of a university's mission, tracking and reporting CEnR metrics can be challenging, particularly in terms of external community relations and federally funded research programs. In this study, we aimed to develop a method for classifying CEnR studies submitted to our university's institutional review board (IRB) in order to capture the level of community involvement in research studies. Tracking studies in which communities are "highly engaged" enables institutions to obtain a more comprehensive understanding of the prevalence of CEnR.

Objective: We aimed to develop an updated experiment to classify CEnR and capture the distinct levels of involvement that a community partner has in the direction of a research study. To achieve this goal, we used a deep learning–based approach and evaluated the effectiveness of fine-tuning strategies on transformer-based models.

Methods: We used fine-tuning techniques such as discriminative learning rates and layer freezing to train and test 135 slightly modified classification models based on 3 transformer-based architectures: BERT (Bidirectional Encoder Representations from Transformers), Bio+ClinicalBERT, and XLM-RoBERTa. For the discriminative learning rate technique, we applied different learning rates to different layers of the model, with the aim of giving higher learning rates to layers that are more specialized to the task at hand. For the layer freezing technique, we compared models with different levels of layer freezing, starting with all layers frozen and gradually unfreezing different layer groups. We evaluated the performance of the trained models on a holdout data set to assess their generalizability.

Results: Of the models evaluated, Bio+ClinicalBERT performed particularly well, achieving an accuracy of 73.08% and an F1-score of 62.94% on the holdout data set. All the models trained in this study outperformed our previous models by 10%-23% in both F1-score and accuracy.

Conclusions: Our findings suggest that transfer learning is a viable method for tracking CEnR studies and provide evidence that fine-tuning strategies significantly improve transformer-based models. Our study also presents a tool for categorizing the type and volume of community engagement in research, which may be useful in addressing the challenges associated with reporting CEnR metrics.
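The abstract does not include the study's training code, so the following is a minimal sketch of the two fine-tuning techniques named in the Methods (layer freezing and discriminative learning rates), applied to a BERT-style classifier with PyTorch and Hugging Face Transformers. The checkpoint name, number of labels, freezing cutoff, base learning rate, and per-layer decay factor are all illustrative assumptions, not values taken from the study.

```python
import torch
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",  # assumed checkpoint; the study also used Bio+ClinicalBERT and XLM-RoBERTa
    num_labels=3,         # assumed number of CEnR engagement classes
)

# Layer freezing: freeze the embeddings and the lower encoder layers,
# leaving the upper layers and the classification head trainable.
# (For XLM-RoBERTa, the backbone attribute is model.roberta instead of model.bert.)
FREEZE_UP_TO = 8  # assumed cutoff; the study compared several freezing levels
for param in model.bert.embeddings.parameters():
    param.requires_grad = False
for layer in model.bert.encoder.layer[:FREEZE_UP_TO]:
    for param in layer.parameters():
        param.requires_grad = False

# Discriminative learning rates: give each trainable encoder layer a
# progressively higher learning rate the closer it sits to the task head,
# so task-specialized upper layers adapt faster than general lower layers.
BASE_LR, DECAY = 2e-5, 0.9  # assumed values
num_layers = len(model.bert.encoder.layer)
param_groups = []
for i, layer in enumerate(model.bert.encoder.layer[FREEZE_UP_TO:], start=FREEZE_UP_TO):
    lr = BASE_LR * DECAY ** (num_layers - 1 - i)  # lower layers get smaller rates
    param_groups.append({"params": layer.parameters(), "lr": lr})
param_groups.append({"params": model.classifier.parameters(), "lr": BASE_LR})

optimizer = torch.optim.AdamW(param_groups, lr=BASE_LR)
```

In this sketch the frozen layers are simply excluded from the optimizer, while each unfrozen layer trains at a fraction (DECAY) of the rate of the layer above it, which is one common way to realize a discriminative learning-rate schedule.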

Publisher

JMIR Publications Inc.

Subject

Health Informatics, Medicine (miscellaneous)


Cited by 2 articles.
