LT4SG@SMM4H’24: Tweets Classification for Digital Epidemiology of Childhood Health Outcomes Using Pre-Trained Language Models

Authors:

Athukoralage Dasun, Atapattu Thushari, Thilakaratne Menasha, Falkner Katrina

Abstract

This paper presents our approaches for the SMM4H’24 Shared Task 5 on the binary classification of English tweets reporting children’s medical disorders. Our first approach involves fine-tuning a single RoBERTa-large model, while the second approach entails ensembling the results of three fine-tuned BERTweet-large models. We demonstrate that although both approaches exhibit identical performance on validation data, the BERTweet-large ensemble excels on test data. Our best-performing system achieves an F1-score of 0.938 on test data, outperforming the benchmark classifier by 1.18%.
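The abstract does not say how the outputs of the three fine-tuned models are combined. As a minimal sketch, assuming hard majority voting over binary labels (an assumption, not necessarily the authors' method), the ensembling step and the F1-score for the positive class could look like the following. The function names and the toy labels are hypothetical and purely illustrative; they are not data from the shared task.

```python
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine binary labels from several models by hard majority voting.

    ASSUMPTION: the paper's ensembling rule is not given in the abstract;
    majority voting is used here only as a common, simple choice.
    """
    if not predictions:
        raise ValueError("need predictions from at least one model")
    n_models = len(predictions)
    n_items = len(predictions[0])
    if any(len(p) != n_items for p in predictions):
        raise ValueError("all models must label the same number of tweets")
    # A tweet is labelled positive (1) when more than half of the models vote 1.
    return [1 if 2 * sum(votes) > n_models else 0 for votes in zip(*predictions)]


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1-score for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy labels for six tweets (hypothetical, for illustration only).
    gold = [1, 0, 1, 1, 0, 0]
    model_a = [1, 0, 0, 1, 0, 1]
    model_b = [1, 1, 1, 1, 0, 0]
    model_c = [0, 0, 1, 1, 0, 0]

    ensemble = majority_vote([model_a, model_b, model_c])
    print("single model F1:", round(f1_score(gold, model_a), 3))
    print("ensemble F1:", round(f1_score(gold, ensemble), 3))
```

In this toy case each model makes different mistakes, so the vote cancels them out. This is the usual motivation for ensembling several fine-tuned runs of the same architecture, and it says nothing about the size of the gain on the real task data.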

Publisher

Cold Spring Harbor Laboratory
