Affiliation:
1. Department of English, The Chinese University of Hong Kong, Hong Kong SAR, People’s Republic of China
Abstract
This paper highlights a language- and sign-based computational solution to the problem of missing social metadata on Twitter (now ‘X’): demographic prediction using Deep Learning. It aims to apply this method to variationist sociolinguistic research, illustrating how the approach can facilitate analyses with missing metadata (i.e. stylistic age and sex/gender) by deriving this metadata solely from publicly available linguistic and semiotic resources on Twitter profiles (e.g. display pictures and biographies). I use my investigations of English tweets from the Philippines and Hong Kong as case examples, examining the extent to which the use of the copula and of the will-shall modals on social media is conditioned by diachronic factors as well as by factors internal and external to language (e.g. social factors). The results reveal the influence of stylistic gender and age, as well as other factors, on patterns of variation. They offer a glimpse into the nuanced sociolinguistic aspects of language use on social media and highlight the advantages of using Deep Learning to tackle data-related challenges. The findings and methodology may also inform other fields and practical applications beyond the study of language and society.
Publisher
Oxford University Press (OUP)
Cited by
3 articles.