Abstract
Objectives: To assess the feasibility of using a natural language processing (NLP) application to extract free-text mentions of online activity from the electronic health records (EHRs) of adolescent mental health patients.
Setting: The Clinical Records Interactive Search system allows detailed research based on deidentified EHRs from the South London and Maudsley NHS Foundation Trust, a large south London mental health trust providing secondary and tertiary mental healthcare.
Participants and methods: We developed a gazetteer of online activity terms and annotation guidelines from 5480 clinical notes of 200 adolescents (aged 11–17 years) receiving specialist mental healthcare. The preprocessing and manual curation of this real-world data set allowed development of a rule-based NLP application to automate identification of online activity (internet, social media, online gaming) mentions in EHRs. In a subset of the data, the context of each mention was also recorded manually as supportive, detrimental or neutral for additional analysis.
Results: The NLP application identified online activity mentions with good precision (0.97) and recall (0.94). Preliminary analyses found that 34% of online activity mentions were documented within a context considered supportive for the young person, 38% detrimental and 28% neutral.
Conclusion: Our results provide an important example of a rule-based NLP methodology that accurately identifies online activity recorded in EHRs, enabling researchers to investigate associations with a range of adolescent mental health outcomes.
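As a rough illustration of the approach summarised in the abstract, the sketch below shows how a gazetteer-driven, rule-based matcher and its precision/recall evaluation might look in Python. It is not the study's application: the gazetteer terms, the function names `find_mentions` and `precision_recall`, and the use of regular expressions are all assumptions made for illustration, and the study's actual term list and rules are not reproduced here.

```python
import re

# Hypothetical gazetteer: illustrative terms only, NOT the study's actual list.
GAZETTEER = {
    "internet": ["internet", "online", "website"],
    "social media": ["social media", "facebook", "instagram", "snapchat"],
    "online gaming": ["online gaming", "online game", "xbox live"],
}

# Reverse lookup from each term to its category.
TERM_TO_CATEGORY = {
    term: category for category, terms in GAZETTEER.items() for term in terms
}

# Longest terms first, so "online gaming" is preferred over the shorter "online".
PATTERN = re.compile(
    r"\b("
    + "|".join(
        re.escape(term)
        for term in sorted(TERM_TO_CATEGORY, key=len, reverse=True)
    )
    + r")\b",
    re.IGNORECASE,
)


def find_mentions(note):
    """Return (start, end, category) for each gazetteer term found in a note."""
    return [
        (m.start(), m.end(), TERM_TO_CATEGORY[m.group(1).lower()])
        for m in PATTERN.finditer(note)
    ]


def precision_recall(predicted, gold):
    """Compare predicted mentions with manually annotated (gold) mentions."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall
```

In this sketch a predicted mention counts as correct only when its span and category exactly match a manual annotation; a real evaluation might also credit partial overlaps, which is a design choice the abstract does not specify.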
Funder
Medical Research Council
NIHR Maudsley Biomedical Research Centre
National Institute for Health Research