Abstract
Objective: To develop and validate a novel application of natural language processing (NLP) techniques to measure pediatrician adherence to evidence-based guidelines in the treatment of young children with attention-deficit/hyperactivity disorder (ADHD).
Materials and Methods: We extracted structured and free-text data from electronic health records of all office visits (2015-2019) of children aged 4-6 years seen in a community-based primary healthcare network in California who had ≥1 visit with an ICD-10 diagnosis of ADHD. Two pediatricians manually annotated clinical notes of the first ADHD visit for 423 patients. Inter-annotator agreement was assessed for recommendation of first-line behavioral treatment; disagreements were reconciled. A BioClinical Bidirectional Encoder Representations from Transformers (BioClinical-BERT) model was used to identify mentions of behavioral treatment recommendations using a 70/30 train/test split. Following an error analysis and threshold selection, we completed external (temporal) validation by deploying the model on 1,020 unannotated notes representing other ADHD visits and well-care visits; all positively classified notes and 5% of negatively classified notes were annotated.
Results: Of 423 included patients, 313 (74%) were male, 268 (63%) were privately insured, 138 (33%) were white, and 61 (14%) were Hispanic. The BERT model of first ADHD visits achieved F1=0.78, precision=0.84, and recall=0.72. Following threshold selection, temporal validation on notes from other visits achieved F1=0.80, recall=0.92, and precision=0.7.
Conclusion: Deploying a machine learning algorithm on a large and variable set of clinical notes accurately captured pediatrician adherence to guidelines in the treatment of children with ADHD. This approach can be used to measure quality of care at scale and improve clinical care for various chronic medical conditions.
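The reported F1 scores can be checked against the reported precision and recall, since F1 is the harmonic mean of the two. The sketch below is illustrative only: the helper name `f1_score` and the dictionary layout are assumptions made for this example, not the authors' code, and because the published precision and recall are themselves rounded, it is a consistency check rather than a reproduction of the analysis.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (hypothetical helper)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both metrics are zero
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1), copied from the abstract.
reported = {
    "first ADHD visits (test split)": (0.84, 0.72, 0.78),
    "temporal validation": (0.70, 0.92, 0.80),
}

for name, (precision, recall, f1_reported) in reported.items():
    f1 = f1_score(precision, recall)
    print(f"{name}: computed F1 = {f1:.2f}, reported F1 = {f1_reported:.2f}")
```

Both computed values agree with the reported F1 scores to two decimal places.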
Publisher
Cold Spring Harbor Laboratory