Abstract
Objective: This study aims to (1) build and validate model-based case definitions for multiple sclerosis (MS) that use trends (ie, trend-based case definitions) and (2) apply dynamic classification to identify the average number of data years needed for classification (ie, average trend needed).
Design: Retrospective cohort study.
Participants: 608 MS cases and 59 620 MS non-cases.
Setting: Data from 1 April 2004 to 31 March 2022 were obtained from the Manitoba Population Research Data Repository. MS case status was ascertained from homecare records and linked to health data. Trend-based case definitions were constructed using multivariate generalised linear mixed models applied to annual numbers of general and specialist physician visits, hospitalisations and MS healthcare contacts or medication dispensations. Dynamic classification, which ascertains cases and non-cases annually, was used to estimate mean classification time. Classification accuracy measures, including sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), proportion correctly classified (PCC) and F1-score, were compared between the trend-based case definitions and a deterministic case definition of 3+ MS healthcare contacts or medication dispensations.
Results: When applied to the full study period, estimates of the classification accuracy measures exceeded 0.90 for all case definitions, except sensitivity and PPV for the trend-based dynamic case definition (0.88 and 0.64, respectively). PCC was high for all case definitions (0.94–0.99); F1-scores were lower for the trend-based case definitions than for the deterministic case definition (0.74–0.93 vs 0.96). Dynamic classification identified 5 years as the average trend needed. When applied to the average trend windows, accuracy estimates for the trend-based case definitions were lower than the estimates from the full study period (sensitivity: 0.77–0.89; specificity: 0.90–0.97; PPV: 0.54–0.81; NPV: 0.97–0.99; F1-score: 0.64–0.84). Accuracy estimates for the deterministic case definition remained high, except sensitivity (0.42–0.80); its F1-score was variable (0.59–0.89).
Conclusions: Classifications from the trend-based and deterministic case definitions were similar to a population-based clinician assessment reference standard on multiple measures of classification accuracy. However, accuracy estimates for both trend-based and deterministic case definitions varied as the number of years of data used for classification was reduced. Dynamic classification appears to be a viable option for identifying the average trend needed for trend-based case definitions.
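For readers less familiar with the accuracy measures reported above, the following is a minimal sketch of how they are conventionally computed from confusion-matrix counts. It is not the study's analysis code. The names `ConfusionCounts`, `accuracy_measures` and `f1_from` are illustrative assumptions, and the sketch assumes every denominator is non-zero. The only figures taken from the abstract are the sensitivity (0.88) and PPV (0.64) of the trend-based dynamic case definition, which together give an F1-score of about 0.74, the lower end of the range reported for the trend-based case definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of classifications against the reference standard (hypothetical container)."""
    tp: int  # cases classified as cases
    fp: int  # non-cases classified as cases
    tn: int  # non-cases classified as non-cases
    fn: int  # cases classified as non-cases


def f1_from(ppv: float, sensitivity: float) -> float:
    """F1-score: harmonic mean of PPV (precision) and sensitivity (recall)."""
    return 2 * ppv * sensitivity / (ppv + sensitivity)


def accuracy_measures(c: ConfusionCounts) -> dict[str, float]:
    """Standard classification accuracy measures; assumes non-zero denominators."""
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    ppv = c.tp / (c.tp + c.fp)
    npv = c.tn / (c.tn + c.fn)
    pcc = (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "pcc": pcc,
        "f1": f1_from(ppv, sensitivity),
    }


# F1 implied by the sensitivity and PPV reported for the
# trend-based dynamic case definition over the full study period.
f1_dynamic = f1_from(ppv=0.64, sensitivity=0.88)
```

Because F1 combines only PPV and sensitivity, a case definition can score well on PCC, which is dominated by the much larger non-case group, while its F1-score remains noticeably lower.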
Funder
Natural Sciences and Engineering Research Council of Canada
Research Manitoba
Waugh Family Chair in Multiple Sclerosis
Visual and Automated Disease Analytics Trainee Program
Canadian Institutes of Health Research