BACKGROUND
Despite significant time spent on billing, family physicians routinely make errors and miss billing opportunities. In other disciplines, machine learning models have predicted Current Procedural Terminology (CPT) codes with high accuracy.
OBJECTIVE
Our objective was to derive machine learning models capable of predicting diagnosis and billing codes from notes recorded in the electronic medical record.
METHODS
We conducted a retrospective algorithm development and validation study in an academic family medicine practice. Visits between July 1, 2015, and June 30, 2020, that had both a physician-authored note and an invoice in the electronic medical record were eligible for inclusion. We trained two deep learning models and compared their predictions to the codes submitted for reimbursement. We calculated accuracy, recall, precision, F1 score, and area under the receiver operating characteristic curve.
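As an illustration of how the threshold-based metrics named above can be computed when each visit may carry several codes, the following is a minimal sketch and not the study's evaluation code. It assumes a micro-averaged, multi-label setup in which labels are stored as a 0/1 matrix (one row per visit, one column per code); the abstract does not state the averaging scheme, and the function name `micro_metrics` and the toy data are hypothetical. Area under the receiver operating characteristic curve is omitted because it requires the models' continuous scores rather than binary predictions.

```python
from typing import Dict, Sequence


def micro_metrics(
    y_true: Sequence[Sequence[int]],
    y_pred: Sequence[Sequence[int]],
) -> Dict[str, float]:
    """Micro-averaged metrics over every (visit, code) cell of a 0/1 matrix.

    Assumption: rows are visits, columns are candidate codes, and a 1 means
    the code was submitted (y_true) or predicted (y_pred) for that visit.
    """
    tp = fp = fn = tn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1  # code submitted and predicted
            elif t == 0 and p == 1:
                fp += 1  # code predicted but not submitted
            elif t == 1 and p == 0:
                fn += 1  # code submitted but missed
            else:
                tn += 1  # code neither submitted nor predicted

    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / total if total else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical toy data: 3 visits, 4 candidate codes.
toy_true = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
toy_pred = [[1, 0, 0, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
toy_result = micro_metrics(toy_true, toy_pred)
```

On this toy input the counts are 2 true positives, 1 false positive, 1 false negative, and 8 true negatives, so precision, recall, and F1 are each 2/3 while accuracy is 10/12.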
RESULTS
A total of 245,045 visits were eligible for inclusion, of which 198,802 (81%) were included in model development. For the diagnosis and billing code models, respectively, accuracy was 99.7% and 99.6%, recall was 66.4% and 70.2%, and precision was 38.0% and 72.8%. Area under the curve was 0.988 for the diagnosis code model and 0.933 for the billing code model.
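Accuracy near 100% alongside much lower precision and recall is a common pattern when each visit carries only a few codes out of a large candidate set, because true negatives dominate the accuracy calculation. The sketch below illustrates that arithmetic for a single visit. Every number in it is hypothetical and chosen only for illustration; none is taken from the study, and the function name `sparse_label_metrics` is an assumption.

```python
from typing import Dict


def sparse_label_metrics(
    n_codes: int,
    true_per_visit: int,
    predicted_per_visit: int,
    correct_per_visit: int,
) -> Dict[str, float]:
    """Per-visit accuracy, precision, and recall from simple counts.

    All arguments are hypothetical illustration values:
    n_codes             size of the candidate code set
    true_per_visit      codes actually submitted for the visit
    predicted_per_visit codes the model predicted for the visit
    correct_per_visit   predicted codes that were also submitted
    """
    if true_per_visit <= 0 or predicted_per_visit <= 0:
        raise ValueError("need at least one true and one predicted code")
    if correct_per_visit > min(true_per_visit, predicted_per_visit):
        raise ValueError("correct codes cannot exceed true or predicted codes")

    tp = correct_per_visit
    fp = predicted_per_visit - tp
    fn = true_per_visit - tp
    tn = n_codes - tp - fp - fn  # every other candidate code
    if tn < 0:
        raise ValueError("n_codes is too small for the given counts")

    return {
        "accuracy": (tp + tn) / n_codes,
        "precision": tp / predicted_per_visit,
        "recall": tp / true_per_visit,
    }


# Hypothetical visit: 1,000 candidate codes, 2 submitted, 4 predicted, 1 correct.
example = sparse_label_metrics(
    n_codes=1000,
    true_per_visit=2,
    predicted_per_visit=4,
    correct_per_visit=1,
)
```

In this hypothetical case, 995 of the 1,000 candidate codes are true negatives, so accuracy is 0.996 even though precision is 0.25 and recall is 0.5. This is why precision and recall are more informative than accuracy for comparing models in this setting.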
CONCLUSIONS
We developed models capable of predicting diagnosis and billing codes from electronic medical record notes for family medicine visits. The billing model outperformed the diagnosis model on recall and precision, likely because fewer codes were being predicted. Work is underway to further improve model performance and to assess how well these models generalize to other family medicine practices.