Abstract
Background
Radiolucent bone lesions are encountered in all orthopedic specialties, and concise description is essential to inform evaluation and treatment. We studied the interobserver reliability and intra-observer reproducibility of three systems for classifying radiolucent lesions on radiographs: (1) the original Lodwick classification, (2) the modified Lodwick classification, and (3) the Enneking classification for benign tumors. We hypothesized that intra-observer reproducibility would be good but interobserver reliability would be poor, that reliability would improve with training level, and that agreement would be highest for the Enneking classification.
Methods
Forty-eight case sets of de-identified radiographs of radiolucent osseous lesions were selected from an orthopedic oncology practice. Each set included two orthogonal views of the lesion from initial presentation. Twenty participants (one third-year medical student, 18 residents, one orthopedic oncologist) classified each case twice, with a minimum two-week gap between sessions, according to the Lodwick classification, modified Lodwick classification, and Enneking classification. Interobserver reliability and intra-observer reproducibility were calculated using Fleiss’ kappa and Krippendorff’s alpha, treating the classifications as nominal and ordinal rankings, respectively. Linear regression models were used to determine the effect of training level on reproducibility. Contingency tables were used to assess the accuracy of correctly identifying benign versus malignant lesions against their known diagnoses.
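The interobserver statistic described above can be illustrated with a minimal sketch. This is not the authors' analysis code; the function and the example grade labels are hypothetical, but the formula is the standard Fleiss' kappa for multiple raters assigning nominal categories:

```python
from collections import Counter

def fleiss_kappa(ratings):
    """Fleiss' kappa for nominal categories.

    ratings: one list per case, containing one category label per rater.
    Every case must be rated by the same number of raters.
    """
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    categories = sorted({label for case in ratings for label in case})

    # n_ij: number of raters assigning case i to category j
    counts = [[Counter(case)[cat] for cat in categories] for case in ratings]

    # Per-case agreement P_i, then overall observed agreement P_bar
    p_i = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
           for row in counts]
    p_bar = sum(p_i) / n_cases

    # Chance agreement P_e from the marginal category proportions
    p_j = [sum(row[j] for row in counts) / (n_cases * n_raters)
           for j in range(len(categories))]
    p_e = sum(p * p for p in p_j)

    return (p_bar - p_e) / (1 - p_e)

# Hypothetical example: four raters grading three cases on Lodwick grades
ratings = [["IA", "IA", "IA", "IA"],
           ["IB", "IB", "IB", "IB"],
           ["IC", "IC", "IB", "IC"]]
kappa = fleiss_kappa(ratings)
```

In practice one would use a vetted implementation (e.g. `statsmodels.stats.inter_rater.fleiss_kappa`); Krippendorff's alpha additionally weights disagreements by the ordinal distance between grades, which is why the Methods report both statistics.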
Results
Interobserver reliability was poor: agreement was 39% (κ = 0.23; α = 0.54), 39% (κ = 0.25; α = 0.48), and 53% (κ = 0.28; α = 0.45) for the Lodwick, modified Lodwick, and Enneking classifications, respectively. Intra-observer reproducibility also lacked strong agreement (κ = 0.42–0.45), with Krippendorff’s alpha values of 0.72, 0.69, and 0.63 for the Lodwick, modified Lodwick, and Enneking classifications, respectively; individual self-agreement ranged from 39% to 78%. Training level had no effect on reproducibility (R2 < 0.2, p > 0.05 for all classifications). Lesions were correctly classified as malignant in 73.3%, 59.0%, and 62% of cases for the three classification systems, respectively.
Conclusions
Our data demonstrate that three common classification systems for osseous radiolucent lesions are neither reliable nor reproducible. Consistency of classification varied with lesion characteristics, and reproducibility was strongest for the highest and lowest grades of each system. There was no association between orthopedic experience and intra-observer reproducibility. These deficiencies might be mitigated by artificial intelligence applications.