Author:
Manatunga Amita K, Binongo José Nilo G, Taylor Andrew T
Abstract
Background
The accuracy of computer-aided diagnosis (CAD) software is best evaluated by comparison to a gold standard that represents the true status of disease. In many settings, however, the true status of disease cannot be known, and accuracy is instead evaluated against the interpretations of an expert panel. Common statistical approaches to evaluating accuracy include receiver operating characteristic (ROC) and kappa analysis, but both methods have significant limitations and cannot answer the question of equivalence: Is the CAD performance equivalent to that of an expert? The goal of this study is to show the strength of log-linear analysis over standard ROC and kappa statistics in evaluating the accuracy of computer-aided diagnosis of renal obstruction compared to the diagnosis provided by expert readers.
Methods
Log-linear modeling was used to reanalyze a previously published database that had been evaluated with ROC and kappa statistics. That database compared diuresis renography scan interpretations (non-obstructed, equivocal, or obstructed) generated by a renal expert system (RENEX) in 185 kidneys (95 patients) with the independent and consensus scan interpretations of three experts. The experts were blinded to clinical information and prospectively and independently graded each kidney as obstructed, equivocal, or non-obstructed.
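For readers unfamiliar with the kappa statistic mentioned above, the following is a minimal sketch of unweighted Cohen's kappa for two raters grading the same kidneys into the three categories. The function name `cohen_kappa` and the counts in `example_table` are hypothetical and chosen only for illustration; they are not the study data, and the abstract does not state which kappa variant the original analysis used.

```python
from typing import List

# Category order assumed for rows (rater A) and columns (rater B).
CATEGORIES = ["non-obstructed", "equivocal", "obstructed"]


def cohen_kappa(table: List[List[float]]) -> float:
    """Unweighted Cohen's kappa for a square two-rater contingency table."""
    k = len(table)
    n = sum(sum(row) for row in table)
    # Observed proportion of exact agreement (diagonal cells).
    observed = sum(table[i][i] for i in range(k)) / n
    # Agreement expected by chance from the two raters' marginal totals.
    row_tot = [sum(table[i]) for i in range(k)]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    expected = sum(row_tot[i] * col_tot[i] for i in range(k)) / (n * n)
    return (observed - expected) / (1.0 - expected)


# Hypothetical counts for illustration only (NOT the RENEX data).
example_table = [
    [40, 5, 2],
    [6, 10, 4],
    [1, 5, 27],
]

kappa = cohen_kappa(example_table)
print(f"kappa = {kappa:.3f}")
```

Kappa condenses the whole table into one number, which is one reason it cannot by itself say whether agreement differs by category or whether one rater is interchangeable with another.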
Results
Log-linear modeling showed that RENEX and the expert consensus had beyond-chance agreement in both non-obstructed and obstructed readings (both p < 0.0001). Moreover, pairwise agreement between experts and pairwise agreement between each expert and RENEX were not significantly different (p = 0.41, 0.95, 0.81 for the non-obstructed, equivocal, and obstructed categories, respectively). Similarly, the three-way agreement of the three experts and the three-way agreement of two experts and RENEX were not significantly different for the non-obstructed (p = 0.79) and obstructed (p = 0.49) categories.
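To make the idea of category-specific, beyond-chance agreement concrete, the sketch below fits one common log-linear agreement model for two raters: independence plus a separate diagonal parameter for each category (a quasi-independence model). This is an assumed model form for illustration; the abstract does not give the authors' exact parameterization, and the multi-rater comparisons they report would need a richer model. All function names and the counts are hypothetical, and the fit assumes every diagonal cell and every off-diagonal row and column total is positive.

```python
import math
from typing import Dict, List


def fit_quasi_independence(table: List[List[float]], max_iter: int = 1000,
                           tol: float = 1e-12) -> List[float]:
    """Return a_i * b_i, the chance-expected diagonal counts, where the
    off-diagonal cells are fitted as a_i * b_j by iterative proportional
    fitting to the off-diagonal row and column totals."""
    k = len(table)
    row_off = [sum(table[i][j] for j in range(k) if j != i) for i in range(k)]
    col_off = [sum(table[i][j] for i in range(k) if i != j) for j in range(k)]
    a, b = [1.0] * k, [1.0] * k
    for _ in range(max_iter):
        new_a = [row_off[i] / sum(b[j] for j in range(k) if j != i)
                 for i in range(k)]
        new_b = [col_off[j] / sum(new_a[i] for i in range(k) if i != j)
                 for j in range(k)]
        change = max(abs(x - y) for x, y in zip(a + b, new_a + new_b))
        a, b = new_a, new_b
        if change < tol:
            break
    return [a[i] * b[i] for i in range(k)]


def agreement_summary(table: List[List[float]]) -> Dict[str, object]:
    """Diagonal log-linear parameters and likelihood-ratio statistics."""
    k = len(table)
    n = sum(sum(row) for row in table)
    rows = [sum(table[i]) for i in range(k)]
    cols = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # G^2 for plain independence: fitted count is row * col / n.
    g2_indep = 2.0 * sum(
        table[i][j] * math.log(table[i][j] * n / (rows[i] * cols[j]))
        for i in range(k) for j in range(k) if table[i][j] > 0)
    chance_diag = fit_quasi_independence(table)
    # delta_i > 0 suggests agreement beyond chance in category i.
    delta = [math.log(table[i][i] / chance_diag[i]) for i in range(k)]
    # Recover off-diagonal fitted counts by refitting margins is not needed:
    # the drop in G^2 is reported relative to independence only.
    return {"delta": delta, "g2_independence": g2_indep}


# Hypothetical counts for illustration only (NOT the RENEX data).
example_table = [
    [40, 5, 2],
    [6, 10, 4],
    [1, 5, 27],
]
summary = agreement_summary(example_table)
print(summary)
```

Unlike a single kappa value, the per-category `delta` terms can be tested and compared across rater pairs, which is the kind of question (is expert-versus-RENEX agreement different from expert-versus-expert agreement?) that the results above address.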
Conclusion
Log-linear modeling showed that RENEX was equivalent to any expert in rating kidneys, particularly in the obstructed and non-obstructed categories. This conclusion, which could not be derived from the original ROC and kappa analysis, emphasizes and illustrates the role and importance of log-linear modeling in the absence of a gold standard. The log-linear analysis also provides additional evidence that RENEX has the potential to assist in the interpretation of diuresis renography studies.
Publisher
Springer Science and Business Media LLC
Subject
Radiology, Nuclear Medicine and Imaging
Cited by
5 articles.