Author:
Fu Chao, Cui Yiyang, Li Jing, Yu Jing, Wang Yan, Si Caifeng, Cui Kefei
Abstract
Objective: To evaluate whether the categorization method used by a risk stratification system (RSS) is a decisive factor in its diagnostic performance and unnecessary fine-needle aspiration (FNA) rate, in order to choose the optimal RSS for the management of thyroid nodules.

Methods: From July 2013 to January 2019, 2667 patients with 3944 thyroid nodules underwent pathological diagnosis after thyroidectomy and/or ultrasound (US)-guided FNA. US categories were assigned according to six RSSs. Diagnostic performance and unnecessary FNA rates were calculated and compared in two ways: according to the US-based final assessment categories, and according to the unified size thresholds for biopsy proposed by ACR-TIRADS.

Results: A total of 1781 (45.2%) thyroid nodules were diagnosed as malignant after thyroidectomy or biopsy. EU-TIRADS had the lowest specificity and accuracy and the highest unnecessary FNA rate, both for US categories (47.9%, 70.2%, and 39.4%, respectively; all P < 0.05) and for FNA indications (54.2%, 50.0%, and 55.4%, respectively; all P < 0.05). For US-based final assessment categories, accuracy was similar among AI-TIRADS, Kwak-TIRADS, C-TIRADS, and the ATA guidelines (78.0%, 77.8%, 77.9%, and 76.3%, respectively; all P > 0.05). The lowest unnecessary FNA rate was seen with C-TIRADS (30.9%), which did not differ significantly from those of AI-TIRADS, Kwak-TIRADS, and the ATA guidelines (31.5%, 31.7%, and 33.6%, respectively; all P > 0.05). For US-FNA indications, accuracy was similar among ACR-TIRADS, Kwak-TIRADS, C-TIRADS, and the ATA guidelines (58.0%, 59.7%, 58.7%, and 57.1%, respectively; all P > 0.05). The highest accuracy and lowest unnecessary FNA rate were seen with AI-TIRADS (61.9% and 38.6%), which did not differ significantly from those of Kwak-TIRADS (59.7% and 42.9%) and C-TIRADS (58.7% and 43.9%; all P > 0.05).

Conclusion: The different US categorization methods used by the RSSs were not decisive factors in diagnostic performance or unnecessary FNA rate. For daily clinical practice, the score-based counting RSS was an optimal choice.
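
The metrics compared in the Results can be illustrated with a minimal sketch. The abstract does not state how the unnecessary FNA rate was defined, so the definition below (benign nodules among those indicated for FNA) is an assumption, as are the `Counts` container and the example counts, which are hypothetical and not data from the study. Only the 1781/3944 malignancy proportion comes from the abstract.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Hypothetical confusion-matrix counts for one RSS (not study data)."""
    tp: int  # malignant nodules classified positive / indicated for FNA
    fp: int  # benign nodules classified positive / indicated for FNA
    tn: int  # benign nodules classified negative / not indicated
    fn: int  # malignant nodules classified negative / not indicated


def sensitivity(c: Counts) -> float:
    return c.tp / (c.tp + c.fn)


def specificity(c: Counts) -> float:
    return c.tn / (c.tn + c.fp)


def accuracy(c: Counts) -> float:
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def unnecessary_fna_rate(c: Counts) -> float:
    # ASSUMPTION: share of FNA-indicated nodules that turn out benign.
    # The abstract does not give the definition the authors used.
    return c.fp / (c.tp + c.fp)


def malignancy_percent(malignant: int, total: int) -> float:
    """Percentage of malignant nodules, rounded to one decimal place."""
    return round(100 * malignant / total, 1)


if __name__ == "__main__":
    # Figures reported in the abstract: 1781 malignant of 3944 nodules.
    print(malignancy_percent(1781, 3944))

    # Hypothetical counts, for illustration only.
    example = Counts(tp=80, fp=30, tn=70, fn=20)
    print(sensitivity(example), specificity(example),
          accuracy(example), unnecessary_fna_rate(example))
```

Under this assumed definition, an RSS that flags more benign nodules for biopsy has both lower specificity and a higher unnecessary FNA rate, which is the pattern the abstract reports for EU-TIRADS.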