Abstract
The total Sharp-van der Heijde score (TSS) is crucial for assessing the severity of joint damage in rheumatoid arthritis (RA). Manual scoring is time-consuming and subjective, which leads to variability between and within readers. This study introduces an Automated Radiographic Sharp Scoring (ARTSS) framework that uses deep learning to analyze full-hand X-ray images, with the aim of reducing inter- and intra-observer variability. A key innovation is its ability to handle patients with joint disappearance and variable-length image sequences. The framework has four stages: image pre-processing with ResNet50, hand segmentation using UNet, joint identification via YOLOv7, and TSS prediction using models including VGG16, VGG19, ResNet50, DenseNet201, EfficientNetB0, and Vision Transformer (ViT). Evaluation metrics included Intersection over Union (IoU), Mean Average Precision (MAP), Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and Huber loss. Training used 3-fold cross-validation with 970 patients, and external testing included 291 subjects. The joint identification model achieved 99% accuracy, and the ViT model achieved a Huber loss of 0.87 for TSS prediction. Because ARTSS addresses joint disappearance and variable joint numbers, it generalizes better and is more applicable to clinical practice. This approach saves time, reduces inter- and intra-reader variability, improves radiologist accuracy, and helps rheumatologists make more personalized treatment decisions.
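For readers unfamiliar with the evaluation metrics named above, the following is a minimal sketch of how IoU, MAE, RMSE, and Huber loss are commonly computed. It is not the study's implementation: the function names, the axis-aligned box format `(x1, y1, x2, y2)`, the Huber threshold `delta=1.0`, and the sample scores are all assumptions made for illustration, and none of the numbers are data from this work.

```python
import math

def iou(box_a, box_b):
    """Intersection over Union for two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def mae(y_true, y_pred):
    """Mean Absolute Error between reference and predicted scores."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root Mean Squared Error; penalizes large errors more than MAE."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def huber(y_true, y_pred, delta=1.0):
    """Mean Huber loss: quadratic for |error| <= delta, linear beyond it.

    delta=1.0 is an assumed threshold; the abstract does not state one.
    """
    total = 0.0
    for t, p in zip(y_true, y_pred):
        r = abs(t - p)
        total += 0.5 * r * r if r <= delta else delta * (r - 0.5 * delta)
    return total / len(y_true)

# Hypothetical reference and predicted scores (illustrative only).
reference = [10.0, 20.0, 30.0]
predicted = [10.5, 18.0, 33.0]

print("MAE:  ", mae(reference, predicted))
print("RMSE: ", rmse(reference, predicted))
print("Huber:", huber(reference, predicted))
print("IoU:  ", iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

Huber loss behaves like a squared error for small residuals and like an absolute error for large ones, so it is less sensitive to outlying scores than RMSE, which may be one reason to report it alongside MAE and RMSE.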