Author:
Reddy C. Kishor Kumar,Reddy Pulakurthi Anaghaa,Janapati Himaja,Assiri Basem,Shuaib Mohammed,Alam Shadab,Sheneamer Abdullah
Abstract
Brain tumors occur due to the expansion of abnormal cell tissues and can be malignant (cancerous) or benign (not cancerous). Numerous factors such as the position, size, and progression rate are considered while detecting and diagnosing brain tumors. Detecting brain tumors in their initial phases is vital for diagnosis where MRI (magnetic resonance imaging) scans play an important role. Over the years, deep learning models have been extensively used for medical image processing. The current study primarily investigates the novel Fine-Tuned Vision Transformer models (FTVTs)—FTVT-b16, FTVT-b32, FTVT-l16, FTVT-l32—for brain tumor classification, while also comparing them with other established deep learning models such as ResNet50, MobileNet-V2, and EfficientNet - B0. A dataset with 7,023 images (MRI scans) categorized into four different classes, namely, glioma, meningioma, pituitary, and no tumor are used for classification. Further, the study presents a comparative analysis of these models including their accuracies and other evaluation metrics including recall, precision, and F1-score across each class. The deep learning models ResNet-50, EfficientNet-B0, and MobileNet-V2 obtained an accuracy of 96.5%, 95.1%, and 94.9%, respectively. Among all the FTVT models, FTVT-l16 model achieved a remarkable accuracy of 98.70% whereas other FTVT models FTVT-b16, FTVT-b32, and FTVT-132 achieved an accuracy of 98.09%, 96.87%, 98.62%, respectively, hence proving the efficacy and robustness of FTVT’s in medical image processing.