BACKGROUND
Although ChatGPT is user-friendly and widely accessible, its use by patients with suspected skin malignancies raises concerns about delayed diagnosis and false reassurance.
OBJECTIVE
This study aims to assess the accuracy of AI, specifically ChatGPT, in diagnosing skin malignancies and in conveying the urgency of seeking medical advice.
METHODS
This diagnostic accuracy study assesses the agreement between dermatologists' final diagnoses and the diagnoses provided by ChatGPT when patients describe their own lesions. Thirty-five patients with suspected skin cancer (squamous cell carcinoma [SCC] or basal cell carcinoma [BCC]) provided demographic details and lesion descriptions. These were entered into ChatGPT3.5 and ChatGPT4.0, and the diagnoses returned by each model were recorded for analysis.
RESULTS
All 35 lesions that the dermatologist suspected to be malignant were malignant, corresponding to 100% accuracy. ChatGPT3.5 flagged malignancy in 7 of the 35 cases (20%), and ChatGPT4.0 did so in 6 cases (17.14%). Agreement between the two models was poor: only 7 lesions received the same diagnosis from both. However, both ChatGPT3.5 and ChatGPT4.0 referred patients to a dermatologist in all cases.
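As a check on the arithmetic, the reported percentages can be reproduced from the counts given above. The following is a minimal sketch, not the authors' analysis code; the function and variable names are illustrative assumptions. Because all 35 lesions were malignant, each model's flagging rate can also be read as its sensitivity in this sample.

```python
# Counts are taken from the RESULTS paragraph; all names here are
# hypothetical and chosen only for illustration.
TOTAL_LESIONS = 35  # every lesion was malignant


def percent(count, total):
    """Return count/total as a percentage, rounded to two decimals."""
    return round(100 * count / total, 2)


# Number of lesions each model flagged as malignant.
flagged = {"ChatGPT3.5": 7, "ChatGPT4.0": 6}

# Share of malignant lesions flagged by each model.
rates = {model: percent(n, TOTAL_LESIONS) for model, n in flagged.items()}

# Both models referred every patient to a dermatologist.
referral_rate = percent(35, TOTAL_LESIONS)

# Lesions that received the same diagnosis from both models.
same_diagnosis_rate = percent(7, TOTAL_LESIONS)

for model, rate in rates.items():
    print(f"{model}: {rate}% of malignant lesions flagged")
print(f"Referral rate: {referral_rate}%")
print(f"Same diagnosis from both models: {same_diagnosis_rate}%")
```

Rounding to two decimals matches the precision reported in the text for ChatGPT4.0.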
CONCLUSIONS
The two GPT models performed comparably to each other but were significantly inferior to dermatologists. However, neither model caused a delay in referral to a dermatologist.
The limitations of these two models include poor diagnostic accuracy, lack of concordance with each other, and poor reproducibility of their answers.