FFA-GPT: an automated pipeline for fundus fluorescein angiography interpretation and question-answer

Published:2024-05-03 Issue:1 Volume:7
ISSN:2398-6352
Container-title:npj Digital Medicine
Language:en
Short-container-title:npj Digit. Med.

Author:

Chen Xiaolan, Zhang Weiyi, Xu Pusheng, Zhao Ziwei, Zheng Yingfeng, Shi Danli, He Mingguang

Abstract

Fundus fluorescein angiography (FFA) is a crucial diagnostic tool for chorioretinal diseases, but its interpretation requires significant expertise and time. Prior studies have used Artificial Intelligence (AI)-based systems to assist FFA interpretation, but these systems lack user interaction and comprehensive evaluation by ophthalmologists. Here, we used large language models (LLMs) to develop an automated interpretation pipeline for both report generation and medical question-answering (QA) for FFA images. The pipeline comprises two parts: an image-text alignment module (Bootstrapping Language-Image Pre-training) for report generation and an LLM (Llama 2) for interactive QA. The model was developed using 654,343 FFA images with 9392 reports. It was evaluated both automatically, using language-based and classification-based metrics, and manually by three experienced ophthalmologists. The automatic evaluation of the generated reports demonstrated that the system can generate coherent and comprehensible free-text reports, achieving a BERTScore of 0.70 and F1 scores ranging from 0.64 to 0.82 for detecting top-5 retinal conditions. The manual evaluation revealed acceptable accuracy (68.3%, Kappa 0.746) and completeness (62.3%, Kappa 0.739) of the generated reports. The generated free-form answers were evaluated manually, with the majority meeting the ophthalmologists’ criteria (error-free: 70.7%, complete: 84.0%, harmless: 93.7%, satisfied: 65.3%, Kappa: 0.762–0.834). This study introduces an innovative framework that combines multi-modal transformers and LLMs, enhancing ophthalmic image interpretation, and facilitating interactive communications during medical consultation.

Funder

Start-up Fund for RAPs under the Strategic Hiring Scheme

National Natural Science Foundation of China

Global STEM Professorship Scheme from HKSAR

Publisher

Springer Science and Business Media LLC

Link

https://www.nature.com/articles/s41746-024-01101-z.pdf

References: 41 articles.

1. Kvopka, M., Chan, W., Lake, S. R., Durkin, S. & Taranath, D. Fundus fluorescein angiography imaging of retinopathy of prematurity in infants: A review. Surv. Ophthalmol. 68, 849–860 (2023).

2. Jin, K. et al. Automatic detection of non-perfusion areas in diabetic macular edema from fundus fluorescein angiography for decision making using deep learning. Sci. Rep. 10, 15138 (2020).

3. Stefanini, M. et al. From Show to Tell: A Survey on Deep Learning-Based Image Captioning. IEEE Trans. Pattern Anal. Mach. Intell. 45, 539–559 (2023).

4. Lin, Z. et al. Contrastive pre-training and linear interaction attention-based transformer for universal medical reports generation. J. Biomed. Inform. 138, 104281 (2023).

5. Li, M. et al. Cross-modal Clinical Graph Transformer for Ophthalmic Report Generation. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 20624–20633 https://doi.org/10.1109/CVPR52688.2022.02000 (2022).

Cited by 3 articles.

1. ChatFFA: An ophthalmic chat system for unified vision-language understanding and question answering for fundus fluorescein angiography;iScience;2024-07

2. Understanding natural language: Potential application of large language models to ophthalmology;Asia-Pacific Journal of Ophthalmology;2024-07

3. Unveiling the clinical incapabilities: a benchmarking study of GPT-4V(ision) for ophthalmic multimodal image analysis;British Journal of Ophthalmology;2024-05-24