Analyzing to discover origins of CNNs and ViT architectures in medical images

Published:2024-04-16 Issue:1 Volume:14
ISSN:2045-2322
Container-title:Scientific Reports
language:en
Short-container-title:Sci Rep

Author:

Oh Seungmin, Kim Namkug, Ryu Jongbin

Abstract

In this paper, we present an in-depth analysis of CNN and ViT architectures for medical images, with the goal of providing insights for subsequent research. In particular, the origins of deep neural networks should be explainable for medical images, but few studies have examined such explainability from the perspective of network architecture. Therefore, we investigate the origin of model performance, which is the clue to explaining deep neural networks, focusing on the two most relevant architectures, CNNs and ViT. We conduct four analyses: (1) robustness in a noisy environment, (2) consistency of the translation invariance property, (3) visual recognition with obstructed images, and (4) whether features are acquired from shape or texture, in order to compare the origins of the differences in visual recognition performance between CNNs and ViT. Furthermore, we explore the discrepancies between medical and generic images with respect to these analyses. We discover that medical images, unlike generic ones, are class-sensitive. Finally, we propose a straightforward ensemble method based on our analyses, demonstrating that our findings can help build follow-up studies. Our analysis code will be publicly available.

Funder

Korea Government

Publisher

Springer Science and Business Media LLC

Link

https://www.nature.com/articles/s41598-024-58382-3.pdf

References (35 articles)

1. Li, J. et al. Transforming medical imaging with transformers? A comparative review of key properties, current progresses, and future perspectives. Med. Image Anal. 85, 102762 (2023).

2. Bissoto, A., Valle, E. & Avila, S. Debiasing skin lesion datasets and models? Not so fast. In IEEE Conference on Computer Vision and Pattern Recognition Workshops (2020).

3. Raghu, M., Zhang, C., Kleinberg, J. & Bengio, S. Transfusion: Understanding transfer learning for medical imaging. Neural Inf. Proc. Syst. (2019).

4. Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K. & Fei-Fei, L. ImageNet: A large-scale hierarchical image database. In IEEE Conference on Computer Vision and Pattern Recognition (2009).

5. Raghu, M., Unterthiner, T., Kornblith, S., Zhang, C. & Dosovitskiy, A. Do vision transformers see like convolutional neural networks? Neural Inf. Proc. Syst. (2021).

Cited by 1 article.

1. Role of artificial intelligence in brain tumour imaging; European Journal of Radiology; 2024-07