O-Net: A Novel Framework With Deep Fusion of CNN and Transformer for Simultaneous Segmentation and Classification

Authors:

Wang Tao, Lan Junlin, Han Zixin, Hu Ziwei, Huang Yuxiu, Deng Yanglin, Zhang Hejun, Wang Jianchao, Chen Musheng, Jiang Haiyan, Lee Ren-Guey, Gao Qinquan, Du Ming, Tong Tong, Chen Gang

Abstract

Deep learning has produced major breakthroughs in the medical field in recent years. Built on convolutional neural networks (CNNs), the U-Net framework has become the benchmark for medical image segmentation. However, this framework cannot fully learn global information and long-range semantic information. The transformer structure has been shown to capture global information better than the U-Net, but it learns local information less well than a CNN. We therefore propose a novel network, referred to as the O-Net, which combines the advantages of the CNN and the transformer to make full use of both global and local information for improving medical image segmentation and classification. In the encoder of the proposed O-Net, we combine the CNN and the Swin Transformer to acquire both global and local contextual features. In the decoder, the outputs of the Swin Transformer and the CNN blocks are fused to produce the final results. We evaluated the proposed network on the Synapse multi-organ CT dataset and the ISIC 2017 challenge dataset for the segmentation task. The classification network is trained simultaneously, using the encoder weights of the segmentation network. The experimental results show that the proposed O-Net achieves better segmentation performance than state-of-the-art approaches, and that the segmentation results help improve the accuracy of the classification task. The code and models for this study are available at https://github.com/ortonwang/O-Net.
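
The dual-branch idea described above can be illustrated with a minimal PyTorch sketch. It is not the authors' implementation (see the repository linked above for that). All class names (`DualBranchEncoder`, `SegAndClsNet`), layer sizes, and the fusion by concatenation plus a 1x1 convolution are assumptions made for illustration. Plain multi-head self-attention stands in for the Swin Transformer, and a single shared encoder stands in for reusing the segmentation encoder's weights in the classification network.

```python
import torch
import torch.nn as nn


class DualBranchEncoder(nn.Module):
    """Hypothetical encoder: a CNN branch for local features and an
    attention branch for global context, fused by a 1x1 convolution."""

    def __init__(self, in_ch=3, dim=32, heads=4):
        super().__init__()
        # Local branch: two strided 3x3 convolutions (total stride 4).
        self.cnn = nn.Sequential(
            nn.Conv2d(in_ch, dim, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(dim, dim, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        # Global branch: 4x4 patch embedding, then plain self-attention.
        # Assumption: this is a stand-in, NOT Swin's shifted-window attention.
        self.patch_embed = nn.Conv2d(in_ch, dim, kernel_size=4, stride=4)
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        # Fusion: concatenate both branches and mix channels (an assumption).
        self.fuse = nn.Conv2d(2 * dim, dim, kernel_size=1)

    def forward(self, x):
        local = self.cnn(x)                                  # (B, dim, H/4, W/4)
        tokens = self.patch_embed(x)                         # (B, dim, H/4, W/4)
        b, c, h, w = tokens.shape
        seq = self.norm(tokens.flatten(2).transpose(1, 2))   # (B, h*w, dim)
        attended, _ = self.attn(seq, seq, seq)
        # Residual connection, then back to a (B, dim, h, w) feature map.
        global_feats = (seq + attended).transpose(1, 2).reshape(b, c, h, w)
        return self.fuse(torch.cat([local, global_feats], dim=1))


class SegAndClsNet(nn.Module):
    """Hypothetical model with one shared encoder and two task heads."""

    def __init__(self, in_ch=3, dim=32, seg_classes=2, cls_classes=3):
        super().__init__()
        self.encoder = DualBranchEncoder(in_ch, dim)
        # Segmentation head: upsample to input resolution, per-pixel logits.
        self.seg_head = nn.Sequential(
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
            nn.Conv2d(dim, seg_classes, kernel_size=1),
        )
        # Classification head: global average pooling, then a linear layer.
        self.cls_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(dim, cls_classes)
        )

    def forward(self, x):
        feats = self.encoder(x)
        return self.seg_head(feats), self.cls_head(feats)


model = SegAndClsNet()
seg_logits, cls_logits = model(torch.randn(2, 3, 64, 64))
print(seg_logits.shape, cls_logits.shape)
```

The sketch assumes input height and width divisible by 4 so that the two branches produce feature maps of the same size. The class counts and channel width are placeholders, not values taken from the paper.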

Publisher

Frontiers Media SA

Subject

General Neuroscience
