Abstract
Abstract
Objective. The accurate automatic segmentation of tumors from computed tomography (CT) volumes facilitates early diagnosis and treatment of patients. A significant challenge in tumor segmentation is the integration of the spatial correlations among multiple parts of a CT volume and the context relationship across multiple channels. Approach. We proposed a mutually enhanced multi-view information model (MEMI) to propagate and fuse the spatial correlations and the context relationship and then apply it to lung tumor CT segmentation. First, a feature map was obtained from segmentation backbone encoder, which contained many image region nodes. An attention mechanism from the region node perspective was presented to determine the impact of all the other nodes on a specific node and enhance the node attribute embedding. A gated convolution-based strategy was also designed to integrate the enhanced attributes and the original node features. Second, transformer across multiple channels was constructed to integrate the channel context relationship. Finally, since the encoded node attributes from the gated convolution view and those from the channel transformer view were complementary, an interaction attention mechanism was proposed to propagate the mutual information among the multiple views. Main results. The segmentation performance was evaluated on both public lung tumor dataset and private dataset collected from a hospital. The experimental results demonstrated that MEMI was superior to other compared segmentation methods. Ablation studies showed the contributions of node correlation learning, channel context relationship learning, and mutual information interaction across multiple views to the improved segmentation performance. Utilizing MEMI on multiple segmentation backbones also demonstrated MEMI's generalization ability. Significance. Our model improved the lung tumor segmentation performance by learning the correlations among multiple region nodes, integrating the channel context relationship, and mutual information enhancement from multiple views.
Funder
Natural Science Foundation of Heilongjiang Province
STU Scientific Research Initiation Grant
National Natural Science Foundation of China