Affiliation:
1. Mathematics and Science College, Shanghai Normal University, Shanghai, China
2. College of Information, Mechanical and Electrical Engineering, Shanghai Normal University, Shanghai, China
3. School of Computer Science and Technology, Donghua University, Shanghai, China
Abstract
Visual smoke semantic segmentation is challenging because smoke is semi‐transparent, varies in shape, and has complex textures. To improve segmentation performance, a hybrid convolutional neural network and transformer network based on pyramid Gaussian pooling (PGP) is proposed for smoke segmentation. A PGP method is designed that uses low‐pass filtering to suppress noise. The output of PGP is then reshaped into a set of visual tokens for transformers, yielding a PGP‐transformer module that makes full use of the self‐attention mechanism. Finally, the PGP‐transformer module is inserted into a U‐shaped architecture with skip connections. Extensive experiments show that the method significantly outperforms existing state‐of‐the‐art algorithms on virtual and real smoke datasets, and ablation experiments verify the effectiveness of the proposed modules.
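The abstract does not give the PGP details, so the following is only a rough sketch of the general idea it describes: low‐pass (Gaussian) filter a feature map, pool it onto several grid sizes, and reshape the result into tokens. The function names, the pyramid levels (1, 2, 4), the rule tying sigma to the bin size, and sampling at bin centers are all assumptions made for illustration, not the authors' settings.

```python
import numpy as np

def gaussian_kernel1d(sigma, radius):
    """Normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def gaussian_blur(feat, sigma):
    """Separable Gaussian low-pass filter over a (C, H, W) feature map."""
    radius = max(1, int(round(3 * sigma)))
    k = gaussian_kernel1d(sigma, radius)
    # Edge padding keeps the output the same size as the input.
    padded = np.pad(feat, ((0, 0), (radius, radius), (radius, radius)), mode="edge")
    # Filter along the width, then along the height.
    tmp = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 2, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, tmp)

def pgp_tokens(feat, levels=(1, 2, 4)):
    """Hypothetical pyramid Gaussian pooling: returns (sum(s*s), C) tokens."""
    c, h, w = feat.shape
    tokens = []
    for s in levels:
        # Assumed rule: wider Gaussian for coarser grids.
        sigma = 0.5 * max(h, w) / s
        blurred = gaussian_blur(feat, sigma)
        # Sample the blurred map at the center of each of the s x s bins.
        rows = ((np.arange(s) + 0.5) * h / s).astype(int)
        cols = ((np.arange(s) + 0.5) * w / s).astype(int)
        pooled = blurred[:, rows][:, :, cols]      # (C, s, s)
        tokens.append(pooled.reshape(c, s * s).T)  # (s*s, C)
    return np.concatenate(tokens, axis=0)

feat = np.ones((3, 8, 8))
tokens = pgp_tokens(feat)
print(tokens.shape)
```

Each row of `tokens` could then serve as one visual token for a transformer layer; how the paper actually combines the tokens with self‐attention and the U‐shaped decoder is not specified in the abstract.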
Funder
National Natural Science Foundation of China
Publisher
Institution of Engineering and Technology (IET)