Affiliation:
1. School of Architecture and Urban Planning, Shenyang Jianzhu University, Shenyang 110168, China
2. School of Mechanical Engineering, Southwest Jiaotong University, Chengdu 610031, China
3. School of Printing, Packaging and Digital Media, Xi’an University of Technology, Xi’an 710048, China
Abstract
Feature extraction and classification of traditional dwellings play a significant role in preserving these structures and ensuring their sustainable development. Current approaches still suffer from subjective classification and limited feature-extraction accuracy. This study focuses on traditional dwellings in Gansu Province, China, and employs a novel model named Improved Swin Transformer, which combines the Swin Transformer with parallel grouped convolutional neural network (CNN) branches to improve both feature extraction and classification accuracy. To verify the features the model relies on during prediction and to foster trust in AI systems, explainability is examined using Grad-CAM-generated heatmaps. First, the Gansu Province Traditional Dwelling Dataset (GTDD) is established. On GTDD, the Improved Swin Transformer attains an accuracy of 90.03% and an F1 score of 87.44%. Comparative analysis with ResNet-50, ResNeXt-50, and Swin Transformer highlights the superior performance of the improved model. The confusion matrix of the Improved Swin Transformer shows the classification results across regions and indicates that the primary influencing factors are terrain, climate, and culture. Finally, the Grad-CAM heatmaps show that the Improved Swin Transformer localizes and focuses on features more accurately than the other three models, with minimal influence from the surrounding environment. Heatmaps generated by the Improved Swin Transformer for traditional residential areas in five regions of Gansu further show that the model accurately extracts architectural features such as roofs, facades, materials, and windows. This confirms that the features extracted by the Improved Swin Transformer are consistent with those identified by traditional methods and enhances trust in the model and its decisions. In summary, the Improved Swin Transformer demonstrates strong feature extraction ability and accurate classification, providing valuable insights for the protection and style control of traditional residential areas.
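The accuracy and F1 score quoted in the abstract are both derived from a confusion matrix. The sketch below is illustrative only: the abstract does not say how F1 is averaged across classes, so macro averaging is an assumption here, the function name `accuracy_and_macro_f1` is hypothetical, and the toy matrix is made up rather than taken from GTDD.

```python
def accuracy_and_macro_f1(cm):
    """Return (accuracy, macro F1) for a square confusion matrix.

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j. Macro averaging is an assumption, not a
    detail confirmed by the abstract.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))

    f1_scores = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, true class differs
        fn = sum(cm[k]) - tp                       # true k, predicted class differs
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        f1_scores.append(2 * tp / denom if denom else 0.0)

    return correct / total, sum(f1_scores) / n


# Toy two-class matrix (hypothetical counts, not results from the paper)
toy_cm = [
    [8, 2],
    [1, 9],
]
acc, macro_f1 = accuracy_and_macro_f1(toy_cm)
print(f"accuracy={acc:.4f}, macro F1={macro_f1:.4f}")
```

With imbalanced classes, macro F1 can sit below accuracy because every class counts equally regardless of its size. That is one possible reading of the gap between the two reported figures, though the abstract does not confirm it.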
Funder
National Natural Science Foundation of China
Key Research and Development Project of Shaanxi Province