Interior Design Evaluation Based on Deep Learning: A Multi-Modal Fusion Evaluation Mechanism
Published: 2024-05-16
Volume: 12
Issue: 10
Page: 1560
ISSN: 2227-7390
Container-title: Mathematics
Language: en
Author:
Fan Yiyan 1, Zhou Yang 2, Yuan Zheng 1
Affiliation:
1. Shanghai Academy of Fine Arts, Shanghai University, Shanghai 200444, China
2. School of Mechatronics Engineering and Automation, Shanghai University, Shanghai 200444, China
Abstract
The design of 3D scenes is of great significance, and interior scene design is one of its crucial areas. It not only shapes individuals' living environments but also applies to the design and development of virtual environments. Previous work on indoor scenes has focused on understanding and editing existing scenes, such as scene reconstruction, segmentation, texturing, object localization, and rendering. In this study, we propose a novel task in indoor scene comprehension that combines interior design principles with professional evaluation criteria: 3D indoor scene design assessment. We further propose an approach based on a transformer encoder–decoder architecture and a dual-graph convolutional network. The method allows users to pose text-based inquiries; it accepts input in two modalities, point-cloud representations of indoor scenes and textual queries, and outputs a probability distribution over positive, neutral, and negative assessments of the interior design. The proposed method uses separately pre-trained modules, including a 3D visual question-answering module and a dual-graph convolutional network that identifies the emotional tendency of text.
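The fusion mechanism described above, where a point-cloud branch and a text branch are combined into a three-way assessment distribution, can be sketched as follows. This is a minimal illustration of the data flow only, not the authors' implementation: the real system uses a transformer encoder–decoder and a dual-graph convolutional network as encoders, which are stubbed here with fixed random projections, and all names and dimensions (`fuse_and_classify`, 256/128/64) are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stub projections standing in for the two pre-trained encoders:
# a 3D-VQA branch for the point cloud and a dual-GCN branch for the text.
W_scene = rng.normal(size=(256, 64))   # point-cloud features -> 64-d embedding
W_text = rng.normal(size=(128, 64))    # text features -> 64-d embedding
W_cls = rng.normal(size=(128, 3))      # fused features -> 3 assessment classes

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()

def fuse_and_classify(scene_feat, text_feat):
    """Fuse two modality embeddings by concatenation and return
    P(positive), P(neutral), P(negative) for the interior design."""
    fused = np.concatenate([scene_feat @ W_scene, text_feat @ W_text])  # 128-d
    return softmax(fused @ W_cls)

# Dummy inputs standing in for encoded scene and query features.
scene = rng.normal(size=256)
query = rng.normal(size=128)
probs = fuse_and_classify(scene, query)
print(probs)  # three non-negative values summing to 1
```

Concatenation followed by a linear classifier is only one fusion choice; attention-based fusion inside the transformer decoder would be closer to the architecture the abstract names.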
Funder
National Natural Science Foundation of China