Affiliation:
1. Beijing Planetarium, Beijing Academy of Science and Technology, Beijing 100044, China
2. Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, 1 Beichen West Road, Chaoyang District, Beijing 100101, China
Abstract
With the development of large-scale sky surveys, an increasing number of stellar photometric images have been obtained. However, most stars lack spectroscopic data, which hinders stellar classification. Vision Transformer (ViT) has shown superior performance in image classification tasks compared to most convolutional neural networks (CNNs). In this study, we propose an stellar classification network based on the Transformer architecture, named stellar-ViT, aiming to efficiently and accurately classify the spectral class for stars when provided with photometric images. By utilizing RGB images synthesized from photometric data provided by the Sloan Digital Sky Survey (SDSS), our model can distinguish the seven main stellar categories: O, B, A, F, G, K, and M. Particularly, our stellar-ViT-gri model, which reaches an accuracy of 0.839, outperforms traditional CNNs and the current state-of-the-art stellar classification network SCNet when processing RGB images synthesized from the gri bands. Furthermore, with the introduction of urz band data, the overall accuracy of the stellar-ViT model reaches 0.863, further demonstrating the importance of additional band information in improving classification performance. Our approach showcases the effectiveness and feasibility of using photometric images and Transformers for stellar classification through simple data augmentation strategies and robustness analysis of training dataset sizes. The stellar-ViT model maintains good performance even in small sample scenarios, and the inclusion of urz band data reduces the likelihood of misclassifying samples as lower-temperature subtypes.
Funder
National Science Foundation of China
Innovation Project of Beijing Academy of Science and Technology