Abstract
Abstract
Transit spectroscopy is a powerful tool for decoding the chemical compositions of the atmospheres of extrasolar planets. In this paper, we focus on unsupervised techniques for analyzing spectral data from transiting exoplanets. After cleaning and validating the data, we demonstrate methods for: (i) initial exploratory data analysis, based on summary statistics (estimates of location and variability); (ii) exploring and quantifying the existing correlations in the data; (iii) preprocessing and linearly transforming the data to its principal components; (iv) dimensionality reduction and manifold learning; (v) clustering and anomaly detection; and (vi) visualization and interpretation of the data. To illustrate the proposed unsupervised methodology, we use a well-known public benchmark data set of synthetic transit spectra. We show that there is a high degree of correlation in the spectral data, which calls for appropriate low-dimensional representations. We explore a number of different techniques for such dimensionality reduction and identify several suitable options in terms of summary statistics, principal components, etc. We uncover interesting structures in the principal component basis, namely well-defined branches corresponding to different chemical regimes of the underlying atmospheres. We demonstrate that those branches can be successfully recovered with a K-means clustering algorithm in a fully unsupervised fashion. We advocate for lower-dimensional representations of the spectroscopic data in terms of the main principal components, in order to reveal the existing structure in the data and quickly characterize the chemical class of a planet.
Funder
U.S. Department of Energy
Publisher
American Astronomical Society
Subject
Space and Planetary Science,Earth and Planetary Sciences (miscellaneous),Geophysics,Astronomy and Astrophysics
Cited by
6 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献