Affiliation:
1. Institute of Computer Science and Computer Mathematics, Jagiellonian University, ul. Lojasiewicza 6, PL-30-348 Kraków, Poland
2. Janusz Gil Institute of Astronomy, University of Zielona Góra, ul. Szafrana 2, PL-65-516 Zielona Góra, Poland
Abstract
The Large Area Telescope (LAT) on board the Fermi gamma-ray observatory continuously scans the sky in an energy range from 50 MeV to 1 TeV. The telescope has identified over 6000 gamma-ray emitting sources, approximately half of which are classified as active galactic nuclei (AGN). However, not all of these gamma-ray sources have known redshift values, because estimating redshift with traditional methods can be an expensive and challenging task. To predict AGN redshift values robustly, many researchers have recently turned to machine learning methods. While that work has focused primarily on predicting specific values, real-world data often allow us to predict only conditional probability distributions, constrained by the conditional entropy [H(Y|X)]. In our study, we employ the Hierarchical Correlation Reconstruction approach to inexpensively predict complex conditional probability distributions, including multimodal ones. This is achieved through independent mean squared error estimation of multiple moment-like parameters, which are then combined to reconstruct the conditional distribution. By employing linear regression for this purpose, we can develop interpretable models in which the coefficients describe the contributions of features to the conditional moments. This article extends the original approach by incorporating Canonical Correlation Analysis for feature optimization and l1 ‘lasso’ regularization. Our primary focus is on the practical problem of predicting the redshift of AGN using data from the Fourth Fermi-LAT Data Release 3 (4LAC-DR3) data set.
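The idea summarized in the abstract (least-squares estimation of several moment-like parameters, each a linear function of the features, combined into a conditional density) can be illustrated with a minimal sketch. This is not the authors' implementation. The rank-based normalization of the target to [0, 1], the orthonormal shifted Legendre basis, the clipping floor, and the function names `legendre_basis`, `fit_hcr`, and `predict_density` are all assumptions made for illustration; the Canonical Correlation Analysis and lasso steps are omitted.

```python
import numpy as np

def legendre_basis(u, degree):
    """Orthonormal shifted Legendre polynomials f_1..f_degree on [0, 1] (assumed basis)."""
    u = np.asarray(u, dtype=float)
    cols = []
    for j in range(1, degree + 1):
        coeffs = np.zeros(j + 1)
        coeffs[j] = 1.0  # select the j-th Legendre polynomial
        # P_j on [-1, 1], shifted to [0, 1] and scaled to unit L2 norm.
        cols.append(np.sqrt(2 * j + 1) * np.polynomial.legendre.legval(2 * u - 1, coeffs))
    return np.stack(cols, axis=-1)

def fit_hcr(X, y, degree=4):
    """Fit one least-squares linear model per moment-like coefficient."""
    n = len(y)
    # Assumption: normalize the target to nearly uniform on [0, 1] via its empirical CDF.
    ranks = np.argsort(np.argsort(y))
    u = (ranks + 0.5) / n
    A = np.column_stack([np.ones(n), X])   # intercept plus features
    targets = legendre_basis(u, degree)    # shape (n, degree)
    # Each column of beta minimizes the MSE for one coefficient a_j(x).
    beta, *_ = np.linalg.lstsq(A, targets, rcond=None)
    return beta                            # shape (n_features + 1, degree)

def predict_density(beta, x, m=200):
    """Reconstruct the conditional density of the normalized target on a uniform grid."""
    grid = (np.arange(m) + 0.5) / m
    a = np.concatenate([[1.0], x]) @ beta  # predicted coefficients a_1..a_degree
    rho = 1.0 + legendre_basis(grid, beta.shape[1]) @ a
    rho = np.maximum(rho, 1e-3)            # assumed floor: the polynomial can dip below zero
    return rho / rho.mean()                # renormalize so the density integrates to about 1

# Synthetic usage example (made-up data, not 4LAC-DR3).
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 1))
y = X[:, 0] + 0.3 * rng.normal(size=2000)
beta = fit_hcr(X, y, degree=4)
rho_hi = predict_density(beta, np.array([1.5]))
rho_lo = predict_density(beta, np.array([-1.5]))
```

Because each column of `beta` belongs to a single basis function, its entries can be read as the contribution of each feature to that moment-like parameter, which is the sense in which such a model is interpretable.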
Publisher
Oxford University Press (OUP)