The probabilistic random forest applied to the QUBRICS survey: improving the selection of high-redshift quasars with synthetic data

Author:

Guarneri Francesco (1, 2), Calderone Giorgio (2), Cristiani Stefano (2, 3, 4), Porru Matteo (1), Fontanot Fabio (2), Boutsia Konstantina (5), Cupani Guido (2, 3), Grazian Andrea (6), D’Odorico Valentina (2, 3, 7), Murphy Michael T (3, 8), Bongiorno Angela (9), Saccheo Ivano (9), Nicastro Luciano (10)

Affiliation:

1. Dipartimento di Fisica, Sezione di Astronomia, Università di Trieste, via G.B. Tiepolo 11, I-34131 Trieste, Italy

2. INAF – Osservatorio Astronomico di Trieste, Via G.B. Tiepolo, 11, I-34143 Trieste, Italy

3. IFPU – Institute for Fundamental Physics of the Universe, via Beirut 2, I-34151 Trieste, Italy

4. INFN – National Institute for Nuclear Physics, via Valerio 2, I-34127 Trieste, Italy

5. Las Campanas Observatory, Carnegie Observatories, Colina El Pino, Casilla 601, La Serena, Chile

6. INAF – Osservatorio Astronomico di Padova, Vicolo dell’Osservatorio 5, I-35122 Padova, Italy

7. Scuola Normale Superiore, P.zza dei Cavalieri, I-56126 Pisa, Italy

8. Centre for Astrophysics and Supercomputing, Swinburne University of Technology, Hawthorn, Victoria 3122, Australia

9. INAF – Osservatorio Astronomico di Roma, Via Frascati 33, I-00078 Monte Porzio Catone, Italy

10. INAF – Osservatorio di Astrofisica e Scienza dello Spazio di Bologna, Via P. Gobetti 101, I-40129 Bologna, Italy

Abstract

Several recent works have focused on the search for bright, high-z quasars (QSOs) in the South. Among them, the QUasars as BRIght beacons for Cosmology in the Southern hemisphere (QUBRICS) survey has now delivered hundreds of new spectroscopically confirmed QSOs selected by means of machine learning algorithms. Building upon the results obtained by introducing the probabilistic random forest (PRF) for the QUBRICS selection, we explore in this work the feasibility of training the algorithm on synthetic data to improve the completeness in the higher redshift bins. We also compare the performance of the algorithm when colours are used as primary features instead of magnitudes. We generate synthetic data based on a composite QSO spectral energy distribution. We first train the PRF to identify QSOs among stars and galaxies, then to separate high-z quasars from low-z contaminants. We apply the algorithm to an updated data set, based on SkyMapper DR3, combined with Gaia eDR3, 2MASS, and WISE magnitudes. We find that employing colours as features slightly improves the results with respect to the algorithm trained on magnitude data. Adding synthetic data to the training set provides significantly better results with respect to the PRF trained only on spectroscopically confirmed QSOs. We estimate, on a testing data set, a completeness of $\sim 86$ per cent and a contamination of $\sim 36$ per cent. Finally, 206 PRF-selected candidates were observed: 149 of the 206 turned out to be genuine QSOs with z > 2.5, 41 were QSOs with z < 2.5, 3 were galaxies, and 13 were stars. The result confirms the ability of the PRF to select high-z quasars in large data sets.
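
The follow-up counts quoted above can be turned into an observed success rate with a few lines of Python. This is only a minimal sketch: the function names are hypothetical, and the definitions used here (completeness as the fraction of true high-z QSOs that are selected, contamination as the fraction of selected candidates that are not high-z QSOs) are the usual ones and are assumed, not confirmed, to match those adopted in the paper.

```python
def completeness(true_pos: int, false_neg: int) -> float:
    """Fraction of genuine targets that the selection recovers (assumed definition)."""
    return true_pos / (true_pos + false_neg)


def contamination(true_pos: int, false_pos: int) -> float:
    """Fraction of selected candidates that are not genuine targets (assumed definition)."""
    return false_pos / (true_pos + false_pos)


# Spectroscopic follow-up outcome quoted in the abstract (206 observed candidates).
followup = {
    "qso_z_gt_2.5": 149,  # genuine high-z QSOs, the targets of the selection
    "qso_z_lt_2.5": 41,   # QSOs below the redshift threshold
    "galaxy": 3,
    "star": 13,
}

total = sum(followup.values())
hits = followup["qso_z_gt_2.5"]

# Observed success rate and contamination among the observed candidates only.
success_rate = hits / total
observed_contamination = contamination(hits, total - hits)

print(f"observed candidates:    {total}")
print(f"success rate:           {100 * success_rate:.1f} per cent")
print(f"observed contamination: {100 * observed_contamination:.1f} per cent")
```

The completeness cannot be derived from the follow-up sample alone, because the number of high-z QSOs that the selection missed is not known from these counts; the figures of about 86 and 36 per cent quoted in the abstract are estimated on a testing data set.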

Funder

Istituto Nazionale di Astrofisica

ARC

Australian Research Council

University of Sydney

Australian National University

Swinburne University of Technology

University of Queensland

University of Western Australia

University of Melbourne

Curtin University of Technology

Monash University

Australian Astronomical Observatory

National Computational Infrastructure

Astronomy Australia Limited

Australian National Data Service

European Southern Observatory

ESO

European Space Agency

California Institute of Technology

National Aeronautics and Space Administration

National Science Foundation

University of California, Los Angeles

Jet Propulsion Laboratory

Publisher

Oxford University Press (OUP)

Subject

Space and Planetary Science, Astronomy and Astrophysics

Cited by 2 articles.
