The miniJPAS survey: star-galaxy classification using machine learning

Author:

Baqui P. O., Marra V., Casarini L., Angulo R., Díaz-García L. A., Hernández-Monteagudo C., Lopes P. A. A., López-Sanjuan C., Muniesa D., Placco V. M., Quartin M., Queiroz C., Sobral D., Solano E., Tempel E., Varela J., Vílchez J. M., Abramo R., Alcaniz J., Benitez N., Bonoli S., Carneiro S., Cenarro A. J., Cristóbal-Hornillos D., de Amorim A. L., de Oliveira C. M., Dupke R., Ederoclite A., González Delgado R. M., Marín-Franch A., Moles M., Vázquez Ramió H., Sodré L., Taylor K.

Abstract

Context. Future astrophysical surveys such as J-PAS will produce very large datasets, the so-called “big data”, which will require the deployment of accurate and efficient machine-learning (ML) methods. In this work, we analyze the miniJPAS survey, which observed about 1 deg² of the AEGIS field with 56 narrow-band filters and 4 ugri broad-band filters. The miniJPAS primary catalog contains approximately 64 000 objects in the r detection band (mag_AB ≲ 24), with forced photometry in all other filters.

Aims. We discuss the classification of miniJPAS sources into extended (galaxies) and point-like (e.g., stars) objects, a step required for the subsequent scientific analyses. We aim to develop an ML classifier that is complementary to traditional tools based on explicit modeling. In particular, our goal is to release a value-added catalog with our best classification.

Methods. To train and test our classifiers, we cross-matched the miniJPAS dataset with SDSS and HSC-SSP data, whose classification is trustworthy within the intervals 15 ≤ r ≤ 20 and 18.5 ≤ r ≤ 23.5, respectively. We trained and tested six different ML algorithms on the two cross-matched catalogs: K-nearest neighbors, decision trees, random forest (RF), artificial neural networks, extremely randomized trees (ERT), and an ensemble classifier. The last is a hybrid algorithm that combines artificial neural networks and RF with the J-PAS stellar and galactic loci classifier. As input for the ML algorithms we used the magnitudes from the 60 filters together with their errors, with and without the morphological parameters. We also used the mean point spread function in the r detection band for each pointing.

Results. We find that the RF and ERT algorithms perform best in all scenarios. When the full magnitude range of 15 ≤ r ≤ 23.5 is analyzed, we find an area under the curve AUC = 0.957 with RF when photometric information alone is used, and AUC = 0.986 with ERT when photometric and morphological information are used together. When morphological parameters are used, the full width at half maximum is the most important feature. When photometric information is used alone, we observe that broad bands are not necessarily more important than narrow bands, and that errors (the width of the distribution) are as important as the measurements (the central value of the distribution). In other words, it appears important to fully characterize the measurement.

Conclusions. ML algorithms can compete with traditional star and galaxy classifiers; they outperform the latter at fainter magnitudes (r ≳ 21). We use our best classifiers, with and without morphology, to produce a value-added catalog.
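The AUC values quoted in the Results refer to the area under the ROC curve, which equals the probability that a randomly chosen positive object receives a higher classifier score than a randomly chosen negative one. The following is a minimal sketch of that metric using the rank-sum (Mann-Whitney) formulation; it is not the authors' code. The function name `roc_auc`, the label convention (1 = galaxy, 0 = star), and the toy scores are assumptions made for illustration only.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: iterable of 0/1 (assumed convention: 1 = galaxy, 0 = star).
    scores: classifier scores, higher meaning "more likely class 1".
    Tied scores receive the average rank, so a tie counts as half a win.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)

    # Assign 1-based ranks, averaging over runs of equal scores.
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined unless both classes are present")

    # U statistic of the positive class, normalised by the number of pairs.
    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


# Toy example (made-up scores, not miniJPAS data).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
```

In this formulation an AUC of 0.5 corresponds to a classifier that ranks objects at random and 1.0 to one that separates the two classes perfectly.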

Publisher

EDP Sciences

Subject

Space and Planetary Science, Astronomy and Astrophysics

Cited by 31 articles.