MiraBest: a data set of morphologically classified radio galaxies for machine learning

Published:2023-01 Issue:1 Volume:2 Page:293-306
ISSN:2752-8200
Container-title:RAS Techniques and Instruments
language:en

Author:

Porter Fiona A M¹, Scaife Anna M M¹²

Affiliation:

1. Jodrell Bank Centre for Astrophysics, Department of Physics & Astronomy, University of Manchester, Oxford Road, Manchester M13 9PL, UK

2. The Alan Turing Institute, Euston Road, London NW1 2DB, UK

Abstract

The volume of data from current and future observatories has motivated the increased development and application of automated machine learning methodologies for astronomy. However, less attention has been given to the production of standardized data sets for assessing the performance of different machine learning algorithms within astronomy and astrophysics. Here we describe in detail the MiraBest data set, a publicly available batched data set of 1256 radio-loud AGN from NVSS and FIRST, filtered to 0.03 < z < 0.1, manually labelled by Miraghaei and Best according to the Fanaroff–Riley morphological classification, created for machine learning applications and compatible for use with standard deep learning libraries. We outline the principles underlying the construction of the data set, the sample selection and pre-processing methodology, data set structure and composition, as well as a comparison of MiraBest to other data sets used in the literature. Existing applications that utilize the MiraBest data set are reviewed, and an extended data set of 2100 sources is created by cross-matching MiraBest with other catalogues of radio-loud AGN that have been used more widely in the literature for machine learning applications.

Funder

STFC

Alan Turing Institute

Publisher

Oxford University Press (OUP)

Link

https://academic.oup.com/rasti/advance-article-pdf/doi/10.1093/rasti/rzad017/50647162/rzad017.pdf

Reference: 83 articles.

1. The Seventh Data Release of the Sloan Digital Sky Survey

2. Radio Galaxy Zoo: machine learning for radio source host galaxy cross-identification

3. Is your dataset big enough? Sample size requirements when using artificial neural networks for discrete choice analysis

4. Classifying Radio Galaxies with the Convolutional Neural Network

5. The Astropy Project: Sustaining and Growing a Community-oriented Open-source Project and the Latest Major Release (v5.0) of the Core Package

Cited by 4 articles.

1. Radio Galaxy Zoo: Leveraging latent space representations from variational autoencoder;Journal of Cosmology and Astroparticle Physics;2024-06-01

2. Enabling unsupervised discovery in astronomical images through self-supervised representations;Monthly Notices of the Royal Astronomical Society;2024-04-03

3. E(2)-equivariant features in machine learning for morphological classification of radio galaxies;RAS Techniques and Instruments;2024-01

4. Radio galaxy zoo: towards building the first multipurpose foundation model for radio astronomy with self-supervised learning;RAS Techniques and Instruments;2023-12-18