Toward on-sky adaptive optics control using reinforcement learning

Authors

Nousiainen J., Rajani C., Kasper M., Helin T., Haffert S. Y., Vérinaud C., Males J. R., Van Gorkom K., Close L. M., Long J. D., Hedglen A. D., Guyon O., Schatz L., Kautz M., Lumbres J., Rodack A., Knight J. M., Miller K.

Abstract

Context. The direct imaging of potentially habitable exoplanets is one prime science case for the next generation of high-contrast imaging instruments on ground-based, extremely large telescopes. To reach this demanding science goal, the instruments are equipped with eXtreme Adaptive Optics (XAO) systems, which will control thousands of actuators at frame rates from one to several kilohertz. Most of the habitable exoplanets are located at small angular separations from their host stars, where the current control laws of XAO systems leave strong residuals.

Aims. Current AO control strategies, such as static matrix-based wavefront reconstruction and integrator control, suffer from a temporal delay error and are sensitive to mis-registration, that is, to dynamic variations of the control system geometry. We aim to produce control methods that cope with these limitations, provide a significantly improved AO correction, and therefore reduce the residual flux in the coronagraphic point spread function (PSF).

Methods. We extend previous work in reinforcement learning for AO. The improved method, called Policy Optimization for Adaptive Optics (PO4AO), learns a dynamics model and optimizes a control neural network, called a policy. We introduce the method and study it through numerical simulations of XAO with a pyramid wavefront sensor (PWFS) for the 8-m and 40-m telescope aperture cases. We further implemented PO4AO and carried out experiments in a laboratory environment using the Magellan Adaptive Optics eXtreme system (MagAO-X) at the Steward laboratory.

Results. PO4AO provides the desired performance by improving the coronagraphic contrast within the control region of the deformable mirror and PWFS, both in simulation and in the laboratory; in numerical simulations, the improvement is by factors of 3–5. The presented method is also quick to train, typically on timescales of 5–10 s, and its inference time is short enough (below a millisecond) for real-time XAO control with currently available hardware, even for extremely large telescopes.
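
The temporal delay error named in the Aims can be illustrated with a toy example. The sketch below is not PO4AO and is not taken from the paper: it assumes a single wavefront mode driven by a sinusoidal disturbance, a plain integrator control law, and a hypothetical helper `residual_rms` whose parameters (`gain`, `extra_delay`, `freq`) are chosen only for illustration.

```python
import math


def residual_rms(gain, extra_delay, n_frames=2000, freq=0.05):
    """RMS residual of one sinusoidal mode under integrator control.

    Hypothetical toy model: the command computed in one frame reaches the
    mirror (1 + extra_delay) frames later. `freq` is in cycles per frame.
    """
    # Commands that have been computed but not yet applied by the mirror.
    pending = [0.0] * (extra_delay + 1)
    integrator = 0.0
    sq_sum = 0.0
    for t in range(n_frames):
        disturbance = math.sin(2.0 * math.pi * freq * t)
        # The mirror applies the oldest pending command.
        residual = disturbance - pending[0]
        sq_sum += residual ** 2
        # Integrator law: accumulate a fraction of the measured residual.
        integrator += gain * residual
        pending = pending[1:] + [integrator]
    return math.sqrt(sq_sum / n_frames)


if __name__ == "__main__":
    print("open loop:", residual_rms(0.0, 0))
    for d in (0, 1, 2):
        print("extra delay", d, "->", residual_rms(0.4, d))
```

Under these assumptions, the residual at this disturbance frequency grows as the loop delay increases, which is the kind of error a predictive control law aims to reduce.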

Publisher

EDP Sciences

Subject

Space and Planetary Science, Astronomy and Astrophysics

Cited by 13 articles.
