Harmonizing and aligning M/EEG datasets with covariance-based techniques to enhance predictive regression modeling

Author:

Mellot Apolline (1), Collas Antoine (1), Rodrigues Pedro L. C. (2), Engemann Denis (1, 3), Gramfort Alexandre (1)

Affiliation:

1. Université Paris-Saclay, Inria, CEA, Palaiseau, France

2. Université Grenoble Alpes, Inria, CNRS, Grenoble INP, LJK, Grenoble, France

3. Roche Pharma Research and Early Development, Neuroscience and Rare Diseases, Roche Innovation Center Basel, F. Hoffmann–La Roche Ltd., Basel, Switzerland

Abstract

Neuroscience studies face challenges in gathering large datasets, which limits the use of machine learning (ML) approaches. One possible solution is to incorporate additional data from large public datasets; however, data collected in different contexts often exhibit systematic differences called dataset shifts. Various factors, such as site, device type, experimental protocol, or social characteristics, can lead to substantial divergence of brain signals that can hinder the success of ML across datasets. In this work, we focus on dataset shifts in recordings of brain activity using magneto- and electroencephalography (M/EEG). State-of-the-art predictive approaches on M/EEG signals classically represent the data by covariance matrices. Model-based dataset alignment methods can leverage the geometry of covariance matrices, leading to three steps: re-centering, re-scaling, and rotation correction. This work explains theoretically how differences in brain activity, anatomy, or device configuration lead to certain shifts in data covariances. The different alignment methods are first evaluated using controlled simulations. Their practical relevance is then evaluated for brain age prediction on one MEG dataset (Cam-CAN, n = 646) and two EEG datasets (TUAB, n = 1385; LEMON, n = 213). Within a single dataset (Cam-CAN), when training and test recordings came from the same subjects performing different tasks, paired rotation correction was essential (δR² = +0.13 for rest-passive, +0.17 for rest-smt). When, in addition to different tasks, we included unseen subjects, re-centering led to improved performance (δR² = +0.096 for rest-passive, +0.045 for rest-smt). For generalization to an independent dataset sampled from a different population and recorded with a different device, re-centering was necessary to achieve brain age prediction performance close to within-dataset prediction performance. This study demonstrates that the generalization of M/EEG-based regression models across datasets can be substantially enhanced by applying domain adaptation procedures that statistically harmonize diverse datasets.

Publisher

MIT Press
