Affiliation:
1. Department of Statistics, University of Georgia, Athens, Georgia, USA
2. Department of Animal & Dairy Science, University of Georgia, Athens, Georgia, USA
3. Department of Mathematical Sciences, KAIST, Daejeon, South Korea
Abstract
Conventional canonical correlation analysis (CCA) measures the association between two datasets and identifies relevant contributors. However, it becomes difficult to execute and interpret when the sample size is smaller than the number of variables or when there are more than two datasets. Our motivating example is a stroke‐related clinical study on pigs. The data are multimodal, consist of measurements taken at multiple time points, and have many more variables than observations. This study aims to uncover important biomarkers and stroke recovery patterns based on physiological changes. To address these issues, we develop two sparse CCA methods for multiple datasets. Various simulated examples are used to illustrate the performance of the proposed methods and contrast it with that of existing methods. In analyzing the pig stroke data, we apply the proposed sparse CCA methods along with dimension reduction techniques, interpret the recovery patterns, and identify variables that are influential in recovery.
Funder
National Institutes of Health
National Research Foundation of Korea