Abstract
Existing imaging genetics studies have mostly been limited in scope by their reliance on imaging-derived phenotypes defined by human experts. Here, leveraging recent breakthroughs in self-supervised deep representation learning, we propose a new approach, image-based genome-wide association study (iGWAS), for identifying genetic factors associated with phenotypes discovered from medical images using contrastive learning. Using retinal fundus photos, our model extracts a 128-dimensional vector representing features of the retina as phenotypes. After training the model on 40,000 images from the EyePACS dataset, we generated phenotypes from 130,329 images of 65,629 British White participants in the UK Biobank. We conducted GWAS on three sets of phenotypes: raw image phenotypes, derived from the original photos; retina color, the average color of the center region of the retinal fundus photos; and vessel-enriched phenotypes, derived from vasculature-segmented images. GWAS of raw image phenotypes identified 14 loci with genome-wide significance (p < 5×10⁻⁸, taking the intersection of hits from the left and right eyes), while GWAS of retina color identified 34 loci, 7 of which overlap with the raw image loci. Finally, GWAS of vessel-enriched phenotypes identified 34 loci, of which 25 overlap with the raw image and color loci and 9 are unique to the vessel-enriched GWAS. We found that vessel-enriched GWAS not only retains most of the loci from raw image GWAS but also discovers new loci related to vessel development. Our results establish the feasibility of this new framework of genomic study based on self-supervised phenotyping of medical images.
Publisher
Cold Spring Harbor Laboratory