Author:
Schoenmacker GH, Vlaming P, Pallesen J, Pikulina MY, Ghamarian AH, Demontis D, Børglum A, Galesloot TE, Poelmans G, Franke B, Claassen T, Heskes T, Buitelaar JK, Vásquez A Arias
Abstract
Motivation: With the increasing availability of genome-wide genetic data, methods to combine genetic variables with other sources of data in statistical models are required. This paper introduces quantitative genetic scoring (QGS), a dimensionality reduction method to create quantitative genetic variables representing arbitrary genetic regions.
Methods: QGS is defined as the sum of absolute differences in the genetic sequence between a subject and a reference population. QGS properties such as distribution and sensitivity to region size were examined, and QGS was tested in six existing genomic data sets of various sizes and phenotypes.
Results: QGS can reduce genetic information by >98% yet explain phenotypic variance at low, medium, and high levels of granularity. Associations based on QGS are independent of both the size and the linkage disequilibrium structure of the underlying region. In combination with stability selection, QGS finds significant results where traditional genome-wide association approaches struggle. In conclusion, QGS preserves phenotypically significant genetic variance while reducing dimensionality, allowing researchers to include quantitative genetic information in any type of statistical analysis.
Availability: https://github.com/machine2learn/QGS
Contact: gido.schoenmacker@radboudumc.nl
Supplemental information: Supplemental data are available online.
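The definition of QGS as a sum of absolute differences between a subject and a reference population can be illustrated with a small Python sketch. This is one possible reading of the abstract, not the implementation in the linked repository. It assumes that genotypes are coded as allele dosages (0, 1, or 2) per variant and that the per-variant differences are averaged over the reference individuals; the function name `qgs_score` and the toy data are hypothetical.

```python
def qgs_score(subject, reference):
    """Illustrative QGS-style score for one subject and one genetic region.

    subject:   list of allele dosages (assumed coding: 0, 1, or 2), one per variant
    reference: list of reference individuals, each a list of dosages of the
               same length as `subject`

    Assumption: the absolute differences are summed over all variants in the
    region and averaged over the reference individuals.
    """
    if not reference:
        raise ValueError("reference population must not be empty")
    n_variants = len(subject)
    for row in reference:
        if len(row) != n_variants:
            raise ValueError("every reference genotype must match the subject length")

    total = 0.0
    for row in reference:
        # Sum of absolute dosage differences against one reference individual
        total += sum(abs(s - r) for s, r in zip(subject, row))
    # Average over the reference population
    return total / len(reference)


# Toy region with four variants and three reference individuals (made-up data)
subject = [0, 1, 2, 0]
reference = [
    [0, 1, 1, 0],
    [0, 2, 2, 1],
    [1, 1, 2, 0],
]
score = qgs_score(subject, reference)
print(score)
```

In this sketch the four variants of the region collapse into a single number per subject, which is the sense in which such a score reduces dimensionality: a region with many variants contributes one quantitative variable to a downstream statistical model.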
Publisher
Cold Spring Harbor Laboratory