Abstract
Background
There is a need to match characteristics of tobacco users with cessation treatments and with risks of tobacco-attributable diseases such as lung cancer. The rate at which the body metabolizes nicotine has proven an important predictor of these outcomes. Nicotine metabolism is primarily catalyzed by the enzyme cytochrome P450 2A6 (CYP2A6), and CYP2A6 activity can be measured as the ratio of two nicotine metabolites, trans-3’-hydroxycotinine to cotinine, known as the nicotine metabolite ratio (NMR). Measurements of these metabolites are only possible in current tobacco users and vary by biofluid source, timing of collection, and protocol; unfortunately, this has limited their use in clinical practice. The NMR depends highly on genetic variation near CYP2A6 on chromosome 19, as well as on ancestry, environmental factors, and other genetic factors. Thus, we aimed to develop prediction models of nicotine metabolism using easy-to-obtain genotypes and individual characteristics.
Results
We identified four multiethnic studies with nicotine metabolites and DNA samples. We constructed a 263-marker panel by filtering genome-wide association scans of the NMR in each study. We then applied seven machine learning techniques to train models of nicotine metabolism on the largest and most ancestrally diverse dataset (N=2239). The models were then validated out-of-sample in the other three studies (total N=1415). Using cross-validation, we found that the correlations between the observed and predicted NMR ranged from 0.69 to 0.97, depending on the model. When predictions were averaged in an ensemble model, the correlation was 0.81. The ensemble model generalizes well out-of-sample across ancestries, despite differences in the measurement of NMR between studies, with correlations of 0.52 for African ancestry, 0.61 for Asian ancestry, and 0.46 for European ancestry. The most influential predictors of NMR identified in more than two models were rs56113850, rs11878604, and 21 other genetic variants near CYP2A6, as well as age and ancestry.
Conclusions
We have developed an ensemble of seven models for predicting the NMR across ancestries from genotypes, age, gender, and BMI. Predictions from these models validate out-of-sample in three datasets and associate with nicotine dosages. Knowledge of how an individual metabolizes nicotine could be used to help select the optimal path to reducing or quitting tobacco use, as well as to evaluate the risks of tobacco use.
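The quantities named in the abstract (the NMR as a ratio of two metabolites, an ensemble formed by averaging model predictions, and the correlation between observed and predicted NMR) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the unweighted mean, the use of Pearson correlation, and all numbers below are assumptions made up for illustration, and only three toy "models" are shown rather than seven.

```python
from math import sqrt


def nmr(hydroxycotinine, cotinine):
    """Nicotine metabolite ratio: trans-3'-hydroxycotinine divided by cotinine."""
    if cotinine <= 0:
        raise ValueError("cotinine must be positive")
    return hydroxycotinine / cotinine


def ensemble_mean(predictions_by_model):
    """Average per-sample predictions across models (assumed unweighted mean)."""
    n_models = len(predictions_by_model)
    return [sum(sample) / n_models for sample in zip(*predictions_by_model)]


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical metabolite measurements (hydroxycotinine, cotinine), same units.
measurements = [(30.0, 100.0), (80.0, 200.0), (45.0, 90.0), (20.0, 100.0)]
observed = [nmr(h, c) for h, c in measurements]

# Hypothetical per-sample NMR predictions from three separately trained models.
model_predictions = [
    [0.28, 0.42, 0.46, 0.25],
    [0.35, 0.38, 0.55, 0.15],
    [0.30, 0.37, 0.52, 0.23],
]

ensemble = ensemble_mean(model_predictions)
r = pearson_r(observed, ensemble)
print(f"ensemble predictions: {ensemble}")
print(f"correlation with observed NMR: {r:.3f}")
```

Averaging tends to smooth out the errors of individual models, which is one common motivation for reporting an ensemble alongside the per-model correlations.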
Publisher
Cold Spring Harbor Laboratory
Cited by
2 articles.