Abstract
Background: Undigested components of the human diet affect the composition and function of the microorganisms present in the gastrointestinal tract. Techniques such as metagenomic analysis allow researchers to study the functional capacity of these microorganisms, revealing the potential of metagenomic data for developing objective biomarkers of food intake.

Objective: As a continuation of our previous work using 16S and metabolomic datasets, we aimed to use a computationally intensive, multivariate machine learning approach to identify fecal Kyoto Encyclopedia of Genes and Genomes (KEGG) Orthology (KO) categories as biomarkers that accurately classify food intake.

Design: Data were aggregated from five controlled feeding studies that examined the individual impact of almonds, avocados, broccoli, walnuts, barley, and oats on the adult gastrointestinal microbiota. DNA from pre- and post-intervention fecal samples underwent shotgun genomic sequencing. After pre-processing, sequences were aligned with DIAMOND v2.0.11.149 and functionally annotated with MEGAN v6.12.2. After count normalization, the log of the fold change ratio of each resulting KO between pre- and post-intervention in the treatment group, relative to its corresponding control, was used for differential abundance analysis. Differentially abundant KOs were used to train machine learning models examining potential biomarkers in both single-food and multi-food models.

Results: We identified differentially abundant KOs in the almond (n = 54), broccoli (n = 2,474), and walnut (n = 732) groups (q < 0.20). Using a random forest model to classify samples into each food group's treatment and control arms, these KOs yielded classification accuracies of 80%, 87%, and 86% for the almond, broccoli, and walnut groups, respectively. The mixed-food random forest achieved 81% accuracy.

Conclusions: Our findings reveal promise in using fecal metagenomics to objectively complement self-reported measures of food intake. Future research on various foods and dietary patterns will expand these exploratory analyses for eventual use in feeding study compliance and clinical settings.
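
The fold-change comparison described in the abstract's design can be illustrated with a minimal sketch. This is not the authors' pipeline: the log base (2), the pseudocount, the function names, and the toy counts are all assumptions made for illustration, and the abstract does not say how the treatment and control arms were contrasted statistically. The sketch computes a per-participant log fold change (post over pre) for one KO and then subtracts the control-arm mean from the treatment-arm mean.

```python
import math

PSEUDOCOUNT = 1.0  # assumption: avoids log(0); the abstract does not state one


def log2_fold_change(pre, post, pseudocount=PSEUDOCOUNT):
    """Log2 ratio of post- to pre-intervention normalized counts for one KO."""
    return math.log2((post + pseudocount) / (pre + pseudocount))


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def treatment_vs_control_effect(treatment_pairs, control_pairs):
    """Mean log2 fold change in the treatment arm minus that in the control arm.

    Each argument is a list of (pre, post) normalized counts for a single KO,
    one tuple per participant.
    """
    treat = [log2_fold_change(pre, post) for pre, post in treatment_pairs]
    ctrl = [log2_fold_change(pre, post) for pre, post in control_pairs]
    return mean(treat) - mean(ctrl)


# Hypothetical normalized counts for a single KO (not data from the study).
treatment = [(3.0, 7.0), (1.0, 3.0)]  # each doubles once the pseudocount is added
control = [(3.0, 3.0), (7.0, 7.0)]    # unchanged between pre and post

effect = treatment_vs_control_effect(treatment, control)
print(f"treatment-vs-control log2 fold change: {effect:.2f}")
```

In a full analysis this quantity would be computed for every KO and followed by a significance test with multiple-testing correction (the abstract reports a threshold of q < 0.20) before the retained KOs are passed to a classifier. Those steps are omitted here.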
Publisher
Cold Spring Harbor Laboratory