Finding Genes in the C2C12 Osteogenic Pathway by k-Nearest-Neighbor Classification of Expression Data

Author:

Theilhaber Joachim, Connolly Timothy, Roman-Roman Sergio, Bushnell Steven, Jackson Amanda, Call Kathy, Garcia Teresa, Baron Roland

Abstract

A supervised classification scheme for analyzing microarray expression data, based on the k-nearest-neighbor method coupled to noise-reduction filters, has been used to find genes involved in the osteogenic pathway of the mouse C2C12 cell line, studied here as a model for in vivo osteogenesis. The scheme takes as input a training set embodying expert biological knowledge and provides internal estimates of its own misclassification errors, which in turn enable systematic optimization of the classifier parameters. On the basis of a C2C12-generated expression data set with 34,130 expression profiles across 2 time courses, each comprising 6 points, and a training set containing known members of the osteogenic, myoblastic, and adipocytic pathways, 176 new genes, in addition to 28 originally in the training set, are selected as relevant to osteogenesis. For this selection, the estimated sensitivity is 42% and the posterior false-positive rate (the fraction of candidates that are spurious) is 12%. The corresponding sensitivity and false-positive rate are 9% and 31%, respectively, for detection of myoblastic genes, and only 4% and ∼100%, respectively, for adipocytic genes, in accordance with an experimental design that predominantly stimulated the osteogenic pathway. The selection is validated in two ways: by examining expression of the genes in an independent biological assay involving mouse calvaria (skull bone) primary cell cultures, in which a large fraction of the 176 genes are seen to be strongly regulated, and by case-by-case analysis of the genes on the basis of expert domain knowledge. The methodology should be generalizable to any situation in which enough a priori biological knowledge exists to define a training set. [Online supplementary material available at www.genome.org]
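The abstract describes the classifier only in outline, so the following is a rough sketch of the general idea rather than the authors' implementation. It shows majority-vote k-nearest-neighbor classification of expression profiles, with leave-one-out evaluation on the training set used to estimate a sensitivity and a posterior false-positive rate for one class. The Pearson-correlation distance, the value of k, the synthetic 12-point profiles, and all function names are assumptions made for illustration. The noise-reduction filters mentioned in the abstract are not modeled.

```python
import numpy as np


def standardize(profiles):
    """Center and unit-normalize each row so dot products equal Pearson correlations."""
    profiles = np.asarray(profiles, dtype=float)
    centered = profiles - profiles.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(centered, axis=1, keepdims=True)
    return centered / np.where(norms == 0, 1.0, norms)


def knn_predict(train_x, train_y, query_x, k=3):
    """Majority vote among the k training profiles with the smallest
    correlation distance (1 - Pearson r). The metric is an assumption."""
    train_y = np.asarray(train_y)
    dist = 1.0 - standardize(query_x) @ standardize(train_x).T
    nearest = np.argsort(dist, axis=1)[:, : min(k, len(train_y))]
    preds = []
    for row in nearest:
        labels, counts = np.unique(train_y[row], return_counts=True)
        # Ties resolve to the label that sorts first.
        preds.append(labels[np.argmax(counts)])
    return np.array(preds)


def leave_one_out(x, y, k=3):
    """Classify each training profile using all the other training profiles."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    preds = np.empty_like(y)
    for i in range(len(y)):
        mask = np.arange(len(y)) != i
        preds[i] = knn_predict(x[mask], y[mask], x[i : i + 1], k)[0]
    return preds


def sensitivity_and_fpr(y_true, y_pred, target):
    """Sensitivity = fraction of true target genes that are recovered.
    Posterior false-positive rate = fraction of genes called as target
    that actually belong to another class."""
    actual = np.asarray(y_true) == target
    called = np.asarray(y_pred) == target
    sens = float(np.sum(called & actual) / actual.sum()) if actual.sum() else float("nan")
    fpr = float(np.sum(called & ~actual) / called.sum()) if called.sum() else float("nan")
    return sens, fpr


def make_synthetic(n_per_class=20, noise=0.2, seed=0):
    """Hypothetical data: 2 time courses x 6 points = 12 values per profile."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, 6)
    templates = {
        "osteo": np.concatenate([t, t]),
        "myo": np.concatenate([1 - t, 1 - t]),
        "adipo": np.concatenate([np.sin(np.pi * t), np.sin(np.pi * t)]),
    }
    x = np.vstack([tpl + rng.normal(0, noise, (n_per_class, 12))
                   for tpl in templates.values()])
    y = np.repeat(list(templates.keys()), n_per_class)
    return x, y


if __name__ == "__main__":
    x, y = make_synthetic()
    preds = leave_one_out(x, y, k=3)
    sens, fpr = sensitivity_and_fpr(y, preds, "osteo")
    print(f"osteo: sensitivity={sens:.2f}, posterior false-positive rate={fpr:.2f}")
```

In a setup like this, the leave-one-out estimates can be recomputed for several values of k, which is one plausible way to carry out the kind of parameter optimization the abstract mentions.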

Publisher

Cold Spring Harbor Laboratory

Subject

Genetics (clinical), Genetics

Cited by 59 articles.

1. Proteomics appending a complementary dimension to precision oncotherapy; Computational and Structural Biotechnology Journal; 2024-12

2. TOXICOGENOMICS; Drug Safety Evaluation; 2022-12-23

3. Screening gene signatures for clinical response subtypes of lung transplantation; Molecular Genetics and Genomics; 2022-07-03

4. Genomics and Machine Learning; Machine Learning in Biological Sciences; 2022

5. Predicting Human Protein Subcellular Locations by Using a Combination of Network and Function Features; Frontiers in Genetics; 2021-11-05
