Abstract
As the Gene Ontology (GO) knowledgebase grows increasingly complex, it is becoming difficult for researchers to follow it and to obtain a comprehensive overview of biological processes. Here, we developed a classification strategy based on a careful investigation of the genes shared between any two terms. Using this strategy, we categorized the 66 direct child terms of the GO term cellular process into 12 major subsets, and we further confirmed the interactions among these subsets by studying networks based on protein-protein interactions. We then used the 12 subsets to investigate the distribution of transcription factors and kinases, as well as several cancer genomes. Overall, the 12 GO subsets provide researchers with a more comprehensive overview of the cellular process, and the categorization strategy developed here can be used to characterize other large GO terms.