Abstract
Background
Studies on genome-wide associations help to determine the cause of many genetic diseases. Genome-wide associations typically focus on associations between single-nucleotide polymorphisms (SNPs). Genotyping every SNP in a chromosomal region for identifying genetic variation is computationally very expensive. A representative subset of SNPs, called tag SNPs, can be used to identify genetic variation. Small tag SNPs save the computation time of genotyping platform, however, there could be missing data or genotyping errors in small tag SNPs. This study aims to solve Tag SNPs selection problem using many-objective evolutionary algorithms.
Methods
Tag SNPs selection can be viewed as an optimization problem with some trade-offs between objectives, e.g. minimizing the number of tag SNPs and maximizing tolerance for missing data. In this study, the tag SNPs selection problem is formulated as a many-objective problem. Nondominated Sorting based Genetic Algorithm (NSGA-III), and Multi-Objective Evolutionary Algorithm based on Decomposition (MOEA/D), which are Many-Objective evolutionary algorithms, have been applied and investigated for optimal tag SNPs selection. This study also investigates different initialization methods like greedy and random initialization. optimization.
Results
The evaluation measures used for comparing results for different algorithms are Hypervolume, Range, SumMin, MinSum, Tolerance rate, and Average Hamming distance. Overall MOEA/D algorithm gives superior results as compared to other algorithms in most cases. NSGA-III outperforms NSGA-II and other compared algorithms on maximum tolerance rate, and SPEA2 outperforms all algorithms on average hamming distance.
Conclusion
Experimental results show that the performance of our proposed many-objective algorithms is much superior as compared to the results of existing methods. The outcomes show the advantages of greedy initialization over random initialization using NSGA-III, SPEA2, and MOEA/D to solve the tag SNPs selection as many-objective optimization problem.
Publisher
Public Library of Science (PLoS)