Affiliation:
1. Faculty of Information Science and Technology, Multimedia University, Jalan Ayer Keroh Lama, Bukit Beruang, Melaka, Malaysia
2. Dr. Wu Lien-Teh Centre for Research in Communicable Diseases, M. Kandiah Faculty of Medicine and Health Sciences, Universiti Tunku Abdul Rahman, Jalan Sungai Long, Bandar Sungai Long, Kajang, Selangor, Malaysia
Abstract
Multilocus variable number tandem repeat analysis (MLVA) uses polymorphisms in short DNA repeats in genomes, termed variable number tandem repeats (VNTRs), to differentiate closely related organisms. One research challenge is to find an optimal set of VNTR loci that distinguishes different members accurately. An intuitive approach is exhaustive search. However, exhaustive search is inefficient for a dataset comprising many attributes (loci) because of the curse of dimensionality. In this study, metaheuristic methods are proposed to find an optimal loci combination. Basic genetic algorithm (BGA) and modified genetic algorithm (MGA) were proposed in our previous work for this purpose. However, they require an experienced user with prior knowledge to specify the minimum number of loci needed to achieve good results. To remove this expertise requirement for parameter setting, a GA with Duplicates (GAD), which allows duplicated loci in a chromosome (potential solution) during the search process, is developed. The study also investigates the search performance of a hybrid metaheuristic method, namely quantum-inspired differential evolution (QDE). The Hunter-Gaston Discriminatory Index (HGDI) is used to indicate the discriminatory power of a loci combination. Two Mycobacterium tuberculosis MLVA datasets, obtained from a public portal and a local laboratory respectively, are used. The results of exhaustive search and the metaheuristic methods are first compared, followed by a statistical performance comparison among BGA, MGA, GAD, and QDE. The best-performing GA method (i.e., GAD) and QDE are then compared statistically with several recent metaheuristic methods on both MLVA datasets. The statistical results show that both GAD and QDE could achieve higher HGDI than the recent methods using a small but informative loci combination.
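As a rough illustration of the objective described in the abstract, the sketch below computes the HGDI of a loci combination using the standard Hunter-Gaston formula, D = 1 - [1 / (N(N - 1))] Σ n_j(n_j - 1), where N is the number of isolates and n_j is the number of isolates sharing type j. It also runs the exhaustive search baseline over all k-locus subsets. The function names (`hgdi`, `hgdi_for_loci`, `best_combination`) and the toy repeat-count table are assumptions made for illustration only; they are not the authors' implementation or data, and the GA and QDE methods are not reproduced here.

```python
from collections import Counter
from itertools import combinations


def hgdi(profiles):
    """Hunter-Gaston Discriminatory Index of a list of hashable type profiles."""
    n = len(profiles)
    if n < 2:
        raise ValueError("HGDI needs at least two isolates")
    counts = Counter(profiles)
    # Ordered pairs of isolates that share the same type.
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_type_pairs / (n * (n - 1))


def hgdi_for_loci(table, loci):
    """HGDI when isolates are typed using only the selected locus columns."""
    return hgdi([tuple(row[i] for i in loci) for row in table])


def best_combination(table, k):
    """Exhaustive search over all k-locus subsets (feasible only for few loci)."""
    n_loci = len(table[0])
    return max(
        combinations(range(n_loci), k),
        key=lambda loci: hgdi_for_loci(table, loci),
    )


# Hypothetical repeat counts: 5 isolates (rows) x 4 loci (columns).
toy = [
    (2, 3, 1, 4),
    (2, 3, 2, 4),
    (2, 5, 1, 4),
    (3, 5, 1, 4),
    (3, 3, 2, 4),
]

if __name__ == "__main__":
    for k in (1, 2, 3):
        loci = best_combination(toy, k)
        print(k, loci, hgdi_for_loci(toy, loci))
```

The number of k-locus subsets grows combinatorially with the number of loci, which is why exhaustive search becomes impractical for larger datasets and why the abstract turns to metaheuristic search instead.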
Subject
Artificial Intelligence, General Engineering, Statistics and Probability