Abstract
Copy number variation is an important source of genetic variation, yet data are often lacking due to technical limitations for detection given the current genome assemblies. Our goal is to demonstrate the extent to which an array-based platform (aCGH) can identify genomic loci that are collapsed in genome assemblies that were built with short-read technology. Taking advantage of two cichlid species for which genome assemblies based on Illumina and PacBio are available, we show that inter-species aCGH log2 hybridization ratios correlate more strongly with inferred copy number differences based on PacBio-built genome assemblies than based on Illumina-built genome assemblies. With regard to inter-species copy number differences of specific genes identified by each platform, the set identified by aCGH intersects to a greater extent with the set identified by PacBio than with the set identified by Illumina. Gene function, according to Gene Ontology analysis, did not substantially differ among platforms, and platforms converged on functions associated with adaptive phenotypes. The results of the current study further demonstrate that aCGH is an effective platform for identifying copy number variable sequences, particularly those collapsed in short read genome assemblies.
Funder
Division of Environmental Biology
National Institute of General Medical Sciences
Publisher
Public Library of Science (PLoS)