Abstract
AbstractGenomic biotechnologies have seen rapid development over the past two decades, allowing for both the inference and modification of genetic and epigenetic information at the single cell level. While these tools present enormous potential for basic research, diagnostics, and treatment, they also raise difficult issues of how to design research studies to deploy these tools most effectively. In designing a study at the population or individual level, a researcher might combine several different sequencing modalities and sampling protocols, each with different utility, costs, and other tradeoffs. The central problem this paper attempts to address is then how one might create an optimal study design for a genomic analysis, with particular focus on studies involving somatic variation, typically for applications in cancer genomics. We pose the study design problem as a stochastic constrained nonlinear optimization problem and introduce a simulation-centered optimization procedure that iteratively optimizes the objective function using surrogate modeling combined with pattern and gradient search. Finally, we demonstrate the use of our procedure on diverse test cases to derive resource and study design allocations optimized for various objectives for the study of somatic cell populations.
Publisher
Cold Spring Harbor Laboratory