Abstract
Almond (Prunus dulcis [Mill.] D.A. Webb) is an economically important, specialty nut crop grown almost exclusively in the United States. Breeding and improvement efforts worldwide have led to the development of key, productive cultivars, including ‘Nonpareil,’ which is the most widely grown almond cultivar. Thus far, genomic resources for this species have been limited, and a whole-genome assembly for ‘Nonpareil’ is not currently available despite its economic importance and use in almond breeding worldwide. We generated a genome sequence at 615.89X coverage using Illumina, PacBio, and optical mapping technologies. Gene prediction based on Oxford Nanopore MinION and Illumina RNA sequencing revealed 27,487 genes, and genome annotation found that 68% of predicted gene models are associated with at least one biological function. Further, epigenetic signatures of almond, namely DNA cytosine methylation, have been implicated in a variety of phenotypes, including self-compatibility, bud dormancy, and development of non-infectious bud failure. In addition to the genome sequence and annotation, this report provides the complete methylome of several key almond tissues: leaf, flower, endocarp, mesocarp, fruit skin, and seed coat. Comparisons between the methylation profiles of these tissues revealed differences in genome-wide weighted percent methylation and chromosome-level methylation enrichment. The raw sequencing data are available in the NCBI Sequence Read Archive, and the complete genome sequence and annotation files are available in NCBI GenBank. All data can be used without restriction.
Publisher
Cold Spring Harbor Laboratory