Affiliation:
1. National Soybean Improvement Center Shijiazhuang Sub- Center, Hebei Academy of Agricultural and Forestry Sciences
2. Guangzhou University
Abstract
Abstract
Objectives:
Soybean is an important feed and oil crop in the world due to its high protein and oil content. China has a collection of more than 43,000 soybean germplasm resources, which presents a rich genetic diversity for soybean breeding. However, this diversity poses great challenges to the genetic improvement of soybean. This study reports on the de novo genome assembly of HJ117, a soybean variety with high protein content of 52%. These findings will prove to be valuable resources for further soybean quality improvement research, and will aid in the elucidation of regulatory mechanisms underlying soybean protein content.
Data description:
We generated a contiguous reference genome of 1041.94 Mb for our sample using a combination of Illumina short reads (23.38 Gb) and PacBio long reads (25.58 Gb), with high-quality sequence coverage of approximately 22.44× and 24.55×, respectively. The assembly was further assisted by 114.5 Gb Hi-C data (109.9×), resulting in a contig N50 of 19.32 Mb and scaffold N50 of 51.43 Mb. Notably, Core Eukaryotic Genes Mapping Approach (CEGMA) assessment and Benchmarking Universal Single-Copy Orthologs (BUSCO) assessment results indicated that most core eukaryotic genes (97.18%) and genes in the BUSCO dataset (99.4%) were identified, and 96.44% of the genomic sequences were anchored onto twenty pseudochromosomes.
Publisher
Research Square Platform LLC