Background: The ability to generate long sequencing reads and access long-range linkage information is revolutionizing the quality and completeness of genome assemblies. Here we use a hybrid approach that combines data from four genome sequencing and mapping technologies to generate a new genome assembly of the honeybee Apis mellifera. We first generated contigs from PacBio sequencing libraries, merged them with linked-read 10x Chromium data, and then scaffolded the result using a BioNano optical genome map and a Hi-C chromatin interaction map, complemented by a genetic linkage map.
Results: Each of the assembly steps reduced the number of gaps and incorporated a substantial amount of additional sequence into scaffolds. The new assembly (Amel_HAv3) is significantly more contiguous and complete than the previous one (Amel_4.5), which was based mainly on Sanger sequencing reads. The contig N50 is 120-fold higher (5.381 Mbp compared to 0.053 Mbp), and we anchor > 98% of the sequence to chromosomes. All 16 chromosomes are represented as single scaffolds, with an average of three sequence gaps per chromosome. The improvements are largely due to the inclusion of repetitive sequence that was unplaced in previous assemblies. In particular, our assembly is highly contiguous across centromeres and telomeres and includes hundreds of AvaI and AluI repeats associated with these features.
Conclusions: The improved assembly will be useful for refining gene models, studying genome function, mapping functional genetic variation, identifying structural variants, and comparative genomics.
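The contig N50 quoted in the Results is a standard contiguity metric: the length L such that contigs of length L or longer together cover at least half of the assembled bases. The following is a minimal Python sketch of that calculation, assuming this common definition. The function name `n50` and the toy contig lengths are illustrative and are not taken from the paper or its pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest sequence down until half the bases are covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths in bp (illustrative only, not from the paper).
fragmented = [50, 40, 30, 20, 10]
contiguous = [100, 50]
print(n50(fragmented), n50(contiguous))
```

Both toy assemblies contain the same total number of bases, yet the one with fewer, longer contigs has the higher N50, which is why the metric is used to compare assemblies of the same genome.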
Full text | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6454739 | PMC |
http://dx.doi.org/10.1186/s12864-019-5642-0 | DOI |
BMC Genom Data
January 2025
Department of Applied Biosciences, College of Agriculture and Life Sciences, Kyungpook National University, Daegu, 41566, Republic of Korea.
Objectives: The data were collected to obtain the complete genome sequence of Pseudarthrobacter sp. NIBRBAC000502770, isolated from the rhizosphere of Sasamorpha in a heavy metal-contaminated coal mine in Hongcheon, Republic of Korea. The objective was to explore the strain's genetic potential for plant growth promotion and heavy metal resistance, particularly arsenate and copper.
Commun Biol
January 2025
College of Life Sciences, Capital Normal University, Haidian District, Beijing, China.
Phragmites australis is a globally distributed grass species (Poaceae) recognized for its vast biomass and exceptional environmental adaptability, making it an ideal model for studying wetland ecosystems and plant stress resilience. However, genomic resources for this species have been limited. In this study, we assembled a chromosome-level reference genome of P.
Mol Cell
January 2025
Department of Genetics and Development and Herbert Irving Comprehensive Cancer Center, Columbia University Irving Medical Center, New York, NY 10032, USA.
Cells integrate metabolic information into core molecular processes such as transcription to adapt to environmental changes. Chromatin, the physiological template of the eukaryotic genome, has emerged as a sensor and rheostat for fluctuating intracellular metabolites. In this review, we highlight the growing list of chromatin-associated metabolites that are derived from diverse sources.
Mol Cell
January 2025
Institute for Cancer Genetics and Herbert Irving Comprehensive Cancer Center, Columbia University Irving Medical Center, New York, NY 10032, USA; Department of Pediatrics and Department of Genetics and Development, Columbia University Irving Medical Center, New York, NY 10032, USA.
DNA replication, a fundamental process in all living organisms, proceeds with continuous synthesis of the leading strand by DNA polymerase ε (Pol ε) and discontinuous synthesis of the lagging strand by polymerase δ (Pol δ). This inherent asymmetry at each replication fork necessitates the development of methods to distinguish between these two nascent strands in vivo. Over the past decade, strand-specific sequencing strategies, such as enrichment and sequencing of protein-associated nascent DNA (eSPAN) and Okazaki fragment sequencing (OK-seq), have become essential tools for studying chromatin replication in eukaryotic cells.
Mol Biol Evol
January 2025
CAS Key Laboratory for Plant Diversity and Biogeography of East Asia, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, China.
Southwest China is characterized by high plateaus, large mountain systems, and deeply incised dry valleys formed by major rivers and their tributaries. Despite the considerable attention given to alpine plant radiations in this region, the timing and mode of diversification of the numerous dry valley plant lineages remain unknown. To address this knowledge gap, we investigated the macroevolution of Isodon (Lamiaceae), a lineage commonly distributed in the dry valleys in southwest China and wetter areas of Asia and Africa.