Background: The domestic sheep (Ovis aries) is an important agricultural species raised for meat, wool, and milk across the world. A high-quality reference genome for this species enhances the ability to discover genetic mechanisms influencing biological traits. Furthermore, a high-quality reference genome allows for precise functional annotation of gene regulatory elements. The rapid advances in genome assembly algorithms and emergence of sequencing technologies with increasingly long reads provide the opportunity for an improved de novo assembly of the sheep reference genome.
Findings: Short-read Illumina (55× coverage), long-read Pacific Biosciences (75× coverage), and Hi-C data from this ewe retrieved from public databases were combined with an additional 50× coverage of Oxford Nanopore data and assembled with Canu v1.9. The assembled contigs were scaffolded using Hi-C data with Salsa v2.2, gap-filled with PBsuite v15.8.24, and polished with Nanopolish v0.12.5. After duplicate contig removal with PurgeDups v1.0.1, chromosomes were oriented and polished with 2 rounds of a pipeline that consisted of freebayes v1.3.1 to call variants, Merfin to validate them, and BCFtools to generate the consensus fasta. The ARS-UI_Ramb_v2.0 assembly is 2.63 Gb in length and has improved continuity (contig NG50 of 43.18 Mb), with a 19- and 38-fold decrease in the number of scaffolds compared with Oar_rambouillet_v1.0 and Oar_v4.0, respectively. ARS-UI_Ramb_v2.0 has greater per-base accuracy and fewer insertions and deletions identified from mapped RNA sequence than previous assemblies.
Conclusions: The ARS-UI_Ramb_v2.0 assembly is a substantial improvement in contiguity that will optimize the functional annotation of the sheep genome and facilitate improved mapping accuracy of genetic variant and expression data for traits in sheep.
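The contig NG50 quoted in the Findings can be illustrated with a minimal sketch. It assumes the commonly used definition: the length of the shortest contig among the longest contigs that together cover at least half of an estimated genome size. The function name `ng50`, the toy contig lengths, and the genome sizes are hypothetical illustrations, not values or tooling taken from the paper.

```python
from typing import Iterable, Optional


def ng50(contig_lengths: Iterable[int], genome_size: int) -> Optional[int]:
    """Return the NG50 of a set of contigs, or None if it is undefined.

    Assumed definition: walk the contigs from longest to shortest and
    return the length of the contig at which the running total first
    reaches half of the estimated genome size.
    """
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")

    half_genome = genome_size / 2
    running_total = 0
    for length in sorted(contig_lengths, reverse=True):
        running_total += length
        if running_total >= half_genome:
            return length

    # The contigs together cover less than half of the genome estimate.
    return None


# Hypothetical toy data (arbitrary units), not from the assembly.
toy_contigs = [50, 40, 30, 20, 10]
print(ng50(toy_contigs, genome_size=200))  # 50 + 40 + 30 = 120 >= 100, so 30
```

Passing the total assembly length as `genome_size` turns the same calculation into the N50, which is why NG50 is the fairer choice when comparing assemblies of different total size against one genome estimate.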
Download full-text PDF |
Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8848310 | PMC |
http://dx.doi.org/10.1093/gigascience/giab096 | DOI Listing |
Nucleic Acids Res
January 2025
Department of Convergent Bioscience and Informatics, College of Bioscience and Biotechnology, Chungnam National University, 99, Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea.
Large genetic variants can be generated via homologous recombination (HR), such as polymerase theta-mediated end joining (TMEJ) or single-strand annealing (SSA). Given that these HR-based mechanisms leave specific genomic signatures, we developed GDBr, a genomic signature interpretation tool for DNA double-strand break repair mechanisms using high-quality genome assemblies. We applied GDBr to a draft human pangenome reference.
J Diabetes Investig
January 2025
Department of Obstetrics and Gynecology, Tohoku University Graduate School of Medicine, Sendai, Miyagi, Japan.
Aims: This study investigated the association between maternal age and early and late gestational diabetes mellitus (GDM).
Methods: In total, 72,270 pregnant women were included in this prospective birth cohort study. Associations between maternal age and early GDM (diagnosed at <24 gestational weeks) and late GDM (diagnosed at ≥24 gestational weeks) were evaluated using a multinomial logistic regression model with possible confounding factors.
Cancers (Basel)
January 2025
Unit of Dermatology, Department of Medicine, University of Padova, 35122 Padua, Italy.
Int J Mol Sci
January 2025
Federal Research Center for Innovator and Emerging Biomedical and Pharmaceutical Technologies, 125315 Moscow, Russia.
A pseudogene is a non-functional copy of a protein-coding gene. Processed pseudogenes, which are created by the reverse transcription of mRNA and subsequent integration of the resulting cDNA into the genome, being a major pseudogene class, represent a significant challenge in genome analysis due to their high sequence similarity to the parent genes and their frequent absence in the reference genome. This homology can lead to errors in variant identification, as sequences derived from processed pseudogenes can be incorrectly assigned to parental genes, complicating correct variant calling.
Int J Mol Sci
December 2024
Department of Pediatrics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA.
FMR1 (Fragile X messenger ribonucleoprotein 1), located on the X-chromosome, encodes the multi-functional FMR1 protein (FMRP), critical to brain development and function. Trinucleotide CGG repeat expansions at this locus cause a range of neurological disorders, collectively referred to as Fragile X-related conditions. The most well-known of these is Fragile X syndrome, a neurodevelopmental disorder associated with syndromic facial features, autism, intellectual disabilities, and seizures.