Haploid cell lines are a valuable research tool with broad applicability for genetic assays. As such, the fully haploid human cell line eHAP1 has been used in a wide array of studies. However, the absence of a corresponding reference genome sequence for this cell line has limited its use in experiments that depend on an available sequence, such as capture-clone methodologies. We generated ~15× coverage Nanopore long reads from ten GridION flowcells and used these data to assemble a de novo draft genome with minimap and miniasm, which we then polished with Racon. This assembly was further polished with Pilon and ntEdit using previously generated, low-coverage Illumina short reads, resulting in a hybrid eHAP1 assembly with >90% complete BUSCO scores. We further assessed the eHAP1 long-read data for structural variants using Sniffles and identified a variety of rearrangements, including a previously established Philadelphia translocation. Finally, we demonstrate how some of these variants overlap open chromatin regions, potentially impacting regulatory regions. Integrating long and short reads in this way demonstrates the utility of combining sequencing platforms to generate a high-quality reference genome de novo solely from low-coverage data. We expect the resulting eHAP1 genome assembly to provide a useful resource to enable novel experimental applications in this important model cell line.
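The overlap between structural variants and open chromatin regions mentioned above can be illustrated with a minimal sketch. This is not the authors' analysis: the `Interval` type, the helper names, and all coordinates below are hypothetical, and half-open intervals (as in BED files) are assumed.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    """A genomic interval; half-open coordinates are an assumption here."""
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def overlaps(a: Interval, b: Interval) -> bool:
    """True if both intervals are on the same chromosome and share a base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def variants_in_open_chromatin(
    variants: List[Interval], peaks: List[Interval]
) -> List[Interval]:
    """Return the variants that overlap at least one open chromatin peak."""
    return [v for v in variants if any(overlaps(v, p) for p in peaks)]


# Hypothetical coordinates, for illustration only (not from the eHAP1 data).
variants = [
    Interval("chr9", 1000, 5000),
    Interval("chr22", 200, 300),
    Interval("chr1", 50, 60),
]
peaks = [
    Interval("chr9", 4500, 4800),
    Interval("chr22", 300, 400),  # adjacent to the chr22 variant, no shared base
]

hits = variants_in_open_chromatin(variants, peaks)
print(hits)
```

The pairwise comparison is quadratic in the number of intervals, which is fine for a sketch; for genome-scale call sets a sorted sweep or an interval tree would be the more usual choice.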
Source | Link |
---|---|
PMC | http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10298834 |
DOI | http://dx.doi.org/10.1016/j.ygeno.2020.01.009 |
Nucleic Acids Res
January 2025
Department of Convergent Bioscience and Informatics, College of Bioscience and Biotechnology, Chungnam National University, 99, Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea.
Large genetic variants can be generated via homologous recombination (HR), such as polymerase theta-mediated end joining (TMEJ) or single-strand annealing (SSA). Given that these HR-based mechanisms leave specific genomic signatures, we developed GDBr, a genomic signature interpretation tool for DNA double-strand break repair mechanisms using high-quality genome assemblies. We applied GDBr to a draft human pangenome reference.
J Diabetes Investig
January 2025
Department of Obstetrics and Gynecology, Tohoku University Graduate School of Medicine, Sendai, Miyagi, Japan.
Aims: This study investigated the association between maternal age and early and late gestational diabetes mellitus (GDM).
Methods: In total, 72,270 pregnant women were included in this prospective birth cohort study. Associations between maternal age and early GDM (diagnosed at <24 gestational weeks) and late GDM (diagnosed at ≥24 gestational weeks) were evaluated using a multinomial logistic regression model with possible confounding factors.
Cancers (Basel)
January 2025
Unit of Dermatology, Department of Medicine, University of Padova, 35122 Padua, Italy.
Int J Mol Sci
January 2025
Federal Research Center for Innovator and Emerging Biomedical and Pharmaceutical Technologies, 125315 Moscow, Russia.
A pseudogene is a non-functional copy of a protein-coding gene. Processed pseudogenes, which are created by the reverse transcription of mRNA and subsequent integration of the resulting cDNA into the genome, being a major pseudogene class, represent a significant challenge in genome analysis due to their high sequence similarity to the parent genes and their frequent absence in the reference genome. This homology can lead to errors in variant identification, as sequences derived from processed pseudogenes can be incorrectly assigned to parental genes, complicating correct variant calling.
Int J Mol Sci
December 2024
Department of Pediatrics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA.
FMR1 (Fragile X messenger ribonucleoprotein 1), located on the X-chromosome, encodes the multi-functional FMR1 protein (FMRP), critical to brain development and function. Trinucleotide CGG repeat expansions at this locus cause a range of neurological disorders, collectively referred to as Fragile X-related conditions. The most well-known of these is Fragile X syndrome, a neurodevelopmental disorder associated with syndromic facial features, autism, intellectual disabilities, and seizures.