5,605 results match your criteria: "Genome research[Journal]"

High-throughput sequencing (HTS) technologies have been instrumental in investigating biological questions at the bulk and single-cell levels. Comparative analysis of two HTS data sets often relies on testing the statistical significance for the difference of two negative binomial distributions (DOTNB). Although negative binomial distributions are well studied, the theoretical results for DOTNB remain largely unexplored.
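As a rough illustration of what a DOTNB probability looks like, the probability that the difference of two independent negative binomial counts equals d can be approximated by a truncated sum. This is a brute-force sketch, not the method of the article above; the (r, p) parameterization, the function names, and the `max_count` cutoff are all assumptions made for this example.

```python
import math

def nb_pmf(k, r, p):
    """P(X = k) for X ~ NB(r, p): k failures before the r-th success,
    with success probability p (assumed parameterization, 0 < p < 1)."""
    log_pmf = (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
               + r * math.log(p) + k * math.log1p(-p))
    return math.exp(log_pmf)

def dotnb_pmf(d, r1, p1, r2, p2, max_count=500):
    """Approximate P(X1 - X2 = d) for independent X1 ~ NB(r1, p1) and
    X2 ~ NB(r2, p2) by summing over X2 = j, X1 = j + d, truncated at max_count."""
    start = max(0, -d)  # X1 = j + d must be non-negative
    return sum(nb_pmf(j + d, r1, p1) * nb_pmf(j, r2, p2)
               for j in range(start, max_count))

if __name__ == "__main__":
    # Probability that two identically distributed counts differ by exactly 3.
    print(dotnb_pmf(3, r1=5, p1=0.5, r2=5, p2=0.5))
```

The truncation is only adequate when both distributions have negligible mass beyond `max_count`; heavier-tailed settings would need a larger cutoff.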

Detecting m6A RNA modification from nanopore sequencing using a semisupervised learning framework.

Genome Res

November 2024

Ray and Stephanie Lane Computational Biology Department, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA;

Direct nanopore-based RNA sequencing can be used to detect posttranscriptional base modifications, such as N6-methyladenosine (m6A) methylation, based on the electric current signals produced by the distinct chemical structures of modified bases. A key challenge is the scarcity of adequate training data with known methylation modifications. We present Xron, a hybrid encoder-decoder framework that delivers a direct methylation-distinguishing basecaller by training on synthetic RNA data and immunoprecipitation (IP)-based experimental data in two steps.

Transmission of carbapenem-resistant Enterobacterales (CRE) in hospitals has been shown to occur through complex, multifarious networks driven by both clonal spread and horizontal transfer mediated by plasmids and other mobile genetic elements. We performed nanopore long-read sequencing on CRE isolates from a large urban hospital system to determine the overall contribution of plasmids to CRE transmission and identify specific plasmids implicated in the spread of the carbapenemase (KPC) gene. Six hundred and five CRE isolates collected between 2009 and 2018 first underwent Illumina sequencing for genome-wide genotyping; 435 KPC gene-positive isolates were then successfully nanopore sequenced to generate hybrid assemblies including circularized KPC gene-harboring plasmids.

Nanopore strand-specific mismatch enables de novo detection of bacterial DNA modifications.

Genome Res

November 2024

Department of Infectious Diseases and Public Health, Jockey Club College of Veterinary Medicine and Life Sciences, City University of Hong Kong, Hong Kong 999077, China;

DNA modifications in bacteria present diverse types and distributions, playing crucial functional roles. Current methods for detecting bacterial DNA modifications via nanopore sequencing typically involve comparing raw current signals to a methylation-free control. In this study, we found that bacterial DNA modification induces errors in nanopore reads.

Article Synopsis
  • The 1000 Genomes Project and Oxford Nanopore Technologies are working together to produce long-read sequencing (LRS) data from at least 800 samples to enhance the identification of genetic variations and better understand human genetic diversity.
  • Initial analysis of 100 samples shows high accuracy in detecting genetic variants, including structural variants that disrupt gene function, and provides valuable data for the clinical genetics community to advance research on pathogenic variations.
Accurate predictive models of future disease onset are crucial for effective preventive healthcare, yet longitudinal data sets linking early risk factors to subsequent health outcomes are limited. To overcome this challenge, we introduce a novel framework, Predictive Risk modeling using Mendelian Randomization (PRiMeR), which utilizes genetic effects as supervisory signals to learn disease risk predictors without relying on longitudinal data. To do so, PRiMeR leverages risk factors and genetic data from a healthy cohort, along with results from genome-wide association studies of diseases of interest.

Article Synopsis
  • The human major histocompatibility complex (MHC) is a crucial part of the immune system, located on Chromosome 6, and is involved in various health traits and diseases, but it's complex to study.
  • A new method using long-read sequencing technologies allows for precise targeted sequencing and haplotypic assembly of the MHC region in samples with two different alleles.
  • The approach has been tested successfully, showing high coverage and accuracy, making it a cost-effective alternative to whole-genome sequencing that could advance research in immunology and genetics.
Article Synopsis
  • The poly(A) signal is critical for determining the 3' end of mature mRNA transcripts, with the common signal in humans being the AAUAAA hexamer, while a unique hexamer (AGURAA) was discovered in a deeply branching eukaryote.
  • Research showed that the AAUAAA signal is likely ancestral, appearing in at least four different eukaryotic clades, indicating its evolutionary significance.
  • Findings also highlighted that auxiliary elements influencing cleavage sites are variable and can differ within species, suggesting that the processes governing gene expression are dynamic and warrant further study for potential therapeutic applications against eukaryotic pathogens.
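The two hexamers named in the synopsis above can be located in a sequence with a simple pattern scan. The sketch below is illustrative only and is not taken from the article; the function name and output format are assumptions, and R is expanded using the IUPAC convention (R = A or G).

```python
import re

# Lookahead patterns so that overlapping hits are all reported.
# R in AGURAA is the IUPAC code for a purine (A or G).
SIGNALS = {
    "AAUAAA": re.compile(r"(?=(AAUAAA))"),
    "AGURAA": re.compile(r"(?=(AGU[AG]AA))"),
}

def find_polya_signals(sequence):
    """Return sorted (position, signal_name, matched_hexamer) tuples.

    Accepts RNA or DNA input; T is converted to U before scanning.
    Positions are 0-based offsets into the input sequence.
    """
    rna = sequence.upper().replace("T", "U")
    hits = []
    for name, pattern in SIGNALS.items():
        for match in pattern.finditer(rna):
            hits.append((match.start(), name, match.group(1)))
    return sorted(hits)

if __name__ == "__main__":
    for hit in find_polya_signals("GGAAUAAACCAGUGAAUU"):
        print(hit)
```

A scan like this only finds candidate hexamers; it says nothing about the auxiliary elements or the cleavage-site usage discussed in the synopsis.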
The COVID-19 pandemic has highlighted the critical role of genomic surveillance for guiding policy and control. Timeliness is key, but sequence alignment and phylogeny slow most surveillance techniques. Millions of SARS-CoV-2 genomes have been assembled.

Systematic identification of interchromosomal interaction networks supports the existence of specialized RNA factories.

Genome Res

October 2024

Molecular Biotechnology Center "Guido Tarone," Department of Molecular Biotechnology and Health Sciences, University of Turin, 10126 Torino, Italy

Article Synopsis
  • Scientists usually study how parts of the same chromosome connect (intrachromosomal contacts), but much less is known about how different chromosomes interact (interchromosomal contacts).
  • They created a new computational method called trans-C that finds these connections using data from Hi-C experiments.
  • The method was tested with different models and showed that genes that work together often stay close to each other in the cell, which supports the idea of specialized sites where RNA is produced efficiently.
Designing realistic regulatory DNA with autoregressive language models.

Genome Res

October 2024

Biology Research|AI Development, gRED Computational Sciences, Genentech, South San Francisco, California 94080, USA;

Cis-regulatory elements (CREs), such as promoters and enhancers, are DNA sequences that regulate the expression of genes. The activity of a CRE is influenced by the order, composition, and spacing of sequence motifs that are bound by proteins called transcription factors (TFs). Synthetic CREs with specific properties are needed for biomanufacturing as well as for many therapeutic applications including cell and gene therapy.

The transcription factor (TF) cone-rod homeobox (CRX) is essential for the differentiation and maintenance of photoreceptor cell identity. Several human variants cause degenerative retinopathies, but most are variants of uncertain significance. We performed a deep mutational scan (DMS) of nearly all possible single amino acid substitutions in CRX using a cell-based transcriptional reporter assay, curating a high-confidence list of nearly 2000 variants with altered transcriptional activity.

Contrasting and combining transcriptome complexity captured by short and long RNA sequencing reads.

Genome Res

October 2024

Department of Computer and Information Sciences, School of Engineering, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA;

Mapping transcriptomic variations using either short- or long-read RNA sequencing is a staple of genomic research. Long reads are able to capture entire isoforms and overcome repetitive regions, whereas short reads still provide improved coverage and error rates. Yet, open questions remain, such as how to quantitatively compare the technologies, can we combine them, and what is the benefit of such a combined view? We tackle these questions by first creating a pipeline to assess matched long- and short-read data using a variety of transcriptome statistics.

The 10q11.22 chromosomal region is a duplication-rich interval of the human genome and one of the last to be fully assembled. It carries copy number-variable genes associated with intellectual disability, bipolar disorder, and obesity.

Article Synopsis
  • Long-read genome sequencing (lrGS) offers more accurate and comprehensive variant detection for rare diseases compared to short-read genome sequencing (srGS), though its exact impact on diagnostic yield remains unclear.
  • In a study involving 96 individuals suspected of having genetic rare diseases, lrGS identified new or potentially relevant genetic variants in 16.7% of participants, with 9.4% possessing pathogenic or likely pathogenic variants.
  • While lrGS provided additional insights beyond what was captured by srGS, particularly with structural variations, the study suggests that growing lrGS datasets will further enhance diagnostic capabilities in the future.
Interspecies hybridization is prevalent in various eukaryotic lineages and plays important roles in phenotypic diversification, adaptation, and speciation. To better understand the changes that occurred in the different subgenomes of a hybrid species and how they facilitate adaptation, we have completed chromosome-level de novo assemblies of all chromosomes for a recently formed hybrid yeast, strain CBS380, using Oxford Nanopore Technologies' MinION long-read sequencing. We characterize the genome and compare it with its parent species and other genomes to better understand genome evolution after a relatively recent hybridization event.

The ability to generate multiple RNA transcript isoforms from the same gene is a general phenomenon in eukaryotes. However, the complexity and diversity of alternative isoforms in natural populations remain largely unexplored. Using a newly developed full-length transcript enrichment protocol with 5' CAP selection, we sequenced full-length RNA transcripts of 48 individuals from outbred populations and subspecies, and from closely related sister species as outgroups.

The use of long-read direct RNA sequencing (DRS) and PCR cDNA sequencing (PCS) in clinical oncology remains limited, with no direct comparison between the two methods. We used DRS and PCS to study clear cell renal cell carcinoma (ccRCC), focusing on new transcript and gene discovery. Twelve primary ccRCC archival tumors, six from patients who went on to relapse, were analyzed.

X-linked genetic disorders typically affect females less severely than males owing to the presence of a second X Chromosome not carrying the deleterious variant. However, the phenotypic expression in females is highly variable, which may be explained by an allelic skew in X-Chromosome inactivation. Accurate measurement of X inactivation skew is crucial to understand and predict disease phenotype in carrier females, with prediction especially relevant for degenerative conditions.

Article Synopsis
  • Nanopore direct RNA sequencing (DRS) is an effective method for analyzing RNA modifications, particularly in studying the epitranscriptome, as it can identify various modifications in full-length native RNA.
  • However, traditional basecalling models struggle with detecting N6-methyladenosine (m6A) modifications since they often lead to low error signals, which makes it challenging to identify these modifications at lower abundance.
  • The study shows that alternative RNA basecalling models trained on unmodified sequences can significantly enhance the detection of m6A and other modifications, achieving up to 97% accuracy and increasing the mapping of reads, especially in shorter RNA segments.
The vast majority of deeply intronic genomic variants are benign, but some extremely rare or private deep intronic variants lead to exonification of intronic sequence with abnormal transcriptional consequences. Damaging variants of this class are likely underreported as causes of disease for several reasons: Most clinical DNA and RNA testing does not include full intronic sequences; many of these variants lie in complex repetitive regions that cannot be aligned from short-read whole-genome sequence; and, until recently, consequences of deep intronic variants were not accurately predicted by in silico tools. We evaluated the frequency and consequences of rare deep intronic variants for families severely affected with breast, ovarian, pancreatic, and/or metastatic prostate cancer, but with no causal variant identified by any previous genomic or cDNA-based approach.

Here, we present a method for enrichment of double-stranded cfDNA with an average length of ∼40 bp from cfDNA for high-throughput DNA sequencing. This class of cfDNA is enriched at gene promoters and binding sites of transcription factors or structural DNA-binding proteins, so that a genome-wide DNA footprint is directly captured from liquid biopsies. In short double-stranded cfDNA from healthy individuals, we find significant enrichment of 203 transcription factor motifs.

Maize phenotypes are plastic, determined by the complex interplay of genetics and environmental variables. Uncovering the genes responsible and understanding how their effects change across a large geographic region are challenging. In this study, we conducted systematic analysis to identify environmental indices that strongly influence 19 traits (including flowering time, plant architecture, and yield component traits) measured in the maize nested association mapping (NAM) population grown in 11 environments.

Mutations in splicing factor 3B subunit 1 (SF3B1) frequently occur in patients with chronic lymphocytic leukemia (CLL) and myelodysplastic syndromes (MDSs). These mutations have different effects on the disease prognosis with beneficial effect in MDS and worse prognosis in CLL patients. A full-length transcriptome approach can expand our knowledge on mutation effects on RNA splicing and its contribution to patient survival and treatment options.

Pleiotropy, measured as expression breadth across tissues, is one of the best predictors for protein sequence and expression conservation. In this study, we investigated its effect on the evolution of cis-regulatory elements (CREs). To this end, we carefully reanalyzed the Epigenomics Roadmap data for nine fetal tissues, assigning a measure of pleiotropic degree to nearly half a million CREs.
