AI Article Synopsis

  • Gene isoforms perform different functions in response to environmental changes, making their study crucial for understanding biological processes and diseases, but identifying them across genomes is technically challenging due to their high sequence similarity.
  • Traditional targeted methods based on Sanger sequencing are low-throughput and time-consuming, while next-generation sequencing (NGS) technologies produce short reads that make accurate assembly of highly similar isoforms difficult.
  • A new clustering-based analysis pipeline that uses long reads from the PacBio platform profiles highly similar gene isoforms, and it reconstructed known isoforms that de novo assembly of 454 reads could not assemble correctly.

Article Abstract

Background: Gene isoforms are commonly found in both prokaryotes and eukaryotes. Since each isoform may perform a specific function in response to changing environmental conditions, studying the dynamics of gene isoforms is important for understanding biological processes and disease conditions. However, genome-wide identification of gene isoforms is technically challenging due to the high degree of sequence identity among isoforms. The traditional targeted sequencing approach, which involves Sanger sequencing of plasmid-cloned PCR products, has low throughput and is tedious and time-consuming. Next-generation sequencing technologies such as Illumina and 454 achieve high throughput, but their short read lengths are a critical barrier to accurate assembly of highly similar gene isoforms and may result in ambiguities and false joining during sequence assembly. More recently, third-generation sequencing, represented by the PacBio platform, offers sufficient throughput and long reads that cover the full length of typical genes, and thus has the potential to profile gene isoforms reliably. However, PacBio long reads are error-prone and cannot be effectively analyzed by traditional assembly programs.
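The short-read ambiguity described in the Background can be illustrated with a toy example. The sketch below is not taken from the article: the sequences, the helper names (`reads`, `unique_reads`), and the read lengths are all hypothetical, and the reads are assumed to be error-free. It shows that when an isoform combines variants that also occur separately in two other isoforms, no short read is specific to it, whereas a full-length read is.

```python
def reads(seq, length):
    """Return the set of all error-free substrings ("reads") of the given length."""
    return {seq[i:i + length] for i in range(len(seq) - length + 1)}


def unique_reads(target, others, length):
    """Reads of the given length that occur in `target` but in none of `others`."""
    other_reads = set()
    for seq in others:
        other_reads |= reads(seq, length)
    return reads(target, length) - other_reads


# Hypothetical 60-base toy "isoforms" (not real msg sequences).
base = "ACGTTGCAAGCTTAGGCTAACGGATCCTAGGCATTCGAAGTCCGATAGGCTTACGATCGA"
iso_a = base                                              # reference
iso_c = base[:10] + "T" + base[11:50] + "G" + base[51:]   # variants at positions 10 and 50
iso_b = base[:10] + "T" + base[11:]                       # variant at position 10 only

# With 8-base reads, every read from iso_b also occurs in iso_a or iso_c,
# so short reads alone cannot show that iso_b is present in the mixture.
print(len(unique_reads(iso_b, [iso_a, iso_c], 8)))    # prints 0

# A full-length read spans both variant positions and identifies iso_b.
print(len(unique_reads(iso_b, [iso_a, iso_c], 60)))   # prints 1
```

In this toy setting a read has to span both variant positions before it can distinguish `iso_b`, which is the kind of linkage information that full-length reads provide.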

Results: We present a clustering-based analysis pipeline integrated with PacBio sequencing data for profiling highly similar gene isoforms. This approach was first evaluated in comparison to de novo assembly of 454 reads using a benchmark admixture containing 10 known, cloned msg genes encoding the major surface glycoprotein of Pneumocystis jirovecii. All 10 msg isoforms were successfully reconstructed with the expected length (~1.5 kb) and correct sequence by the new approach, while the 454 reads could not be correctly assembled by any of the assembly programs tested. When applied to an additional benchmark admixture containing 22 known P. jirovecii msg isoforms, this approach accurately reconstructed all but 4 of these isoforms at full length (~3 kb); these 4 isoforms were present at low concentrations in the admixture. Finally, when applied to the original clinical sample from which the 22 known msg isoforms were cloned, this approach not only identified all known isoforms accurately (~3 kb each) but also identified 48 novel isoforms.
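The abstract does not describe the pipeline's internals, so the following is only a minimal sketch of the general idea of clustering error-prone reads and calling a consensus per cluster. It is not the authors' implementation. All names (`identity`, `greedy_cluster`, `consensus`, `mutate`), the toy sequences, and the 0.93 threshold are assumptions, and the sketch assumes equal-length reads with substitution errors only. Real PacBio reads contain insertions and deletions and would need alignment before identity or consensus can be computed.

```python
from collections import Counter


def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_cluster(reads, threshold):
    """Put each read into the first cluster whose seed (first read) it matches."""
    clusters = []
    for read in reads:
        for cluster in clusters:
            if identity(read, cluster[0]) >= threshold:
                cluster.append(read)
                break
        else:
            clusters.append([read])  # no cluster matched: start a new one
    return clusters


def consensus(cluster):
    """Majority base per column; assumes equal-length, substitution-only reads."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*cluster))


def mutate(seq, pos, base):
    """Return seq with the base at position pos replaced."""
    return seq[:pos] + base + seq[pos + 1:]


# Hypothetical 40-base isoforms that differ at 4 positions (90% identical).
iso_1 = "ACGT" * 10
iso_2 = iso_1
for pos, new_base in [(5, "G"), (15, "A"), (25, "G"), (35, "A")]:
    iso_2 = mutate(iso_2, pos, new_base)

# Three noisy reads per isoform, each with one simulated substitution error.
noisy_reads = [
    mutate(iso_1, 0, "T"), mutate(iso_2, 2, "A"),
    mutate(iso_1, 10, "C"), mutate(iso_2, 12, "C"),
    mutate(iso_1, 20, "T"), mutate(iso_2, 22, "T"),
]

clusters = greedy_cluster(noisy_reads, threshold=0.93)
recovered = [consensus(c) for c in clusters]
print(len(clusters), recovered == [iso_1, iso_2])  # prints: 2 True
```

In this toy case the random errors are outvoted within each cluster while the four true differences between the isoforms are preserved. The threshold has to sit between the identity of reads from the same isoform (0.95 here) and the identity of reads from different isoforms (0.85 here).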

Conclusions: PacBio sequencing integrated with the clustering-based analysis pipeline achieves high-throughput and high-resolution discrimination of highly similar sequences, and can serve as a new approach for genome-wide characterization of gene isoforms and other highly repetitive sequences.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4820869
DOI: http://dx.doi.org/10.1186/s13040-016-0090-8

Publication Analysis

Top Keywords

gene isoforms: 32
isoforms: 15
highly gene: 12
long reads: 12
msg isoforms: 12
gene: 8
clustering-based analysis: 8
analysis pipeline: 8
pacbio sequencing: 8
isoforms approach: 8

Similar Publications

RNA-binding motif protein RBM39 enhances the proliferation of gastric cancer cells by facilitating an oncogenic splicing switch in MRPL33.

Acta Pharmacol Sin

January 2025

Jiangsu Key Laboratory of Neuropsychiatric Diseases and College of Pharmaceutical Sciences, The Fourth Affiliated Hospital of Soochow University, Jiangsu Province Engineering Research Center of Precision Diagnostics and Therapeutics Development, Jiangsu Key Laboratory of Preventive and Translational Medicine for Geriatric Diseases, Suzhou Key Laboratory of Drug Research for Prevention and Treatment of Hyperlipidemic Diseases, Soochow University, Suzhou, 215123, China.

Gastric cancer is a malignant gastrointestinal disease characterized by high morbidity and mortality rates worldwide. The occurrence and progression of gastric cancer are influenced by various factors, including the abnormal alternative splicing of key genes. Recently, RBM39 has emerged as a tumor biomarker that regulates alternative splicing in several types of cancer.

Translationally controlled tumor protein (TCTP) is a well-conserved, ubiquitously expressed multifunctional protein found in many organisms and is involved in many pathophysiological processes such as cell proliferation, differentiation, development and cell death. The role of TCTP in anti-apoptosis and cancer metastasis makes it a promising candidate for cancer therapy. Dictyostelium discoideum, a protist, has two isoforms (TCTP1 and TCTP2, now referred to as TPT1 and TPT2), of which we have previously characterized TPT1.

Epigenetic processes are the critical events in carcinogenesis. Histone modification plays a crucial role in gene expression regulation, where histone deacetylases (HDACs) are key players in epigenetic processes. Inhibiting HDACs has shown promise in modern cancer therapy.

Amyotrophic Lateral Sclerosis (ALS) has traditionally been managed as a neuromuscular disorder. However, recent evidence suggests involvement of non-motor domains. This study aims to evaluate the impact of APOE and MAPT genotypes on the cognitive features of ALS.

Background: Heterogeneity in the progression of clinical dementia poses a significant challenge, impeding the effectiveness of current therapies for Alzheimer's disease (AD). The molecular mechanisms governing heterogeneity in AD progression remain a critical knowledge gap that precludes rational therapeutic design. To decipher these mechanisms, we investigated the biochemical and biophysical properties of tau present in the inferior temporal gyrus (ITG) and prefrontal cortex (PFC) brain regions of AD patients who had varying disease progression rates. To explore gene expression changes in the ITG that are associated with tau pathology and cognitive decline, we used RNA sequencing for molecular characterization of patients displaying tau and clinical heterogeneity.
