Background: Expressed sequences (e.g. ESTs) are a strong source of evidence for improving gene structures and predicting reliable alternative splicing events. When a genome assembly is available, ESTs can be used to generate gene-oriented clusters with the well-established EasyCluster software. EST-like sequences can now be produced on a massive scale using Next Generation Sequencing (NGS) technologies. To handle genome-scale transcriptome data, we present EasyCluster2, a reimplementation of EasyCluster that speeds up the creation of gene-oriented clusters and facilitates downstream analyses such as the assembly of full-length transcripts and the detection of splicing isoforms.

Results: EasyCluster2 has been developed to facilitate the genome-based clustering of EST-like sequences generated through the NGS 454 technology. Reads mapped onto the reference genome can be uploaded using the standard GFF3 file format. Alignments are first parsed to produce an initial collection of pseudo-clusters by grouping reads according to the overlap of their genomic coordinates on the same strand. EasyCluster2 then refines read grouping by including in each cluster only reads sharing at least one splice site, and optionally performs a Smith-Waterman alignment in the region surrounding splice sites to correct for potential alignment errors. In addition, EasyCluster2 can include unspliced reads, which generally account for >50% of 454 datasets, and collapses overlapping clusters. Finally, EasyCluster2 can assemble full-length transcripts using a Directed-Acyclic-Graph-based strategy, which simplifies the identification of alternative splicing isoforms, aided by an implementation of the widely used AStalavista methodology. Accuracy and performance have been tested on real as well as simulated datasets.
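
The first two clustering steps described above (pseudo-clusters from coordinate overlap on the same strand, then refinement by shared splice sites) can be illustrated with a minimal Python sketch. This is not EasyCluster2's actual code or API: the `Read` type and the function names `pseudo_clusters` and `refine_by_splice_sites` are hypothetical, reads are assumed to be already parsed into sorted aligned blocks with 1-based inclusive coordinates (as in GFF3), and the optional Smith-Waterman realignment, cluster collapsing, and transcript assembly are left out.

```python
from collections import defaultdict
from typing import NamedTuple, Tuple


class Read(NamedTuple):
    """Hypothetical container for one mapped read (not an EasyCluster2 type)."""
    name: str
    chrom: str
    strand: str
    blocks: Tuple[Tuple[int, int], ...]  # sorted (start, end) aligned blocks


def splice_sites(read):
    """Intron boundary positions implied by gaps between adjacent blocks."""
    sites = set()
    for left, right in zip(read.blocks, read.blocks[1:]):
        sites.add(left[1] + 1)   # first intron base
        sites.add(right[0] - 1)  # last intron base
    return sites


def pseudo_clusters(reads):
    """Group reads whose genomic spans overlap on the same chromosome and strand."""
    by_locus = defaultdict(list)
    for r in reads:
        by_locus[(r.chrom, r.strand)].append(r)
    clusters = []
    for key in sorted(by_locus):
        current, current_end = [], None
        for r in sorted(by_locus[key], key=lambda x: x.blocks[0][0]):
            start, end = r.blocks[0][0], r.blocks[-1][1]
            if current and start <= current_end:
                current.append(r)
                current_end = max(current_end, end)
            else:
                if current:
                    clusters.append(current)
                current, current_end = [r], end
        if current:
            clusters.append(current)
    return clusters


def refine_by_splice_sites(cluster):
    """Split a pseudo-cluster into groups of spliced reads linked by at least
    one shared splice site; unspliced reads are returned separately."""
    spliced = [r for r in cluster if len(r.blocks) > 1]
    unspliced = [r for r in cluster if len(r.blocks) == 1]
    parent = list(range(len(spliced)))  # union-find over spliced reads

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # splice site -> index of the first read using it
    for i, r in enumerate(spliced):
        for site in splice_sites(r):
            if site in first_seen:
                parent[find(i)] = find(first_seen[site])
            else:
                first_seen[site] = i
    groups = defaultdict(list)
    for i, r in enumerate(spliced):
        groups[find(i)].append(r)
    return list(groups.values()), unspliced
```

In this sketch, sharing a splice site is treated as transitive (reads A and C end up together if both share a site with B), which is one possible reading of "sharing at least one splice site"; how unspliced reads are later assigned to refined clusters is deliberately left open.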

Conclusions: EasyCluster2 is a unique tool for clustering and assembling transcriptome reads produced with 454 technology, as well as ESTs and full-length transcripts. The clustering procedure is enhanced by the use of genome annotations and unspliced reads. Overall, EasyCluster2 can effectively detect splicing isoforms, since it can refine exon-exon junctions and explore alternative splicing without known reference transcripts. Results in GFF3 format can be browsed in the UCSC Genome Browser. EasyCluster2 is therefore a powerful tool for generating reliable clusters for gene expression studies, and it makes the analysis accessible to researchers not skilled in bioinformatics.

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4271567 (PMC)
http://dx.doi.org/10.1186/1471-2105-15-S15-S7 (DOI)
