Gene fusions created by somatic genomic rearrangements are known to play an important role in the onset and development of some cancers, such as lymphomas and sarcomas. RNA-Seq (whole transcriptome shotgun sequencing) is proving to be a useful tool for the discovery of novel gene fusions in cancer transcriptomes. However, algorithmic methods for the discovery of gene fusions using RNA-Seq data remain underdeveloped. We have developed deFuse, a novel computational method for fusion discovery in tumor RNA-Seq data. Unlike existing methods that use only unique best-hit alignments and consider only fusion boundaries at the ends of known exons, deFuse considers all alignments and all possible locations for fusion boundaries. As a result, deFuse is able to identify fusion sequences with demonstrably better sensitivity than previous approaches. To increase the specificity of our approach, we curated a list of 60 true positive and 61 true negative fusion sequences (as confirmed by RT-PCR) and trained an AdaBoost classifier on 11 novel features of the sequence data. The resulting classifier has an estimated value of 0.91 for the area under the ROC curve. We have used deFuse to discover gene fusions in 40 ovarian tumor samples, one ovarian cancer cell line, and three sarcoma samples. We report herein the first gene fusions discovered in ovarian cancer. We conclude that gene fusions are not infrequent events in ovarian cancer and that these events have the potential to substantially alter the expression patterns of the genes involved; gene fusions should therefore be considered in efforts to comprehensively characterize the mutational profiles of ovarian cancer transcriptomes.
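
The classification step described in the abstract (a boosted classifier trained on labeled fusion candidates and summarized by the area under the ROC curve) can be illustrated with a minimal sketch. This is not the deFuse implementation: it is a generic AdaBoost with one-feature threshold rules (decision stumps), written in plain Python, and the two feature columns and six rows of data are made-up placeholders rather than any of the 11 features or the 121 validated sequences mentioned above.

```python
import math

def auc(labels, scores):
    """Area under the ROC curve via the rank-sum identity:
    the fraction of (positive, negative) pairs ranked correctly, ties counting half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == -1]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def stump_predict(row, j, thr, sign):
    """Decision stump: predict `sign` if feature j >= thr, else `-sign`."""
    return sign if row[j] >= thr else -sign

def best_stump(X, y, w):
    """Exhaustively pick the stump with the lowest weighted training error."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                err = sum(wi for wi, row, yi in zip(w, X, y)
                          if stump_predict(row, j, thr, sign) != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best

def adaboost_fit(X, y, rounds=5):
    """Generic discrete AdaBoost; labels must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, j, thr, sign = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)    # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)  # vote weight of this stump
        model.append((alpha, j, thr, sign))
        # Up-weight misclassified examples, down-weight correct ones, renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(row, j, thr, sign))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def adaboost_score(model, row):
    """Weighted vote of all stumps; larger means more likely a true fusion."""
    return sum(a * stump_predict(row, j, thr, sign) for a, j, thr, sign in model)

# Hypothetical toy data: two placeholder features per candidate fusion,
# label +1 for a validated fusion and -1 for a false positive.
X = [[1, 5], [2, 4], [3, 6], [6, 1], [7, 3], [8, 2]]
y = [-1, -1, -1, 1, 1, 1]

model = adaboost_fit(X, y, rounds=5)
scores = [adaboost_score(model, row) for row in X]
print(auc(y, scores))
```

Because the toy set is trivially separable on the first feature, the printed AUC is 1.0, which says nothing about real performance. An estimate such as the 0.91 reported above would normally come from held-out or cross-validated predictions rather than from scoring the training data, as this sketch does for brevity.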

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3098195
DOI: http://dx.doi.org/10.1371/journal.pcbi.1001138
