Publications by authors named "Pegah Afshar"

Motivation: Isoform deconvolution is an NP-hard problem. The accuracy of the proposed solutions is far from perfect. At present, it is not known if gene structure and isoform concentration can be uniquely inferred given paired-end reads, and there is no objective method to select the fragment length to improve the number of identifiable genes.

Despite rapid advances in sequencing technologies, accurately calling genetic variants present in an individual genome from billions of short, errorful sequence reads remains challenging. Here we show that a deep convolutional neural network can call genetic variation in aligned next-generation sequencing read data by learning statistical relationships between images of read pileups around putative variant and true genotype calls. The approach, called DeepVariant, outperforms existing state-of-the-art tools.

The Bcl-2 protein Bnip3 is crucial for provoking oxidative injury to mitochondria following anthracycline treatment or ischemia-reperfusion injury. Herein, we investigate the effects of the polyphenolic compound ellagic acid (EA) on Bnip3-mediated mitochondrial injury and necrotic cell death in cardiac myocytes. In contrast to vehicle-treated cardiomyocytes, Bnip3 was highly enriched in mitochondrial fractions of cardiac myocytes treated with the anthracycline doxorubicin or in cells subjected to hypoxia (HPX).

RNA-sequencing (RNA-seq) is an essential technique for transcriptome studies, and hundreds of analysis tools have been developed since its debut. Although recent efforts have attempted to assess the latest available tools, they have not evaluated the analysis workflows comprehensively to unleash the power within RNA-seq. Here we conduct an extensive study analysing a broad spectrum of RNA-seq workflows.

Third generation sequencing (TGS) technologies are highly promising, but their long and noisy reads are difficult to align using existing algorithms. Here, we present COSINE, a conceptually new method designed specifically for aligning long reads contaminated by a high level of errors. COSINE computes the context similarity of two stretches of nucleobases given the similarity over distributions of their short k-mers (k = 3-4) along the sequences.
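
The idea of comparing two sequence stretches through their short k-mer distributions can be illustrated with a minimal sketch. It is not COSINE's implementation: the abstract does not say which similarity measure is used, so cosine similarity over raw k-mer counts is an assumption here, and the function names `kmer_profile`, `cosine_similarity`, and `context_similarity` are hypothetical.

```python
from collections import Counter
from math import sqrt


def kmer_profile(seq, k=3):
    """Count the overlapping k-mers of a sequence (empty if len(seq) < k)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_similarity(p, q):
    """Cosine of the angle between two k-mer count vectors (assumed measure)."""
    dot = sum(count * q[kmer] for kmer, count in p.items() if kmer in q)
    norm_p = sqrt(sum(v * v for v in p.values()))
    norm_q = sqrt(sum(v * v for v in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0  # at least one sequence is shorter than k
    return dot / (norm_p * norm_q)


def context_similarity(a, b, k=3):
    """Similarity of two nucleotide stretches based on their k-mer profiles."""
    return cosine_similarity(kmer_profile(a, k), kmer_profile(b, k))


# A stretch compared with a copy carrying one substitution (A -> T).
reference = "ACGTACGTAC"
noisy_read = "ACGTTCGTAC"
score = context_similarity(reference, noisy_read, k=3)
```

Because a single error only disturbs the k-mers that overlap it, the score for the noisy copy stays well above zero while remaining below that of an exact match, which is the property that makes k-mer profiles attractive for error-prone long reads.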

The human superoxide dismutase 1 (SOD1) gene is responsible for neutralizing supercharged oxygen radicals within the cell. Mutation in the SOD1 gene causes amyotrophic lateral sclerosis (ALS). Recent studies have shown involvement of the cerebellum in ALS, although the cerebellar contribution in transgenic mice remains unclear.

SomaticSeq is an accurate somatic mutation detection pipeline implementing a stochastic boosting algorithm to produce highly accurate somatic mutation calls for both single nucleotide variants and small insertions and deletions. The workflow currently incorporates five state-of-the-art somatic mutation callers, and extracts over 70 individual genomic and sequencing features for each candidate site. A training set is provided to an adaptively boosted decision tree learner to create a classifier for predicting mutation statuses.
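
Adaptive boosting over decision trees can be sketched in a few lines. The example below is not SomaticSeq's code: it uses depth-1 trees (decision stumps) as the weak learner, and the two features per candidate site (variant allele fraction and number of supporting callers) and all values are invented purely for illustration. Labels are +1 for a true somatic mutation and -1 otherwise.

```python
import math


def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, sign) of the best one-split rule."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                preds = [sign if row[j] >= thr else -sign for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best


def stump_predict(stump, row):
    _, j, thr, sign = stump
    return sign if row[j] >= thr else -sign


def adaboost(X, y, rounds=5):
    """Fit a weighted ensemble of stumps, re-weighting misclassified samples each round."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)     # vote weight of this stump
        model.append((alpha, stump))
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, yi, row in zip(w, y, X)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def predict(model, row):
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in model)
    return 1 if score >= 0 else -1


# Hypothetical training set: [variant allele fraction, number of supporting callers]
X = [[0.30, 4], [0.25, 5], [0.02, 1], [0.05, 1], [0.40, 3], [0.01, 2]]
y = [1, 1, -1, -1, 1, -1]
model = adaboost(X, y)
```

Calling `predict(model, [0.35, 4])` then classifies a new candidate site. A real pipeline would use many more features per site and deeper trees; the sketch only shows how boosting combines weak rules into one classifier.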

We developed an innovative hybrid sequencing approach, IDP-fusion, to detect fusion genes, determine fusion sites and identify and quantify fusion isoforms. IDP-fusion is the first method to study gene fusion events by integrating Third Generation Sequencing long reads and Second Generation Sequencing short reads. We applied IDP-fusion to PacBio data and Illumina data from the MCF-7 breast cancer cells.

Although transcriptional and posttranscriptional events are detected in RNA-Seq data from second-generation sequencing, full-length mRNA isoforms are not captured. On the other hand, third-generation sequencing, which yields much longer reads, has current limitations of lower raw accuracy and throughput. Here, we combine second-generation sequencing and third-generation sequencing with a custom-designed method for isoform identification and quantification to generate a high-confidence isoform dataset for human embryonic stem cells (hESCs).
