An integrative probabilistic model for identification of structural variation in sequencing data.

Suzanne S Sindi, Selim Onal, Luke C Peng, Hsin-Ta Wu, Benjamin J Raphael

Genome Biol

Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA.

Published: September 2012

Paired-end sequencing is a common approach for identifying structural variation (SV) in genomes. Discrepancies between the observed and expected alignments indicate potential SVs. Most SV detection algorithms use only one of the possible signals and ignore reads with multiple alignments. This results in reduced sensitivity to detect SVs, especially in repetitive regions. We introduce GASVPro, an algorithm combining both paired read and read depth signals into a probabilistic model which can analyze multiple alignments of reads. GASVPro outperforms existing methods with a 50-90% improvement in specificity on deletions and a 50% improvement on inversions.
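
To make the two signals concrete, here is a minimal Python sketch. It is not GASVPro's actual model: the function names, the Poisson assumption, the 3-standard-deviation cutoff, and the error rate `eps` are all illustrative assumptions. It flags a read pair as discordant when its mapped insert size departs from the library's expected size, then scores a candidate homozygous deletion by adding a paired-read term and a read-depth term into one log-likelihood ratio.

```python
import math

def is_discordant(insert_size, mean, sd, k=3.0):
    """Flag a pair whose mapped insert size is more than k standard
    deviations from the library mean (k=3 is an assumed cutoff)."""
    return abs(insert_size - mean) > k * sd

def poisson_logpmf(count, lam):
    """Log of the Poisson probability of observing `count` events
    when `lam` (> 0) are expected."""
    return count * math.log(lam) - lam - math.lgamma(count + 1)

def deletion_llr(n_discordant, n_reads_inside, lam_span, lam_depth, eps=0.01):
    """Log-likelihood ratio of 'homozygous deletion' versus 'no variant'.

    Hypothetical model, for illustration only:
      deletion:   about lam_span discordant pairs span the breakpoint,
                  and about eps * lam_depth reads map inside the region;
      no variant: about eps * lam_span discordant pairs (alignment errors),
                  and about lam_depth reads map inside the region.
    Positive values favor the deletion.
    """
    ll_deletion = (poisson_logpmf(n_discordant, lam_span)
                   + poisson_logpmf(n_reads_inside, eps * lam_depth))
    ll_no_variant = (poisson_logpmf(n_discordant, eps * lam_span)
                     + poisson_logpmf(n_reads_inside, lam_depth))
    return ll_deletion - ll_no_variant

# Library with an assumed mean insert size of 300 bp and SD of 30 bp.
stretched_pair = is_discordant(900, mean=300, sd=30)
normal_pair = is_discordant(310, mean=300, sd=30)

# Many discordant pairs plus a depth drop support the deletion;
# no discordant pairs plus normal depth argue against it.
supported = deletion_llr(10, 1, lam_span=10.0, lam_depth=50.0)
unsupported = deletion_llr(0, 50, lam_span=10.0, lam_depth=50.0)
```

In this sketch a region needs agreement from both signals to score strongly, which is the intuition behind combining them. Handling reads with multiple alignments, as the abstract describes, would require further modeling that is not shown here.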

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3439973
DOI: http://dx.doi.org/10.1186/gb-2012-13-3-r22
