Genotype copy number variations using Gaussian mixture models: theory and algorithms.

Stat Appl Genet Mol Biol

Department of Applied Mathematics and Institute of Statistics, National Chung Hsing University, Taiwan.

Published: October 2012

Copy number variations (CNVs) are important in disease association studies and are targeted by most recent microarray platforms developed for GWAS. However, probes targeting the same CNV region can vary greatly in performance, with some carrying little more information than pure noise. In this paper, we investigate how best to combine measurements from multiple probes to estimate the copy numbers of individuals within the framework of the Gaussian mixture model (GMM). First, we show that under two regularity conditions, and assuming that all parameters except the mixing proportions are known, optimal weights can be obtained such that the univariate GMM based on the weighted average gives exactly the same classification as the multivariate GMM. We then develop an algorithm that iteratively estimates the parameters, obtains the optimal weights, and uses them for classification. The algorithm performs well on simulated data and on two sets of real data, showing a clear advantage over classification based on the equally weighted average.
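The equivalence claimed in the abstract can be illustrated with a minimal sketch. The abstract states neither the two regularity conditions nor the form of the optimal weights, so everything below is an assumption made for illustration, not the paper's method: probe j has mean a[j] + b[j] * c for copy number c, noise is independent across probes with variance s2[j] shared by all copy-number classes, and all parameters (including the mixing proportions pi) are treated as known. Under these assumptions the weights w[j] proportional to b[j] / s2[j] make the univariate mixture on the weighted average reproduce the multivariate posteriors. The iterative estimation step is not shown, and all names are hypothetical.

```python
import numpy as np

def class_posteriors(loglik, pi):
    """Turn per-class log-likelihoods of shape (n, K) into posterior probabilities."""
    logp = loglik + np.log(pi)
    logp -= logp.max(axis=1, keepdims=True)      # stabilise the exponentials
    post = np.exp(logp)
    return post / post.sum(axis=1, keepdims=True)

def multivariate_posteriors(X, a, b, s2, pi, copies):
    """Posteriors from the full multivariate GMM with diagonal, shared covariance."""
    means = a + np.outer(copies, b)              # (K, p): assumed linear probe response
    diff = X[:, None, :] - means[None, :, :]     # (n, K, p)
    loglik = -0.5 * np.sum(diff ** 2 / s2, axis=2)
    return class_posteriors(loglik, pi)

def weighted_average_posteriors(X, w, a, b, s2, pi, copies):
    """Posteriors from a univariate GMM fitted to the weighted average X @ w."""
    y = X @ w                                    # one summary value per individual
    m = w @ a + (w @ b) * copies                 # class means of the weighted average
    v = np.sum(w ** 2 * s2)                      # its variance (independent noise)
    loglik = -0.5 * (y[:, None] - m[None, :]) ** 2 / v
    return class_posteriors(loglik, pi)

# Simulated data; every value here is made up for the illustration.
rng = np.random.default_rng(0)
copies = np.array([0.0, 1.0, 2.0, 3.0])
pi = np.array([0.1, 0.3, 0.5, 0.1])
a = np.zeros(5)
b = np.array([1.0, 0.8, 0.5, 0.05, 0.0])         # last two probes are close to pure noise
s2 = np.array([0.1, 0.2, 0.3, 0.5, 1.0])
n = 2000
z = rng.choice(len(copies), size=n, p=pi)        # true class index of each individual
X = a + np.outer(copies[z], b) + rng.normal(size=(n, 5)) * np.sqrt(s2)

w_opt = b / s2                                   # assumed weights: slope over noise variance
w_opt = w_opt / w_opt.sum()
w_eq = np.full(5, 1.0 / 5)                       # equally weighted average, for comparison

post_mv = multivariate_posteriors(X, a, b, s2, pi, copies)
post_opt = weighted_average_posteriors(X, w_opt, a, b, s2, pi, copies)
post_eq = weighted_average_posteriors(X, w_eq, a, b, s2, pi, copies)

same = np.allclose(post_mv, post_opt)            # identical posteriors up to rounding
acc_opt = np.mean(post_opt.argmax(axis=1) == z)
acc_eq = np.mean(post_eq.argmax(axis=1) == z)
print("weighted average matches multivariate GMM:", same)
print("accuracy with assumed weights:", acc_opt)
print("accuracy with equal weights:", acc_eq)
```

The match holds in this sketch because, once the terms common to all classes are dropped, the multivariate log-likelihood depends on the data only through the sum of b[j] * x[j] / s2[j], which is the weighted average up to scale. Probes with a slope near zero receive almost no weight, which is how uninformative probes are discounted here.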

Source
http://dx.doi.org/10.1515/1544-6115.1725

Publication Analysis

Top Keywords

copy number: 8
number variations: 8
gaussian mixture: 8
optimal weights: 8
weighted average: 8
genotype copy: 4
variations gaussian: 4
mixture models: 4
models theory: 4
theory algorithms: 4

Similar Publications

Quantitative Analysis of Hepatitis D Virus Using gRNA-Sensitive Semiconducting Polymer Dots.

Anal Chem

January 2025

State Key Laboratory of Integrated Optoelectronics, College of Electronics Science and Engineering, Jilin University, No. 2699 Qianjin Street, Changchun, Jilin 130012, P. R. China.

Hepatitis D virus (HDV) significantly influences the progression of liver diseases. Through clinical observations and database analyses, it has been established that patients coinfected with HDV and hepatitis B virus (HBV) experience accelerated progression toward cirrhosis, hepatocellular carcinoma (HCC), and liver failure compared to those infected solely with HBV. A higher viral load correlates with increased replicative activity, enhanced infectivity, and more severe disease manifestations.

Comparative analysis of regression algorithms for drug response prediction using GDSC dataset.

BMC Res Notes

January 2025

Department of Computer Engineering, Chungbuk National University, Chungdae-ro 1, Cheongju, 28644, Republic of Korea.

Background: Drug response prediction can infer the relationship between an individual's genetic profile and a drug, which can be used to guide the choice of treatment for an individual patient. Drug response prediction has recently been performed using machine learning. However, high-throughput sequencing produces thousands of features per patient.

Background: Acute myeloid leukemia (AML) is an aggressive hematological neoplasm. Little improvement in survival rates has been achieved over the past few decades. Necroptosis is associated with outcomes in certain types of malignancy.

This study aimed to synthesize evidence from primary studies on the acceptability and effectiveness of mindfulness-based interventions (MBIs) for improving lifestyle behaviors and body mass index (BMI) in children with overweight or obesity. We conducted a meta-analysis or followed the Synthesis Without Meta-analysis (SWiM) guidelines to synthesize study findings. The analysis included both mindfulness-only interventions and comprehensive behavioral interventions incorporating mindfulness components.

With the increasing availability of high-quality genome assemblies, pangenome graphs emerged as a new paradigm in the genomics field for identifying, encoding, and presenting genomic variation at both population and species levels. However, it remains challenging to truly dissect and interpret pangenome graphs via biologically informative visualization. To facilitate better exploration and understanding of pangenome graphs towards novel biological insights, here we present a web-based interactive Visualization and interpretation framework for linear-Reference-projected Pangenome Graphs (VRPG).
