Distribution Optimization: An evolutionary algorithm to separate Gaussian mixtures.

Sci Rep

Institute of Clinical Pharmacology, Goethe-University, Theodor-Stern-Kai 7, 60590, Frankfurt am Main, Germany.

Published: January 2020

Finding subgroups in biomedical data is a key task in biomedical research and precision medicine. Even one-dimensional data, such as readouts from cell experiments, preclinical or human laboratory experiments, or clinical signs, often reveal a distribution more complex than a single mode. Gaussian mixtures play an important role in modeling the multimodal distribution of one-dimensional data. However, although fitting of Gaussian mixture models (GMM) is often aimed at obtaining the separate modes composing the mixture, current technical implementations, often using the Expectation Maximization (EM) algorithm, are not optimized for this task. This occasionally results in poorly separated modes that are unsuitable for determining a distinguishable group structure in the data. Here, we introduce "Distribution Optimization", an evolutionary algorithm for GMM fitting that uses an adjustable error function based on chi-square statistics and the probability density. The algorithm can be directly targeted at the separation of the modes of the mixture by employing an additional criterion for the degree to which single modes overlap. The obtained GMM fits were comparable with those obtained with classical EM-based fits, except for data sets where the EM algorithm produced unsatisfactory results with overlapping Gaussian modes. There, the proposed algorithm successfully separated the modes, providing a basis for meaningful group separation while fitting the data satisfactorily. Through its optimization toward mode separation, the evolutionary algorithm proved a particularly suitable basis for group separation in multimodally distributed data, outperforming alternative EM-based methods.
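The idea described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: it only assumes that a candidate mixture is scored by a Pearson chi-square error on histogram counts plus a penalty on how much any two modes overlap, and that candidates are improved by mutation and truncation selection. The names `fit_gmm` and `overlap_weight`, the bin count, the overlap measure (area under the minimum of two component densities), and the annealed mutation scheme are all hypothetical choices made for this example.

```python
import math
import random

def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def histogram(data, edges):
    """Count observations per equal-width bin; the last bin includes its right edge."""
    bins = len(edges) - 1
    width = (edges[-1] - edges[0]) / bins
    counts = [0] * bins
    for x in data:
        counts[min(int((x - edges[0]) / width), bins - 1)] += 1
    return counts

def chi_square(model, counts, edges):
    """Pearson chi-square between observed counts and counts expected under the GMM."""
    n = sum(counts)
    total = 0.0
    for b, observed in enumerate(counts):
        p = sum(w * (normal_cdf(edges[b + 1], m, s) - normal_cdf(edges[b], m, s))
                for w, m, s in model)
        expected = max(n * p, 1e-9)  # guard against division by zero
        total += (observed - expected) ** 2 / expected
    return total

def overlap(model, grid):
    """Assumed overlap measure: largest pairwise area under min(pdf_i, pdf_j)."""
    step = grid[1] - grid[0]
    worst = 0.0
    for i in range(len(model)):
        for j in range(i + 1, len(model)):
            (_, m1, s1), (_, m2, s2) = model[i], model[j]
            area = step * sum(min(normal_pdf(x, m1, s1), normal_pdf(x, m2, s2))
                              for x in grid)
            worst = max(worst, area)
    return worst

def normalize(raw):
    total = sum(w for w, _, _ in raw)
    return [(w / total, m, s) for w, m, s in raw]

def random_model(k, lo, hi, rng):
    span = hi - lo
    return normalize([(rng.random() + 0.1, rng.uniform(lo, hi),
                       rng.uniform(0.05, 0.5) * span) for _ in range(k)])

def mutate(model, lo, hi, shrink, rng):
    """Gaussian perturbation of weights, means and sigmas; steps shrink over time."""
    step = 0.05 * (hi - lo) * shrink
    raw = []
    for w, m, s in model:
        w = max(w + rng.gauss(0.0, 0.05 * shrink), 0.01)
        m = min(max(m + rng.gauss(0.0, step), lo), hi)
        s = max(s + rng.gauss(0.0, 0.5 * step), 0.01 * (hi - lo))
        raw.append((w, m, s))
    return normalize(raw)

def fit_gmm(data, k=2, pop_size=40, generations=150, overlap_weight=50.0,
            bins=20, seed=0):
    """Return k (weight, mean, sigma) tuples sorted by mean."""
    rng = random.Random(seed)
    lo, hi = min(data), max(data)
    edges = [lo + (hi - lo) * i / bins for i in range(bins + 1)]
    grid = [lo + (hi - lo) * i / 100 for i in range(101)]
    counts = histogram(data, edges)

    def fitness(model):  # lower is better
        return chi_square(model, counts, edges) + overlap_weight * overlap(model, grid)

    population = [random_model(k, lo, hi, rng) for _ in range(pop_size)]
    for gen in range(generations):
        shrink = 1.0 - gen / generations
        population.sort(key=fitness)
        parents = population[: pop_size // 4]  # truncation selection with elitism
        children = [mutate(rng.choice(parents), lo, hi, shrink, rng)
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return sorted(min(population, key=fitness), key=lambda c: c[1])

if __name__ == "__main__":
    rng = random.Random(1)
    sample = ([rng.gauss(0.0, 1.0) for _ in range(300)]
              + [rng.gauss(6.0, 1.0) for _ in range(300)])
    for weight, mean, sigma in fit_gmm(sample):
        print(f"weight={weight:.2f} mean={mean:.2f} sigma={sigma:.2f}")
```

In this sketch, raising `overlap_weight` trades some chi-square fit quality for better separated modes, which mirrors the adjustable error function the abstract describes; setting it to zero reduces the fitness to the chi-square term alone.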

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6971287
DOI: http://dx.doi.org/10.1038/s41598-020-57432-w

Publication Analysis

Top Keywords

evolutionary algorithm: 12
gaussian mixtures: 8
one-dimensional data: 8
separated modes: 8
group separation: 8
algorithm: 7
data: 7
modes: 6
distribution optimization: 4
optimization evolutionary: 4

