Publications by authors named "Alfred Ultsch"

Article Synopsis
  • Small sample sizes in biomedical research lead to poor reproducibility and difficulties in applying findings clinically, caused by factors like limited resources and ethical concerns.
  • A new generative algorithm leveraging self-organizing maps (SOMs) has been developed to effectively increase sample sizes by identifying structures in small datasets and generating new data points without altering the original data's integrity.
  • Experiments show that the generated data maintains the same structure as the original, aiding research in rare diseases like leukemia and arthritis, and the method is available through the R library "Umatrix."

Fold change is a common metric in biomedical research for quantifying group differences in omics variables. However, inconsistent calculation methods and inadequate reporting lead to discrepancies in results. This study evaluated various fold-change calculation methods with the aim of recommending a preferred approach.
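
To illustrate why the choice of calculation matters, the following minimal sketch compares three commonly encountered fold-change definitions on the same toy data. The function names and the data are assumptions made for illustration; the sketch does not claim to reproduce the methods evaluated in the study or its recommendation.

```python
import math
from statistics import mean

def fold_change_ratio_of_means(treated, control):
    """Ratio of the arithmetic group means."""
    return mean(treated) / mean(control)

def fold_change_mean_of_ratios(treated, control):
    """Mean of per-pair ratios (assumes paired samples in matching order)."""
    return mean(t / c for t, c in zip(treated, control))

def log2_fold_change(treated, control):
    """Difference of mean log2 values (log2 ratio of geometric means)."""
    return mean(math.log2(t) for t in treated) - mean(math.log2(c) for c in control)

# Hypothetical toy data, chosen so that the three definitions disagree.
control = [1.0, 2.0, 4.0]
treated = [2.0, 2.0, 16.0]

ratio_of_means = fold_change_ratio_of_means(treated, control)  # 20/7, about 2.86
mean_of_ratios = fold_change_mean_of_ratios(treated, control)  # 7/3, about 2.33
log2_fc = log2_fold_change(treated, control)                   # 1.0, i.e. two-fold
```

The same data yield fold changes of roughly 2.86, 2.33, and 2.0, which is why reporting the exact formula matters.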

Psoriatic arthritis (PsA) is a chronic inflammatory systemic disease whose activity is often assessed using the Disease Activity Score 28 (DAS28-CRP). The present study was designed to investigate the significance of individual components within the score for PsA activity. A cohort of 80 PsA patients (44 women and 36 men, aged 56.

Random walks describe stochastic processes characterized by a sequence of unpredictable changes in a random variable with no correlation to past changes. This report describes the random walk component of a clinical sensory test of olfactory performance. The precise definition of this stochastic process allows the establishment of precise diagnostic cut-offs for the identification of olfactory loss.
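
As a generic illustration of the stochastic process described above, the sketch below simulates a simple symmetric random walk in which each step is independent of all earlier steps. It is not the clinical test procedure; the function name, step size, and seed handling are assumptions.

```python
import random

def simple_random_walk(n_steps, seed=None):
    """Return the positions of a symmetric +/-1 random walk starting at 0."""
    rng = random.Random(seed)  # local generator so runs are reproducible
    position = 0
    path = [position]
    for _ in range(n_steps):
        # Each step is drawn independently of the past (no correlation).
        position += rng.choice((-1, 1))
        path.append(position)
    return path

path = simple_random_walk(100, seed=42)
```

Because the steps are independent and equally likely in both directions, the distribution of end positions can be derived exactly, which is the kind of property that makes precise cut-offs possible.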

Selecting the k best features is a common task in machine learning. Typically, a few features have high importance, but many have low importance (right-skewed distribution). This report proposes a numerically precise method to address this skewed feature importance distribution in order to reduce a feature set to the informative minimum of items.
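
For orientation only, the sketch below shows a much simpler heuristic than the method the report proposes: keep the top-ranked features until they account for a chosen share of the total importance. The function name, the 80% threshold, and the toy importances are assumptions, and this cutoff rule is a stand-in, not the report's method.

```python
def top_features_by_cumulative_share(importances, share=0.8):
    """Keep the highest-ranked features until `share` of total importance is covered."""
    total = sum(importances.values())
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    kept, running = [], 0.0
    for name, value in ranked:
        kept.append(name)
        running += value
        if running >= share * total:
            break
    return kept

# Hypothetical right-skewed importances: a few large values, many small ones.
importances = {"f1": 50, "f2": 30, "f3": 10, "f4": 5, "f5": 5}
selected = top_features_by_cumulative_share(importances, share=0.8)
```

With these toy values the two most important features already cover 80% of the total, so the remaining three are dropped.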

Background: The International Prognostic Index (IPI) is applied to predict the outcome of chronic lymphocytic leukemia (CLL) with five prognostic factors, including genetic analysis. We investigated whether multiparameter flow cytometry (MPFC) data of CLL samples could predict the outcome by methods of explainable artificial intelligence (XAI). Further, XAI should explain the results based on distinctive cell populations in MPFC dot plots.

Bayesian inference is ubiquitous in science and widely used in biomedical research such as cell sorting or “omics” approaches, as well as in machine learning (ML), artificial neural networks, and “big data” applications. However, the calculation is not robust in regions of low evidence. In cases where one group has a lower mean but a higher variance than another group, new cases with larger values are implausibly assigned to the group with typically smaller values.
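
The effect described above can be reproduced with two Gaussian class-conditional densities and equal priors. The sketch below is a minimal illustration; the group names, means, standard deviations, and priors are assumptions chosen to make the effect visible.

```python
from statistics import NormalDist

def posterior_probabilities(x, groups, priors):
    """Bayes' theorem: posterior of each group given the observation x."""
    weighted = {name: priors[name] * dist.pdf(x) for name, dist in groups.items()}
    evidence = sum(weighted.values())
    return {name: value / evidence for name, value in weighted.items()}

# Hypothetical groups: one with a lower mean but higher variance.
groups = {
    "low_mean_wide": NormalDist(mu=0.0, sigma=3.0),
    "high_mean_narrow": NormalDist(mu=2.0, sigma=0.5),
}
priors = {"low_mean_wide": 0.5, "high_mean_narrow": 0.5}

near_high_mean = posterior_probabilities(2.0, groups, priors)
far_right = posterior_probabilities(5.0, groups, priors)
```

At x = 2.0 the narrow group with the higher mean is favored, but at x = 5.0, a value larger than both means, almost all posterior mass goes to the group with the lower mean, because its wide tail dominates where evidence is sparse.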

"Big omics data" pose the challenge of extracting meaningful information with clinical benefit. Here, we propose a two-step approach, an initial unsupervised inspection of the structure of the high-dimensional data followed by supervised analysis of gene expression levels, to reconstruct the surface patterns on different subtypes of acute myeloid leukemia (AML). First, Bayesian methodology was used, focusing on surface molecules encoded by cluster of differentiation (CD) genes to assess whether AML is a homogeneous group or segregates into clusters.

The collection of increasing amounts of data in health care has become relevant for pain therapy and research. This poses problems for analyses with classical approaches, which is why artificial intelligence (AI) and machine learning (ML) methods are being incorporated into pain research. The current literature on AI and ML in the context of pain research was automatically searched and manually curated.

Article Synopsis
  • Minimal residual disease (MRD) detection is critical for predicting survival and relapse in acute myeloid leukemia, and it can be assessed using molecular methods or flow cytometry.
  • An automated method called Cinderella was developed to measure both bone marrow dilution and MRD levels simultaneously, improving accuracy in results.
  • Cinderella utilizes explainable artificial intelligence to analyze complex cell populations in bone marrow and peripheral blood, reducing the chances of false-negative MRD findings and enhancing the reliability of MRD reporting.

Three flow cytometry datasets are provided, consisting of diagnostic samples of either peripheral blood (pB) or bone marrow (BM) from patients without any sign of bone marrow disease at two different health care centers. In flow cytometry, cells rapidly pass through a laser beam one by one, and two light-scatter and eight surface parameters are measured for more than 100,000 cells per patient sample.

Background: Data transformations are commonly used in bioinformatics data processing in the context of data projection and clustering. The most used Euclidean metric is not scale invariant and therefore occasionally inappropriate for complex, e.g.

Motivation: The size of today's biomedical data sets pushes computer equipment to its limits, even for seemingly standard analysis tasks such as data projection or clustering. Reducing large biomedical data by downsampling is therefore a common early step in data processing, often performed as random uniform class-proportional downsampling. In this report, we hypothesized that this can be optimized to obtain samples that better reflect the entire data set than those obtained using the current standard method.
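
The baseline named above, random uniform class-proportional downsampling, can be sketched as follows. This shows only the standard method, not the optimized one the report investigates; the function name, the rounding rule, and the toy labels are assumptions.

```python
import random
from collections import defaultdict

def class_proportional_downsample(labels, fraction, seed=None):
    """Return sorted indices of a random sample that keeps `fraction` of each class."""
    rng = random.Random(seed)
    indices_by_class = defaultdict(list)
    for index, label in enumerate(labels):
        indices_by_class[label].append(index)
    selected = []
    for indices in indices_by_class.values():
        # Keep at least one case per class so that no class disappears.
        n_keep = max(1, round(len(indices) * fraction))
        selected.extend(rng.sample(indices, n_keep))
    return sorted(selected)

# Hypothetical imbalanced data set: 80 cases of class "a", 20 of class "b".
labels = ["a"] * 80 + ["b"] * 20
subset = class_proportional_downsample(labels, fraction=0.1, seed=1)
subset_labels = [labels[i] for i in subset]
```

The subset preserves the 4:1 class ratio, but which cases are drawn within each class is left to chance, which is the part the report hypothesizes can be improved.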

A diminished sense of smell impairs quality of life, but people with olfactory impairment are hardly considered in disability inclusion measures. We aimed to stratify perceptual characteristics and odors according to the extent to which they are perceived differently with a reduced sense of smell, as a possible basis for creating olfactory experiences that are enjoyed in a similar way by subjects with normal or impaired olfactory function. In 146 subjects with normal or reduced olfactory function, perceptual characteristics (edibility, intensity, irritation, temperature, familiarity, hedonics, painfulness) were tested for four sets of 10 different odors each.

Projections are conventional methods of dimensionality reduction for information visualization, used to transform high-dimensional data into a low-dimensional space. If the projection method restricts the output space to two dimensions, the result is a scatter plot. The goal of this scatter plot is to visualize the relative relationships between high-dimensional data points that build up distance- and density-based structures.

One aim of data mining is the identification of interesting structures in data. For better analytical results, the basic properties of an empirical distribution, such as skewness and eventual clipping, i.e.

Motivation: Calculating the magnitude of treatment effects or of differences between two groups is a common task in quantitative science. Standard effect size measures based on differences, such as the commonly used Cohen's d, fail to capture treatment-related effects on the data if the effects are not reflected in the central tendency. The present work aims at (i) developing a non-parametric alternative to Cohen's d, which (ii) circumvents some of its numerical limitations and (iii) captures obvious changes in the data that do not affect the group means and are therefore missed by Cohen's d.
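
The limitation of Cohen's d mentioned above can be shown with a small worked example. The sketch uses the usual pooled-standard-deviation form of Cohen's d; the toy groups are assumptions, and the non-parametric alternative developed in the work is not reproduced here.

```python
import math
from statistics import mean, variance

def cohens_d(x, y):
    """Cohen's d with pooled standard deviation (sample variances)."""
    nx, ny = len(x), len(y)
    pooled_variance = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / math.sqrt(pooled_variance)

# Hypothetical groups: same mean, clearly different spread.
narrow = [4.0, 5.0, 6.0]
wide = [0.0, 5.0, 10.0]
# Hypothetical group shifted by two units relative to `narrow`.
shifted = [6.0, 7.0, 8.0]

d_spread_only = cohens_d(wide, narrow)   # 0.0: the change in spread is invisible
d_mean_shift = cohens_d(shifted, narrow) # 2.0: a pure shift in means is captured
```

Cohen's d reports no effect for the first comparison although the two distributions obviously differ, because only the difference in means enters the numerator.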

The Fundamental Clustering Problems Suite (FCPS) offers a variety of clustering challenges that any algorithm should be able to handle given real-world data. The FCPS consists of datasets with known a priori classifications that are to be reproduced by the algorithm. The datasets are intentionally created to be visualized in two or three dimensions under the hypothesis that objects can be grouped unambiguously by the human eye.

Background: Persistent pain extending beyond 6 months after breast cancer surgery when adjuvant therapies have ended is a recognised phenomenon. The evolution of postsurgery pain is therefore of interest for future patient management in terms of possible prognoses for distinct groups of patients to enable better patient information.

Objective(s): An analysis aimed to identify subgroups of patients who share similar time courses of postoperative persistent pain.

Finding subgroups in biomedical data is a key task in biomedical research and precision medicine. Even one-dimensional data, such as many different readouts from cell experiments, preclinical or human laboratory experiments, or clinical signs, often reveal a more complex distribution than a single mode. Gaussian mixtures play an important role in modeling the multimodal distribution of one-dimensional data.
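
As a minimal sketch of what a one-dimensional Gaussian mixture looks like, the code below evaluates the density of a weighted sum of normal components. The component parameters are assumptions for illustration, and fitting a mixture to data is deliberately left out.

```python
from statistics import NormalDist

def mixture_density(x, components):
    """Density of a Gaussian mixture; components are (weight, mean, sd) tuples."""
    return sum(weight * NormalDist(mu, sd).pdf(x) for weight, mu, sd in components)

# Hypothetical bimodal mixture: two equally weighted subgroups.
components = [(0.5, 0.0, 1.0), (0.5, 6.0, 1.0)]

density_at_first_mode = mixture_density(0.0, components)
density_between_modes = mixture_density(3.0, components)
density_at_second_mode = mixture_density(6.0, components)
```

The density is high at both component means and low in between, which is the multimodal shape that a single Gaussian cannot represent and that points to subgroups in the data.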

Advances in flow cytometry enable the acquisition of large and high-dimensional data sets per patient. Novel computational techniques allow the visualization of structures in these data and, finally, the identification of relevant subgroups. Correct data visualizations and projections from the high-dimensional space to the visualization plane require the correct representation of the structures in the data.

Motivation: The genetic architecture of diseases is becoming increasingly known. This raises difficulties in picking suitable targets for further research among an increasing number of candidates. Although expression-based methods of gene set reduction are applied to laboratory-derived genetic data, the analysis of topical sets of genes gathered from knowledge bases requires a modified approach, as no quantitative information about gene expression is available.

Based on increasing evidence suggesting that multiple sclerosis (MS) pathology involves alterations in bioactive lipid metabolism, the present analysis aimed at generating a complex serum lipid-biomarker. Using unsupervised machine-learning, implemented as emergent self-organizing maps of neuronal networks, swarm intelligence and Minimum Curvilinear Embedding, a cluster structure was found in the input data space comprising serum concentrations of d = 43 different lipid-markers of various classes. The structure coincided largely with the clinical diagnosis, indicating that the data provide a basis for the creation of a biomarker (classifier).

Differential induction therapy of all subtypes of acute myeloid leukemia other than acute promyelocytic leukemia is impeded by the long time required to complete complex and diverse cytogenetic and molecular genetic analyses for risk stratification or targeted treatment decisions. Here, we describe a reliable, rapid and sensitive diagnostic approach that combines karyotyping and mutational screening in a single, integrated, next-generation sequencing assay. Numerical karyotyping was performed by low coverage whole genome sequencing followed by copy number variation analysis using a novel algorithm based on in silico-generated reference karyotypes.
