Publications by authors named "Thomas Villmann"

In machine learning, data often comes from different sources, but combining them can introduce extraneous variation that affects both generalization and interpretability. For example, we investigate the classification of neurodegenerative diseases using FDG-PET data collected from multiple neuroimaging centers. However, data collected at different centers introduces unwanted variation due to differences in scanners, scanning protocols, and processing methods.

In the field of machine learning, vector quantization is a category of low-complexity approaches that are nonetheless powerful for data representation and clustering or classification tasks. Vector quantization is based on the idea of representing a data or a class distribution using a small set of prototypes, and hence, it belongs to interpretable models in machine learning. Further, the low complexity of vector quantizers makes them interesting for the application of quantum concepts for their implementation.
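
The prototype idea described above can be illustrated with a minimal sketch. It assumes Euclidean distance and two hand-picked prototypes; the function name `nearest_prototype` and the toy data are hypothetical and not taken from the article.

```python
import math

def nearest_prototype(x, prototypes):
    """Return the label of the prototype closest to x (Euclidean distance)."""
    best_label, best_dist = None, math.inf
    for w, label in prototypes:
        dist = math.dist(x, w)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

# Hypothetical toy model: one prototype per class.
prototypes = [((0.0, 0.0), "A"), ((5.0, 5.0), "B")]
print(nearest_prototype((1.0, 0.5), prototypes))
```

Because the whole model consists of the prototype list, each decision can be explained by pointing to the winning prototype, which is the sense in which such models are called interpretable.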

As part of the quality assurance of inpatient treatment, the severity of the disease and the course of therapy must be mapped. However, there is a high degree of heterogeneity in the implementation of basic diagnostics in psychosomatic facilities. There is a lack of scientifically based standardisation in determining the quality of outcomes.

The large amounts of biological sequence data generated during the last decades, together with algorithmic and hardware improvements, have made it possible to apply machine learning techniques in bioinformatics. While the machine learning community is aware of the necessity to rigorously distinguish data transformation from data comparison and to adopt reasonable combinations thereof, this awareness is often lacking in the field of comparative sequence analysis. With the realization of the disadvantages of alignments for sequence comparison, typical applications increasingly use so-called alignment-free approaches.

In the present article we propose the application of variants of the mutual information function as characteristic fingerprints of biomolecular sequences for classification analysis. In particular, we consider the resolved mutual information functions based on Shannon-, Rényi-, and Tsallis-entropy. In combination with interpretable machine learning classifier models based on generalized learning vector quantization, a powerful methodology for sequence classification is achieved which allows substantial knowledge extraction in addition to the high classification ability due to the model-inherent robustness.
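
As a rough illustration of such a fingerprint, the sketch below computes the plain Shannon mutual information between symbols that are k positions apart and collects the values for several k. It is only an assumption-laden sketch: the resolved, Rényi, and Tsallis variants mentioned above are not implemented, and the function names are hypothetical.

```python
import math
from collections import Counter

def mutual_information_function(seq, k):
    """Shannon mutual information (in bits) between symbols k positions apart.

    Assumes 1 <= k < len(seq); probabilities are estimated from pair counts.
    """
    pairs = [(seq[i], seq[i + k]) for i in range(len(seq) - k)]
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written with raw counts
        mi += p_ab * math.log2(p_ab * n * n / (left[a] * right[b]))
    return mi

def fingerprint(seq, max_k):
    """Feature vector of mutual information values for k = 1..max_k."""
    return [mutual_information_function(seq, k) for k in range(1, max_k + 1)]

print(fingerprint("ACACACACGTGTGTGT", 3))
```

The resulting vector has a fixed length regardless of the sequence length, which is what makes it usable as input for a vector-based classifier.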

Sensor fusion has gained a great deal of attention in recent years. It is used as an application tool in many different fields, especially the semiconductor, automotive, and medical industries. However, this field of research, regardless of the field of application, still presents different challenges concerning the choice of the sensors to be combined and the fusion architecture to be developed.

We present an approach to discriminate SARS-CoV-2 virus types based on their RNA sequence descriptions without a sequence alignment. For that purpose, sequences are preprocessed by feature extraction, and the resulting feature vectors are analyzed by prototype-based classification to remain interpretable. In particular, we propose to use variants of learning vector quantization (LVQ) based on dissimilarity measures for RNA sequence data.

Motivation: Viruses are the most abundant biological entities and constitute a large reservoir of genetic diversity. In recent years, knowledge about them has increased significantly as a result of dynamic development in life sciences and rapid technological progress. This knowledge is scattered across various data repositories, making a comprehensive analysis of viral data difficult.

Introduction: Little is known about risk constellations in primary hemostasis contributing to an acute myocardial infarction (MI) in patients with already manifest atherosclerosis. The study aimed to establish a predictive model based on six biomarkers of primary hemostasis: platelet count, mean platelet volume, hematocrit, soluble glycoprotein VI, fibrinogen and von Willebrand factor ratio.

Materials And Methods: The biomarkers were measured in 1.

Article Synopsis
  • Machine learning is essential in life sciences for analyzing large datasets, but many algorithms struggle with interpretability and complexity issues.
  • The Generalized Matrix Learning Vector Quantization (GMLVQ) method offers a prototype-based approach that allows for better visualization and interpretation, making it effective for imbalanced classification problems frequently found in biological data.
  • A study on Early Folding Residues (EFR) using GMLVQ achieved a competitive accuracy of 76.6%, highlighting important biological features and showcasing GMLVQ's ability to clarify complex classifications through visualization.

An overview is given of prototype-based models in machine learning. In this framework, observations, i.e.

We consider some modifications of the neural gas algorithm. First, fuzzy assignments as known from fuzzy c-means and neighborhood cooperativeness as known from self-organizing maps and neural gas are combined to obtain a basic Fuzzy Neural Gas. Further, a kernel variant and a simulated annealing approach are derived.

Prototype-based classifiers are effective algorithms for modeling classification problems and have been applied in multiple domains. While many supervised learning algorithms have been successfully extended to kernels to improve their discrimination power, prototype-based classifiers are typically still used with Euclidean distance measures. Kernelized variants of prototype-based classifiers are currently too complex to be applied to larger data sets.

We present an extension of the recently introduced Generalized Matrix Learning Vector Quantization algorithm. In the original scheme, adaptive square matrices of relevance factors parameterize a discriminative distance measure. We extend the scheme to matrices of limited rank corresponding to low-dimensional representations of the data.
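
A limited-rank distance of this kind is commonly written as d(x, w) = ||Ω(x − w)||², where Ω is an m × n matrix with m smaller than the data dimension n. The sketch below evaluates that form for a fixed, hypothetical Ω; it assumes this standard parameterization and does not include the training procedure from the article.

```python
def low_rank_distance(x, w, omega):
    """Squared distance ||Omega (x - w)||^2 for an (m x n) matrix Omega, m <= n."""
    diff = [xi - wi for xi, wi in zip(x, w)]
    # Project the difference vector into the m-dimensional space.
    projected = [sum(o * d for o, d in zip(row, diff)) for row in omega]
    return sum(p * p for p in projected)

# Hypothetical 2 x 3 matrix: projects 3-dimensional data to 2 dimensions
# and ignores the third input dimension entirely.
omega = [[1.0, 0.0, 0.0],
         [0.0, 0.5, 0.0]]
print(low_rank_distance([1.0, 2.0, 3.0], [0.0, 0.0, 0.0], omega))
```

The intermediate `projected` vector is the low-dimensional representation of the difference, so the same matrix can serve both for classification and for visualizing the data in m dimensions.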

Supervised and unsupervised vector quantization methods for classification and clustering traditionally use dissimilarities, frequently taken as Euclidean distances. In this article, we investigate the applicability of divergences instead, focusing on online learning. We deduce the mathematical fundamentals for its utilization in gradient-based online vector quantization algorithms.
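
To make the idea concrete, the sketch below uses one particular divergence, the generalized Kullback-Leibler divergence D(x‖w) = Σ xᵢ ln(xᵢ/wᵢ) − xᵢ + wᵢ, and performs a single gradient step on a prototype. The choice of divergence, the learning rate, and the function names are assumptions made for illustration; the article covers a broader family of divergences, and all vector entries are assumed to be strictly positive.

```python
import math

def generalized_kl(x, w):
    """Generalized Kullback-Leibler divergence D(x || w) for positive vectors."""
    return sum(xi * math.log(xi / wi) - xi + wi for xi, wi in zip(x, w))

def online_step(x, w, lr=0.1):
    """One gradient step on w; dD/dw_i = 1 - x_i / w_i for this divergence."""
    return [wi - lr * (1.0 - xi / wi) for xi, wi in zip(x, w)]

x = [0.2, 0.8]   # hypothetical data vector
w = [0.5, 0.5]   # hypothetical prototype
w_new = online_step(x, w)
print(generalized_kl(x, w), generalized_kl(x, w_new))
```

Replacing the Euclidean distance by a divergence only changes the derivative used in the update, which is why the gradient with respect to the prototype is the central quantity in this setting.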

In this paper, we present a regularization technique to extend recently proposed matrix learning schemes in learning vector quantization (LVQ). These learning algorithms extend the concept of adaptive distance measures in LVQ to the use of relevance matrices. In general, metric learning can display a tendency towards oversimplification in the course of training.

The authors developed a concept that applies self-organization theory to psychodynamic principles. According to this concept, episodes of temporary destabilization represent a precondition for abrupt changes within the therapeutic process. The authors examined six courses of therapy (patients diagnosed with depression and personality disorder).

Objectives: Fine motor skill disorders are among the neurological manifestations of Wilson's disease. The aim of this study is to investigate whether fine motor performance changes during the course of the disease and with therapy.

Methods: In 15 neurological patients with Wilson's disease, severity of neurological symptoms was assessed with a neurology score.

Objective: Living organ donation involves interference with a healthy organism. Therefore, most transplantation centres ascertain the voluntariness of the donation as well as its motivation by means of a psychosomatic evaluation. The circumstance that the evaluation is compulsory and not a primary concern of the donor-recipient pair may occasion respondents to present only what they consider innocuous and socially adequate.

Objective: Mass spectrometry has become a standard technique to analyze clinical samples in cancer research. The obtained spectrometric measurements reveal a lot of information about the clinical sample at the peptide and protein level. The spectra are high dimensional and, due to the small number of samples, a sparse coverage of the population is very common.

The objective of this study is to determine and analyze so-called key sessions within the frameworks of the Therapeutic Cycles Model introduced by Mergenthaler and the Energy Model proposed by Caspar. For this purpose, different measures for key session identification based on linguistic text variables are used. The investigation covers 10 high-frequency, psychodynamic, inpatient, individual therapies comprising 206 therapeutic sessions in total, all of which were completely videotaped and transcribed.

In the present contribution we propose two recently developed classification algorithms for the analysis of mass-spectrometric data: the supervised neural gas and the fuzzy-labeled self-organizing map. The algorithms are inherently regularizing, which is recommended for these spectral data because of their high dimensionality and their sparseness in specific problems. Both algorithms are prototype-based, so that the principle of characteristic representatives is realized.

In this paper, we examine the scope of validity of the explicit self-organizing map (SOM) magnification control scheme of Bauer et al. (1996) on data for which the theory does not guarantee success, namely data that are n-dimensional, n ≥ 2, and whose components in the different dimensions are not statistically independent. The Bauer et al.

Neural Gas (NG) is a very robust clustering algorithm for Euclidean data that suffers neither from the problem of local minima, as simple vector quantization does, nor from topological restrictions, as the self-organizing map does. Based on the cost function of NG, we introduce a batch variant of NG which shows much faster convergence and which can be interpreted as an optimization of the cost function by the Newton method. This formulation has the additional benefit that, based on the notion of the generalized median in analogy to Median SOM, a variant for non-vectorial proximity data can be introduced.
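
A batch update of this kind can be sketched as follows. The sketch assumes the usual rank-based neighborhood function h(k) = exp(−k/λ) and replaces every prototype by the rank-weighted mean of all data points; the function name, the toy data, and the fixed neighborhood range are hypothetical, and the median variant for proximity data is not shown.

```python
import math

def batch_ng_step(data, prototypes, lam):
    """One batch neural gas update: each prototype becomes a rank-weighted mean."""
    dim = len(data[0])
    numer = [[0.0] * dim for _ in prototypes]
    denom = [0.0] * len(prototypes)
    for x in data:
        dists = [math.dist(x, w) for w in prototypes]
        # Rank 0 is the closest prototype, rank 1 the second closest, and so on.
        order = sorted(range(len(prototypes)), key=lambda i: dists[i])
        for rank, i in enumerate(order):
            h = math.exp(-rank / lam)
            denom[i] += h
            for d in range(dim):
                numer[i][d] += h * x[d]
    return [[n / denom[i] for n in numer[i]] for i in range(len(prototypes))]

# Hypothetical toy data with two well-separated groups.
data = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
prototypes = [(1.0, 1.0), (9.0, 9.0)]
print(batch_ng_step(data, prototypes, lam=0.01))
```

In practice the step would be repeated while λ is decreased; with a very small λ, as in the toy call, the update behaves almost like a k-means step.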

We consider different ways to control the magnification in self-organizing maps (SOM) and neural gas (NG). Starting from early approaches to magnification control in vector quantization, we then concentrate on different approaches for SOM and NG. We show that three structurally similar approaches can be applied to both algorithms: localized learning, concave-convex learning, and winner-relaxing learning.
