Publications by authors named "Dickhaus T"

P-values that are derived from continuously distributed test statistics are typically uniformly distributed on (0,1) under least favorable parameter configurations (LFCs) in the null hypothesis. Conservativeness of a p-value P (meaning that P is under the null hypothesis stochastically larger than uniform on (0,1)) can occur if the test statistic from which P is derived is discrete, or if the true parameter value under the null is not an LFC. To deal with both of these sources of conservativeness, we present two approaches utilizing randomized p-values.
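A standard construction of a randomized p-value for a discrete test statistic spreads uniform noise over the probability mass at the observed value. The sketch below is a generic illustration with a one-sided binomial test (not necessarily the specific approaches of the article): the randomized p-value is exactly uniform under the null at the LFC, while the ordinary p-value is conservative.

```python
import numpy as np
from scipy.stats import binom

rng = np.random.default_rng(0)
n, theta0 = 20, 0.5  # binomial test of H0: theta <= 0.5, one-sided

def p_standard(t):
    # ordinary p-value of a discrete statistic: P(T >= t) at the LFC
    return binom.sf(t - 1, n, theta0)

def p_randomized(t, u):
    # randomized p-value: P(T > t) + U * P(T = t) with U ~ Uniform(0, 1)
    return binom.sf(t, n, theta0) + u * binom.pmf(t, n, theta0)

# simulate test statistics under the null at the LFC
T = rng.binomial(n, theta0, size=100_000)
U = rng.uniform(size=T.size)
p_std = p_standard(T)
p_rand = p_randomized(T, U)

# the randomized p-value is exactly Uniform(0, 1); the standard one is conservative
print(np.mean(p_rand <= 0.05), np.mean(p_std <= 0.05))
```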


Modern high-throughput biomedical devices routinely produce data on a large scale, and the analysis of high-dimensional datasets has become commonplace in biomedical studies. However, given thousands or tens of thousands of measured variables in these datasets, extracting meaningful features poses a challenge. In this article, we propose a procedure to evaluate the strength of the associations between a nominal (categorical) response variable and multiple features simultaneously.


We present a new approach to modeling the future development of extreme temperatures globally and on the time-scale of several centuries by using non-stationary generalized extreme value distributions in combination with logistic functions. The statistical models we propose are applied to annual maxima of daily temperature data from fully coupled climate models spanning the years 1850 through 2300. They enable us to investigate how extremes will change depending on the geographic location not only in terms of the magnitude, but also in terms of the timing of the changes.
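As a minimal illustration of the building block involved, a stationary generalized extreme value (GEV) distribution can be fit to a series of annual maxima with scipy; the article's models additionally let the GEV parameters evolve over time via logistic functions, which this sketch omits. All numbers are synthetic, and note that scipy's shape parameter `c` carries the opposite sign of the climatological convention.

```python
import numpy as np
from scipy.stats import genextreme

# synthetic stand-in for annual maxima of daily temperature at one location (°C)
rng = np.random.default_rng(1)
annual_max = genextreme.rvs(c=0.1, loc=30.0, scale=1.5, size=200, random_state=rng)

# maximum-likelihood fit of a stationary GEV
c_hat, loc_hat, scale_hat = genextreme.fit(annual_max)

# 100-year return level: the magnitude exceeded on average once per century
rl_100 = genextreme.ppf(1 - 1 / 100, c_hat, loc=loc_hat, scale=scale_hat)
print(round(loc_hat, 1), round(rl_100, 1))
```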


Large-scale hypothesis testing has become a ubiquitous problem in high-dimensional statistical inference, with broad applications in various scientific disciplines. One relevant application is constituted by imaging mass spectrometry (IMS) association studies, where a large number of tests are performed simultaneously in order to identify molecular masses that are associated with a particular phenotype, for example, a cancer subtype. Mass spectra obtained from matrix-assisted laser desorption/ionization (MALDI) experiments are dependent when considered as statistical quantities.


We are concerned with testing replicability hypotheses for many endpoints simultaneously. This constitutes a multiple test problem with composite null hypotheses. Traditional p-values, which are computed under least favorable parameter configurations (LFCs), are over-conservative in the case of composite null hypotheses.


Multivariate multiple test procedures have received growing attention recently. This is because data generated by modern applications are typically high-dimensional, yet possess pronounced dependencies due to the technical mechanisms involved in the experiments. Hence, it is possible and often necessary to exploit these dependencies in order to achieve reasonable power.


A method was developed to quantify the performance of microorganisms involved in different digestion levels in biogas plants. The test system was based on the addition of butyrate (BCON), ethanol (ECON), acetate (ACON) or propionate (PCON) to biogas sludge samples and the subsequent analysis of CH4 formation in comparison to control samples. The combination of the four values was referred to as the BEAP profile.


The standard approach to the analysis of genome-wide association studies (GWAS) is to test each position in the genome individually for statistical significance of its association with the phenotype under investigation. To improve the analysis of GWAS, we propose a combination of machine learning and statistical testing that takes correlation structures within the set of SNPs under investigation into account in a mathematically well-controlled manner. The novel two-step algorithm, COMBI, first trains a support vector machine to determine a subset of candidate SNPs and then performs hypothesis tests for these SNPs together with an adequate threshold correction.
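A simplified sketch of such a two-step screen-then-test scheme follows, with synthetic genotypes, a signal planted at one SNP, and a plain Bonferroni correction standing in for COMBI's calibrated threshold correction; all parameters are illustrative.

```python
import numpy as np
from scipy.stats import chi2_contingency
from sklearn.svm import LinearSVC

rng = np.random.default_rng(2)
n, p, k = 400, 100, 10  # samples, SNPs, size of the screened subset

# synthetic genotypes (minor-allele counts 0/1/2) with a signal planted at SNP 7
X = rng.integers(0, 3, size=(n, p)).astype(float)
y = (X[:, 7] >= 1).astype(int)
flip = rng.random(n) < 0.1
y[flip] = 1 - y[flip]  # 10% label noise

# step 1: screen SNPs by the magnitude of linear SVM weights
svm = LinearSVC(C=0.1, dual=False, max_iter=5000).fit(X, y)
candidates = np.argsort(-np.abs(svm.coef_[0]))[:k]

# step 2: chi-squared genotype-phenotype tests on the candidates only,
# with a Bonferroni correction over the k remaining tests
hits = []
for j in candidates:
    table = np.array([[np.sum((X[:, j] == g) & (y == c)) for g in (0, 1, 2)]
                      for c in (0, 1)])
    if chi2_contingency(table)[1] < 0.05 / k:
        hits.append(int(j))
print(hits)
```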


Signal detection in functional magnetic resonance imaging (fMRI) inherently involves the problem of testing a large number of hypotheses. A popular strategy to address this multiplicity is the control of the false discovery rate (FDR). In this work we consider the case where prior knowledge is available to partition the set of all hypotheses into disjoint subsets or families.
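One common way to exploit such a partition, sketched below on synthetic p-values, is to apply the Benjamini-Hochberg step-up procedure separately within each family; the article's actual procedure may differ in how it allocates the FDR budget across families.

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.05):
    """Boolean rejection mask of the BH step-up procedure at FDR level q."""
    p = np.asarray(pvals)
    m = p.size
    order = np.argsort(p)
    below = p[order] <= q * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])
        reject[order[:k + 1]] = True
    return reject

# two synthetic families of p-values (e.g. two anatomically defined regions)
rng = np.random.default_rng(3)
families = {
    "region_A": np.concatenate([rng.uniform(0, 0.001, 5), rng.uniform(size=95)]),
    "region_B": rng.uniform(size=100),  # a family containing only true nulls
}
for name, p in families.items():
    print(name, int(benjamini_hochberg(p).sum()))
```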


We are concerned with statistical inference for 2 × C × K contingency tables in the context of genetic case-control association studies. Multivariate methods based on asymptotic Gaussianity of vectors of test statistics require information about the asymptotic correlation structure among these test statistics under the global null hypothesis. In the case of C=2, we show that for a wide variety of test statistics this asymptotic correlation structure is given by the standardized linkage disequilibrium matrix of the K loci under investigation.
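An empirical counterpart of this standardized linkage disequilibrium matrix is simply the Pearson correlation matrix of the per-locus allele counts. A synthetic sketch, where a shared latent haplotype induces dependence between loci:

```python
import numpy as np

rng = np.random.default_rng(4)
n, K = 500, 4  # individuals and loci (illustrative)

# synthetic genotypes: a shared latent haplotype makes the loci correlated
latent = rng.binomial(1, 0.5, size=(n, 1))
G = latent + rng.binomial(1, 0.4, size=(n, K))  # allele counts in {0, 1, 2}

# Pearson correlations of per-locus allele counts: the empirical
# counterpart of the standardized LD matrix of the K loci
R = np.corrcoef(G, rowvar=False)
print(np.round(R, 2))
```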


Motivation: When analyzing a case group of patients with ultra-rare disorders, the patients' ethnicities are often diverse and the data quality may vary. The population substructure in the case group, as well as the heterogeneous data quality, can cause substantial inflation of test statistics and result in spurious associations in case-control studies if not properly adjusted for. Existing techniques to correct for such confounding effects were developed specifically for common variants and are not applicable to rare variants.


Genetic association studies lead to simultaneous categorical data analysis. The sample for every genetic locus consists of a contingency table containing the numbers of observed genotype-phenotype combinations. Under a case-control design, the row counts of every table are identical and fixed, while the column counts are random.


Epigenetic research leads to complex data structures. Since parametric model assumptions for the distribution of epigenetic data are hard to verify, we introduce in the present work a nonparametric statistical framework for two-group comparisons. Furthermore, epigenetic analyses are often performed at various genetic loci simultaneously.
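A minimal sketch of a rank-based (hence nonparametric) two-group comparison carried out at many loci simultaneously, here with Mann-Whitney tests and a plain Bonferroni correction on synthetic methylation levels; the framework in the article is more general.

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(6)
n_loci = 50

# synthetic methylation levels for two groups, with a shift at the first 5 loci
group1 = rng.beta(2, 5, size=(20, n_loci))
group2 = rng.beta(2, 5, size=(20, n_loci))
group2[:, :5] += 0.3

# one rank-based two-group test per locus, Bonferroni-adjusted across loci
pvals = np.array([mannwhitneyu(group1[:, j], group2[:, j],
                               alternative="two-sided").pvalue
                  for j in range(n_loci)])
significant = np.flatnonzero(pvals < 0.05 / n_loci)
print(significant)
```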


Objective: The aim of this longitudinal study was to identify predictors of instantaneous well-being in patients with amyotrophic lateral sclerosis (ALS). Based on flow theory, well-being was expected to be highest when perceived demands and perceived control were in balance; thinking about the past was expected to be a risk factor for rumination, which would in turn reduce well-being.

Methods: Using the experience sampling method, data on current activities, associated aspects of perceived demands, control, and well-being were collected from 10 patients with ALS three times a day for two weeks.


The adaptive immune system is involved in tumor establishment and aggressiveness. Tumors of the ovaries, an immune-privileged organ, spread via transcoelomic routes and rarely to distant organs. This contrasts with tumors of non-immune-privileged organs, which often disseminate hematogenously to distant organs.


With exome sequencing becoming a tool for mutation detection in routine diagnostics there is an increasing need for platform-independent methods of quality control. We present a genotype-weighted metric that allows comparison of all the variant calls of an exome to a high-quality reference dataset of an ethnically matched population. The exome-wide genotyping accuracy is estimated from the distance to this reference set, and does not require any further knowledge about data generation or the bioinformatics involved.


Objective: In brain-computer interface (BCI) research, systems based on event-related potentials (ERP) are considered particularly successful and robust. This stems in part from the repeated stimulation, which counteracts the low signal-to-noise ratio in electroencephalograms. Repeated stimulation leads to an optimization problem, as more repetitions also cost more time.


Fatty acids, uric acid and glucose are thought to contribute to subclinical inflammation associated with diabetes mellitus. We tested whether co-incubation of free fatty acids and uric acid or glucose influences the secretion of immune mediators from stimulated human whole blood in vitro. Fresh whole blood samples from 20 healthy subjects, 20 patients with type 1 diabetes and 23 patients with type 2 diabetes were incubated for 24 h with palmitic acid (PAL), linolenic acid (LIN) or eicosapentaenoic acid (EPA) alone or together with elevated concentrations of uric acid or glucose.


Connecting multiple testing with binary classification, we derive a false discovery rate-based classification approach for two-class mixture models, where the available data (represented as feature vectors) for each individual comparison take values in R^d for some d ≥ 1 and may exhibit certain forms of autocorrelation. This generalizes previous findings for the independent case in dimension d = 1. Two resulting classification procedures are described which allow for incorporating prior knowledge about class probabilities and for user-supplied weighting of the severity of misclassifying a member of the "0"-class as "1" and vice versa.
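The connection can be illustrated with a one-dimensional two-component Gaussian mixture: an observation is classified as "1" whenever its posterior probability of belonging to the "0"-class (its local FDR) falls below a threshold. The densities and the mixing weight are assumed known in this sketch, whereas practical procedures would estimate them.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(7)
pi0 = 0.8  # prior probability of the "0"-class, assumed known here

# one-dimensional feature: class 0 ~ N(0, 1), class 1 ~ N(2, 1)
is_one = rng.random(5000) >= pi0
x = rng.normal(loc=np.where(is_one, 2.0, 0.0), scale=1.0)

# local FDR of calling "1": posterior probability of class "0" given x
f0, f1 = norm.pdf(x, 0, 1), norm.pdf(x, 2, 1)
lfdr = pi0 * f0 / (pi0 * f0 + (1 - pi0) * f1)

# classify as "1" wherever the local FDR is small enough
calls = lfdr < 0.2
fdp = np.mean(~is_one[calls])  # realized false discovery proportion
print(int(calls.sum()), round(float(fdp), 3))
```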


We study exact tests for (2 × 2) and (2 × 3) contingency tables, in particular exact chi-squared tests and exact tests of Fisher type. In practice, these tests are typically carried out without randomization, leading to reproducible results but not exhausting the significance level. We discuss how this can lead to methodological and practical issues in a multiple testing framework when many tables are simultaneously under consideration, as in genetic association studies.
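The non-exhaustion of the level is easy to quantify: conditionally on the margins, the relevant cell count of a 2 × 2 table follows a hypergeometric distribution, and summing its point masses over the nonrandomized rejection region gives an attained size strictly below the nominal level. A sketch for one choice of margins:

```python
from scipy.stats import hypergeom

# 2x2 table with fixed margins: row sums (10, 10), first column sum 10;
# conditionally on the margins the upper-left cell X ~ Hypergeom(M=20, n=10, N=10)
M, n_succ, N = 20, 10, 10
alpha = 0.05

# attained size of the nonrandomized one-sided exact test: sum the null point
# masses of all outcomes whose p-value P(X >= x) falls at or below alpha
size = 0.0
for x in range(N + 1):
    if hypergeom.sf(x - 1, M, n_succ, N) <= alpha:
        size += hypergeom.pmf(x, M, n_succ, N)
print(size)  # strictly below alpha; a randomized test would attain alpha exactly
```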


With the availability of next-generation sequencing (NGS) technology, it is expected that sequence variants may be called on a genomic scale. Here, we demonstrate that a deeper understanding of the distribution of the variant call frequencies at heterozygous loci in NGS data sets is a prerequisite for sensitive variant detection. We model the crucial steps in an NGS protocol as a stochastic branching process and derive a mathematical framework for the expected distribution of alleles at heterozygous loci before measurement, that is, before sequencing.
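The flavor of such a branching-process model can be conveyed by a toy Galton-Watson simulation of PCR amplification, where each molecule duplicates with a fixed per-cycle efficiency; the article's framework is analytical, and all parameters below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)

def pcr_branching(n_start, cycles, efficiency, rng):
    """Galton-Watson PCR model: each molecule duplicates with prob `efficiency` per cycle."""
    n = n_start
    for _ in range(cycles):
        n += rng.binomial(n, efficiency)
    return n

# heterozygous locus: both alleles enter amplification with the same template count
fractions = []
for _ in range(2000):
    a = pcr_branching(10, 15, 0.8, rng)
    b = pcr_branching(10, 15, 0.8, rng)
    fractions.append(a / (a + b))
fractions = np.array(fractions)

# amplification noise spreads the observed allele fraction around the ideal 0.5
print(round(float(fractions.mean()), 3), round(float(fractions.std()), 3))
```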


Background: After about 30 years of research on Brain-Computer Interfaces (BCIs), there is little knowledge about the phenomenon that some people, healthy individuals as well as individuals with disease, are unable to learn BCI control. To elucidate this "BCI inefficiency" phenomenon, the current study investigated whether psychological parameters, such as attention span, personality or motivation, could predict performance in a single session with a BCI controlled by modulation of sensorimotor rhythms (SMR) with motor imagery.

Methods: A total of N=83 healthy BCI novices took part in the session.


Machine learning and pattern recognition algorithms have in recent years developed into a workhorse of brain imaging and the computational neurosciences, as they are instrumental for mining vast amounts of neural data of ever-increasing measurement precision and for detecting minuscule signals in an overwhelming noise floor. They provide the means to decode and characterize task-relevant brain states and to distinguish them from non-informative brain signals. While this machinery has undoubtedly helped to gain novel biological insights, it also holds the danger of potential unintentional abuse.


We localize the sources of class-dependent event-related desynchronisation (ERD) of the mu-rhythm related to different types of motor imagery in Brain-Computer Interfacing (BCI) sessions. Our approach is based on the localization of single-trial Fourier coefficients using sparse basis field expansions (S-FLEX). The analysis reveals focal sources in the sensorimotor cortices, a finding which can be regarded as proof of the expected neurophysiological origin of the BCI control signal.
