Publications by authors named "Malgorzata Bogdan"

Recently there have been tremendous efforts to develop statistical procedures that make it possible to identify subgroups of patients for whom certain treatments are effective. This article focuses on the selection of prognostic and predictive genetic biomarkers based on a relatively large number of candidate Single Nucleotide Polymorphisms (SNPs). We consider models which include prognostic markers as main effects and predictive markers as interaction effects with treatment.


Index tracking and hedge fund replication aim at cloning the return time series properties of a given benchmark, using either only a subset of its original constituents or a set of risk factors. In this paper, we propose a model that relies on the Sorted L-One Penalized Estimator, called SLOPE, for index tracking and hedge fund replication. We show that SLOPE is capable not only of providing sparsity, but also of forming groups among assets depending on their partial correlation with the index or the hedge fund return time series.


Ghost quantitative trait loci (QTL) are false discoveries in QTL mapping that arise due to the "accumulation" of polygenic effects uniformly distributed over the genome. The locations on the chromosome that are strongly correlated with the total of the polygenic effects depend on a specific sample correlation structure determined by the genotypes at all loci. The problem is particularly severe when the same genotypes are used to study multiple QTL, e.


Human health is strongly associated with a person's lifestyle and level of physical activity. Therefore, characterization of daily human activity is an important task. Accelerometers have been used to obtain precise measurements of body acceleration.


Sorted L-One Penalized Estimation (SLOPE, Bogdan et al., 2013, 2015) is a relatively new convex optimization procedure which allows for adaptive selection of regressors under sparse high dimensional designs. Here we extend the idea of SLOPE to deal with the situation when one aims at selecting whole groups of explanatory variables instead of single regressors.


Background: IDeAl (Integrated designs and analysis of small population clinical trials) is an EU funded project developing new statistical design and analysis methodologies for clinical trials in small population groups. Here we provide an overview of IDeAl findings and give recommendations to applied researchers.

Method: The description of the findings is broken down by the nine scientific IDeAl work packages and summarizes results from the project's more than 60 publications to date in peer reviewed journals.


In genome-wide association studies (GWAS), genetic loci that influence complex traits are localized by inspecting associations between genotypes of genetic markers and the values of the trait of interest. On the other hand, admixture mapping, which is performed in the case of populations consisting of a recent mix of two ancestral groups, relies on the ancestry information at each locus (locus-specific ancestry). Recently it has been proposed to jointly model genotype and locus-specific ancestry within the framework of single marker tests.


With the rise of both the number and the complexity of traits of interest, control of the false discovery rate (FDR) in genetic association studies has become an increasingly appealing and accepted target for multiple comparison adjustment. While a number of robust FDR-controlling strategies exist, the nature of this error rate is intimately tied to the precise way in which discoveries are counted, and the performance of FDR-controlling procedures is satisfactory only if there is a one-to-one correspondence between what scientists describe as unique discoveries and the number of rejected hypotheses. The presence of linkage disequilibrium between markers in genome-wide association studies (GWAS) often leads researchers to consider the signal associated with multiple neighboring SNPs as indicating the existence of a single genomic locus with possible influence on the phenotype.
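
To make the counting issue concrete, here is a minimal sketch of one widely used FDR-controlling strategy, the Benjamini-Hochberg step-up procedure. The abstract does not say which procedure the authors use, so the choice of Benjamini-Hochberg, the function name `benjamini_hochberg`, and the p-values in the usage note are illustrative assumptions only.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return the indices of hypotheses rejected at FDR level q.

    Illustrative sketch of the Benjamini-Hochberg step-up rule:
    find the largest rank k with p_(k) <= k * q / m and reject
    the k hypotheses with the smallest p-values.
    """
    m = len(p_values)
    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * q / m:
            k = rank  # keep the largest rank that passes its threshold
    return sorted(order[:k])
```

For example, `benjamini_hochberg([0.01, 0.03, 0.035, 0.5], q=0.05)` rejects the first three hypotheses, even though the second p-value misses its own threshold, because the third one passes. Each rejected hypothesis is counted as one discovery here, which is exactly the convention that stops matching scientific practice when several neighboring SNPs are read as a single locus.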


We introduce a new estimator for the vector of coefficients $\beta$ in the linear model $y = X\beta + z$, where $X$ has dimensions $n \times p$ with $p$ possibly larger than $n$. SLOPE, short for Sorted L-One Penalized Estimation, is the solution to $\min_{b \in \mathbb{R}^p} \frac{1}{2}\|y - Xb\|_{\ell_2}^2 + \lambda_1 |b|_{(1)} + \lambda_2 |b|_{(2)} + \cdots + \lambda_p |b|_{(p)}$, where $\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_p \ge 0$ and $|b|_{(1)} \ge |b|_{(2)} \ge \ldots \ge |b|_{(p)}$ are the decreasing absolute values of the entries of $b$. This is a convex program and we demonstrate a solution algorithm whose computational complexity is roughly comparable to that of classical $\ell_1$ procedures such as the Lasso.
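
The sorted L-one penalty pairs the largest weight with the largest absolute coefficient, the second largest weight with the second largest absolute coefficient, and so on. The sketch below only evaluates the penalty and the SLOPE objective for a given coefficient vector; it is not the solution algorithm mentioned in the abstract, and the function names `sorted_l1_penalty` and `slope_objective` are hypothetical names chosen for illustration.

```python
def sorted_l1_penalty(b, lam):
    """Evaluate sum_i lam[i] * |b|_(i), where |b|_(1) >= |b|_(2) >= ...

    b   : list of coefficients
    lam : list of weights, required to be non-increasing and non-negative
    """
    if len(b) != len(lam):
        raise ValueError("b and lam must have the same length")
    if any(x < y for x, y in zip(lam, lam[1:])) or (lam and lam[-1] < 0):
        raise ValueError("lam must be non-increasing and non-negative")
    # Absolute values of the coefficients in decreasing order.
    abs_sorted = sorted((abs(x) for x in b), reverse=True)
    return sum(l * a for l, a in zip(lam, abs_sorted))


def slope_objective(y, X, b, lam):
    """Evaluate 0.5 * ||y - X b||^2 plus the sorted L-one penalty.

    X is a list of rows, each row a list with one entry per coefficient.
    """
    residuals = [
        yi - sum(xij * bj for xij, bj in zip(row, b))
        for yi, row in zip(y, X)
    ]
    return 0.5 * sum(r * r for r in residuals) + sorted_l1_penalty(b, lam)
```

With `b = [1.0, -3.0, 2.0]` and `lam = [3.0, 2.0, 1.0]` the penalty is 3·3 + 2·2 + 1·1 = 14. When all weights are equal, the penalty reduces to the ordinary Lasso penalty, a constant times the sum of absolute values.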


To locate multiple interacting quantitative trait loci (QTL) influencing a trait of interest within experimental populations, methods such as Cockerham's model are usually applied. Within this framework, interactions are understood as the part of the joint effect of several genes which cannot be explained as the sum of their additive effects. However, if a change in the phenotype (such as disease) is caused by Boolean combinations of genotypes of several QTLs, Cockerham's approach is often incapable of identifying them properly.


The problem of locating quantitative trait loci (QTL) for experimental populations can be approached by multiple regression analysis. In this context variable selection using a modification of the Bayesian Information Criterion (mBIC) has been well established in the past. In this article a memetic algorithm (MA) is introduced to find the model which minimizes the selection criterion.


We consider the problem of locating multiple interacting quantitative trait loci (QTL) influencing traits measured in counts. In many applications the distribution of the count variable has a spike at zero. Zero-inflated generalized Poisson regression (ZIGPR) allows for an additional probability mass at zero and hence an improvement in the detection of significant loci.
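
The "additional probability mass at zero" can be illustrated with a simplified sketch. The abstract uses a zero-inflated generalized Poisson model; the code below assumes an ordinary Poisson count component instead, so it only shows the zero-inflation mechanism and not the authors' full model. The function name `zip_pmf` and the parameter names `mu` (Poisson mean) and `pi` (extra zero probability) are illustrative assumptions.

```python
import math


def zip_pmf(k, mu, pi):
    """Probability of observing count k under a zero-inflated Poisson model.

    With probability pi the observation is a structural zero; otherwise
    it is drawn from a Poisson distribution with mean mu.
    """
    if k < 0:
        return 0.0
    poisson = math.exp(-mu) * mu ** k / math.factorial(k)
    # Zeros can come from either component; positive counts only from Poisson.
    extra_zero = pi if k == 0 else 0.0
    return extra_zero + (1.0 - pi) * poisson
```

Setting `pi = 0` recovers the plain Poisson distribution, while any `pi > 0` raises the probability of a zero count above what the Poisson component alone would give.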


The modified version of Bayesian Information Criterion (mBIC) is a relatively simple model selection procedure that can be used when locating multiple interacting quantitative trait loci (QTL). Our earlier work demonstrated the statistical properties of mBIC for situations where the average genetic map interval is at least 5 cM. In this work mBIC is adapted to genome searches based on a dense map and, more importantly, to the situation where consecutive QTL and interactions are located by multiple interval mapping.


In previous work, a modified version of the Bayesian information criterion (mBIC) was proposed to locate multiple interacting quantitative trait loci (QTL). Simulation studies and real data analysis demonstrate good properties of the mBIC in situations where the error distribution is approximately normal. However, as with other standard techniques of QTL mapping, the performance of the mBIC strongly deteriorates when the trait distribution is heavy tailed or when the data contain a significant proportion of outliers.


A modified version (mBIC) of the Bayesian Information Criterion (BIC) has been previously proposed for backcross designs to locate multiple interacting quantitative trait loci. In this article, we extend the method to intercross designs. We also propose two modifications of the mBIC.


The problem of locating multiple interacting quantitative trait loci (QTL) can be addressed as a multiple regression problem, with marker genotypes being the regressor variables. An important and difficult part in fitting such a regression model is the estimation of the QTL number and respective interactions. Among the many model selection criteria that can be used to estimate the number of regressor variables, none are used to estimate the number of interactions.


Motivation: Pairwise local sequence alignment is commonly used to search databases for sequences related to some query sequence. Alignments are obtained using a scoring matrix that takes into account the different frequencies of occurrence of the various types of amino acid substitutions. Software like BLAST provides the user with a set of scoring matrices to choose from, and in the literature it is sometimes recommended to try several scoring matrices on the sequences of interest.
