Gene set analysis, a popular approach for analyzing high-throughput gene expression data, aims to identify sets of genes that show enriched expression patterns between two conditions. In addition to the multitude of methods available for this task, users are typically left with many options when creating the required input and specifying the internal parameters of the chosen method. This flexibility can lead to uncertainty about the "right" choice, further reinforced by a lack of evidence-based guidance.
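One common variant of gene set analysis is an over-representation test based on the hypergeometric distribution. As a minimal stdlib sketch of that idea (the function name and inputs are illustrative, not from the article; many other gene set analysis methods exist):

```python
from math import comb

def hypergeom_pvalue(k, K, n, N):
    """P(X >= k) for X ~ Hypergeometric(N, K, n): among N background genes,
    K belong to the gene set; n are differentially expressed; k of those
    n fall in the gene set. A small p-value suggests over-representation."""
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / comb(N, n)
```

Even in this toy form, the user-facing choices the abstract alludes to are visible: how the background (N) is defined, how "differentially expressed" (n) is thresholded, and which gene sets (K) are tested all change the result.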
Simulation studies are widely used for evaluating the performance of statistical methods in psychology. However, the quality of simulation studies can vary widely in terms of their design, execution, and reporting. In order to assess the quality of typical simulation studies in psychology, we reviewed 321 articles published in 2021 and 2022, among which 100/321 = 31.
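A minimal example of the kind of simulation study discussed here, in plain Python: repeatedly draw samples from a known distribution and compare two estimators of its centre. The estimand and estimators are chosen purely for illustration.

```python
import random
import statistics

def simulate_mean_vs_median(n=30, n_sims=2000, seed=42):
    """Monte Carlo comparison of the sampling variance of the sample mean
    versus the sample median as estimators of the centre of N(0, 1)."""
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(n_sims):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
        medians.append(statistics.median(sample))
    # for normal data the mean is the more efficient estimator
    return statistics.pvariance(means), statistics.pvariance(medians)
```

Design, execution, and reporting choices are all present even in this sketch: the data-generating mechanism, the number of repetitions, the seed, and which performance measure (here, sampling variance) is reported.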
Metal oxide sensor-based electronic nose (E-Nose) technology provides an easy-to-use method for breath analysis by detecting volatile organic compound (VOC)-induced changes in electrical conductivity. The resulting signal patterns are then analyzed by machine learning (ML) algorithms. This study aimed to establish breath analysis by E-Nose technology as a diagnostic tool for severe acute respiratory syndrome coronavirus type 2 (SARS-CoV-2) pneumonia within a multi-analyst experiment.
Background: Random forests have become popular for clinical risk prediction modeling. In a case study on predicting ovarian malignancy, we observed training AUCs close to 1. Although this suggests overfitting, performance was competitive on test data.
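The training-versus-test contrast described above can be mimicked with a minimal stdlib sketch. A 1-nearest-neighbour classifier (a stand-in memorizing learner, not a random forest) fits its training data perfectly by construction, yet its test performance is governed by class overlap, not by the perfect training fit. The simulated data and all names are illustrative assumptions.

```python
import random

def nn1_predict(train, x):
    # 1-nearest-neighbour: return the label of the closest training point
    return min(train, key=lambda p: (p[0] - x) ** 2)[1]

def accuracy(train, data):
    return sum(nn1_predict(train, x) == y for x, y in data) / len(data)

rng = random.Random(0)

def draw(n_per_class):
    # two overlapping 1-D classes centred at 0 and 2
    return [(rng.gauss(2.0 * y, 1.0), y) for y in (0, 1) for _ in range(n_per_class)]

train, test = draw(100), draw(100)
train_acc = accuracy(train, train)  # 1.0: each point is its own nearest neighbour
test_acc = accuracy(train, test)    # well above chance despite the "overfit" look
```

The point mirrors the abstract: a perfect apparent (training) fit is not, by itself, evidence of poor out-of-sample performance.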
When different researchers study the same research question using the same dataset, they may obtain different and potentially even conflicting results. This is because there is often substantial flexibility in researchers' analytical choices, an issue also referred to as "researcher degrees of freedom". Combined with selective reporting of the smallest p-value or largest effect, researcher degrees of freedom may lead to an increased rate of false-positive and overoptimistic results.
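The inflation mechanism can be demonstrated in a few lines of stdlib Python: under the null hypothesis each valid analysis strategy yields a Uniform(0, 1) p-value, so reporting only the smallest of several p-values rejects far more often than the nominal 5%. The function below is an illustrative sketch, not code from the article.

```python
import random

def min_p_rejection_rate(n_strategies, n_sims=20000, alpha=0.05, seed=1):
    """Empirical type I error rate when the smallest of n_strategies
    null p-values is selectively reported and tested at level alpha.
    Theoretical rate: 1 - (1 - alpha) ** n_strategies."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        # each strategy's p-value is Uniform(0, 1) under the null
        smallest = min(rng.random() for _ in range(n_strategies))
        rejections += smallest < alpha
    return rejections / n_sims
```

With one strategy the rate stays near 5%; with ten strategies it approaches 1 - 0.95**10, i.e. roughly 40%, which is the false-positive inflation the abstract warns about.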
Quantitative bias analysis (QBA) permits assessment of the expected impact of various imperfections of the available data on the results and conclusions of a particular real-world study. This article extends QBA methodology to multivariable time-to-event analyses with right-censored endpoints, possibly including time-varying exposures or covariates. The proposed approach employs data-driven simulations, which preserve important features of the data at hand while offering flexibility in controlling the parameters and assumptions that may affect the results.
The TRIPOD (Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis) statement was published in 2015 to provide the minimum reporting recommendations for studies developing or evaluating the performance of a prediction model. Methodological advances in the field of prediction have since included the widespread use of artificial intelligence (AI) powered by machine learning methods to develop prediction models. An update to the TRIPOD statement is thus needed.
Tuning hyperparameters, such as the regularization parameter in Ridge or Lasso regression, is often aimed at improving the predictive performance of risk prediction models. In this study, various hyperparameter tuning procedures for clinical prediction models were systematically compared and evaluated in low-dimensional data. The focus was on out-of-sample predictive performance (discrimination, calibration, and overall prediction error) of risk prediction models developed using Ridge, Lasso, Elastic Net, or Random Forest.
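One standard tuning procedure of the kind compared here is k-fold cross-validation over a grid of regularization parameters. A stdlib sketch for the simplest possible case, ridge regression with a single centred predictor and no intercept, where the slope has the closed form beta = sum(x*y) / (sum(x^2) + lambda); all names are illustrative assumptions:

```python
import random

def ridge_fit(x, y, lam):
    # closed-form ridge slope for one centred predictor, no intercept
    return sum(a * b for a, b in zip(x, y)) / (sum(a * a for a in x) + lam)

def cv_select_lambda(x, y, lambdas, k=5, seed=0):
    """Pick the lambda with the smallest k-fold cross-validated squared error."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    folds = [set(idx[i::k]) for i in range(k)]
    best_lam, best_err = None, float("inf")
    for lam in lambdas:
        err = 0.0
        for fold in folds:
            tr = [i for i in idx if i not in fold]
            beta = ridge_fit([x[i] for i in tr], [y[i] for i in tr], lam)
            # out-of-fold squared prediction error
            err += sum((y[i] - beta * x[i]) ** 2 for i in fold)
        if err < best_err:
            best_lam, best_err = lam, err
    return best_lam
```

Even here, tuning choices multiply quickly: the grid, the number of folds, the fold assignment, and the loss used for selection can all change which hyperparameter "wins", which is precisely why systematic comparisons of tuning procedures are needed.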
Background And Purpose: There is no randomized evidence comparing whole-brain radiotherapy (WBRT) and stereotactic radiosurgery (SRS) in the treatment of multiple brain metastases. This prospective nonrandomized controlled single-arm trial attempts to reduce this evidence gap until prospective randomized controlled trial results are available.
Material And Methods: We included patients with 4-10 brain metastases and ECOG performance status ≤ 2 from all histologies except small-cell lung cancer, germ cell tumors, and lymphoma.
Background: In high-dimensional data (HDD) settings, the number of variables associated with each observation is very large. Prominent examples of HDD in biomedical research include omics data with a large number of variables such as many measurements across the genome, proteome, or metabolome, as well as electronic health records data that have large numbers of variables recorded for each patient. The statistical analysis of such data requires knowledge and experience, sometimes of complex methods adapted to the respective research questions.
The constant development of new data analysis methods in many fields of research is accompanied by an increasing awareness that these new methods often perform better in their introductory paper than in subsequent comparison studies conducted by other researchers. We attempt to explain this discrepancy by conducting a systematic experiment that we call "cross-design validation of methods". In the experiment, we select two methods designed for the same data analysis task, reproduce the results shown in each paper, and then reevaluate each method based on the study design (i.
Although new biostatistical methods are published at a very high rate, many of these developments are not trustworthy enough to be adopted by the scientific community. We propose a framework to think about how a piece of methodological work contributes to the evidence base for a method. Similar to the well-known phases of clinical research in drug development, we propose to define four phases of methodological research.
In recent years, unsupervised analysis of microbiome data, such as microbial network analysis and clustering, has increased in popularity. Many new statistical and computational methods have been proposed for these tasks. This multiplicity of analysis strategies poses a challenge for researchers, who are often unsure which method(s) to use and might be tempted to try different methods on their dataset to look for the "best" ones.
Infantile nephropathic cystinosis, due to impaired transport of cystine out of lysosomes, occurs with an incidence of 1 in 100,000-200,000 live births. It is characterized by renal Fanconi syndrome in the first year of life and glomerular dysfunction progressing to end-stage kidney disease by approximately 10 years of age. Treatment with oral cysteamine therapy helps preserve glomerular function, but affected individuals eventually require kidney replacement therapy.
Background: Pseudoprogression (PsP) or radiation necrosis (RN) may frequently occur after cranial radiotherapy and show an imaging pattern similar to that of progressive disease (PD). We aimed to evaluate the diagnostic accuracy of magnetic resonance imaging-based contrast clearance analysis (CCA) in this clinical setting.
Patients And Methods: Patients with equivocal imaging findings after cranial radiotherapy were consecutively included in this monocentric prospective study.
Background: A casemix classification based on patients' needs can serve to better describe the patient group in palliative care and thus help to develop adequate future care structures and enable national benchmarking and quality control. However, in Germany there is no such evidence-based system to differentiate the complexity of patients' needs in palliative care. Therefore, this study aims to develop a patient-oriented, nationally applicable complexity and casemix classification for adult palliative care patients in Germany.
For a given research question, there are usually a large variety of possible analysis strategies acceptable according to the scientific standards of the field, and there are concerns that this multiplicity of analysis strategies plays an important role in the non-replicability of research findings. Here, we define a general framework on common sources of uncertainty arising in computational analyses that lead to this multiplicity, and apply this framework within an overview of approaches proposed across disciplines to address the issue. Armed with this framework, and a set of recommendations derived therefrom, researchers will be able to recognize strategies applicable to their field and use them to generate findings more likely to be replicated in future studies, ultimately improving the credibility of the scientific process.
In health research, statistical methods are frequently used to address a wide variety of research questions. For almost every analytical challenge, different methods are available. But how do we choose between them, and how do we judge whether the chosen method is appropriate for our specific study? As in any science, experiments can be run in statistics to find out which methods should be used under which circumstances.