Enhancing site selection strategies in clinical trial recruitment using real-world data modeling.

PLoS One

R&D Data Science & Digital Health, Janssen Pharmaceutica NV, Beerse, Antwerp, Belgium.

Published: March 2024

Slow patient enrollment, or failure to enroll the required number of patients, disrupts clinical trial timelines. To meet planned recruitment targets, site selection strategies are used during clinical trial planning to identify the research sites most likely to recruit a sufficiently high number of subjects within trial timelines. We developed a machine learning approach that outperforms baseline methods in ranking research sites by their expected recruitment in future studies. The approach uses indication-level historical recruitment data and real-world data to predict patient enrollment at the site level. We define covariates based on published recruitment hypotheses and examine the effect of these covariates in predicting patient enrollment. We compare the performance of a linear and a non-linear machine learning model with common industry baselines constructed from historical recruitment data. Performance is evaluated and reported for two disease indications, inflammatory bowel disease and multiple myeloma, both of which are actively being pursued in clinical development. We validate the recruitment hypotheses by reviewing each covariate's relationship with patient recruitment. For both indications, the non-linear model significantly outperforms the baselines and the linear model on the test set. In this paper, we present a machine learning approach to site selection that incorporates site-level recruitment and real-world patient data. The model ranks research sites by predicting the number of recruited patients, and our results suggest that it can improve site ranking compared to common industry baselines.
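As an illustration of the kind of historical-recruitment baseline the abstract refers to, the following minimal Python sketch scores each site by its mean past enrollment, ranks the sites, and compares that ranking with later observed enrollment. The site names, enrollment counts, and the use of Spearman rank correlation as the evaluation measure are all assumptions made for this example; the abstract does not specify the baselines or the metric the authors used.

```python
from statistics import mean

# Hypothetical enrollment counts per site from past studies of one indication.
# All numbers are invented for illustration.
history = {
    "site_A": [12, 15, 9],
    "site_B": [3, 4],
    "site_C": [20, 18, 25],
    "site_D": [7],
}

# Hypothetical enrollment observed in a later study, used only for evaluation.
observed = {"site_A": 10, "site_B": 5, "site_C": 22, "site_D": 4}


def baseline_scores(history):
    """Score each site by its mean historical enrollment."""
    return {site: mean(counts) for site, counts in history.items()}


def rank_sites(scores):
    """Return site names ordered from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


def spearman(scores, truth):
    """Spearman rank correlation between two score dicts (assumes no ties)."""
    sites = list(scores)
    n = len(sites)

    def ranks(values):
        order = sorted(sites, key=values.get)
        return {site: position for position, site in enumerate(order)}

    predicted, actual = ranks(scores), ranks(truth)
    squared_diffs = sum((predicted[s] - actual[s]) ** 2 for s in sites)
    return 1 - 6 * squared_diffs / (n * (n ** 2 - 1))


scores = baseline_scores(history)
ranking = rank_sites(scores)
rho = spearman(scores, observed)
print(ranking, round(rho, 3))
```

A learned model would replace `baseline_scores` with predictions from site-level covariates, and the same ranking and evaluation steps could then be used to compare it against the baseline.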

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10927105
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0300109

Publication Analysis

Top Keywords

machine learning: 16
site selection: 12
clinical trial: 12
recruitment real-world: 12
patient enrollment: 12
learning approach: 12
recruitment: 9
selection strategies: 8
strategies clinical: 8
trial recruitment: 8

Similar Publications

The feasibility of using machine learning to predict COVID-19 cases.

Int J Med Inform

January 2025

School of Geography and the Environment, University of Oxford, South Parks Road, Oxford OX1 3QY, United Kingdom.

Background: Coronavirus Disease 2019 (COVID-19), caused by the SARS-CoV-2 virus, emerged as a global health crisis in 2019, resulting in widespread morbidity and mortality. A persistent challenge during the pandemic has been the accuracy of reported epidemic data, particularly in underdeveloped regions with limited access to COVID-19 test kits and healthcare infrastructure. In the post-COVID era, this issue remains crucial.

Identification of an ANCA-associated vasculitis cohort using deep learning and electronic health records.

Int J Med Inform

January 2025

Rheumatology and Allergy Clinical Epidemiology Research Center and Division of Rheumatology, Allergy, and Immunology, and Mongan Institute, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA.

Background: ANCA-associated vasculitis (AAV) is a rare but serious disease. Traditional case-identification methods using claims data can be time-intensive and may miss important subgroups. We hypothesized that a deep learning model analyzing electronic health records (EHR) can more accurately identify AAV cases.

Background: The application of natural language processing in medicine has increased significantly, including tasks such as information extraction and classification. Natural language processing plays a crucial role in structuring free-form radiology reports, facilitating the interpretation of textual content, and enhancing data utility through clustering techniques. Clustering allows for the identification of similar lesions and disease patterns across a broad dataset, making it useful for aggregating information and discovering new insights in medical imaging.

Diagnosis of lung cancer using salivary miRNAs expression and clinical characteristics.

BMC Pulm Med

January 2025

Universal Scientific Education and Research Network (USERN), Tehran, Iran.

Objective: Lung cancer (LC), the leading cause of cancer-related death globally, is a diverse illness with various characteristics. Saliva is a readily available biofluid and a rich source of miRNA. It can be collected non-invasively and is easy to transport and store.

Background: Drug-drug interactions (DDIs), especially antagonistic ones, present significant risks to patient safety, underscoring the urgent need for reliable prediction methods. Recently, substructure-based DDI prediction has garnered much attention due to the dominant influence of functional groups and substructures on drug properties. However, existing approaches face challenges regarding the insufficient interpretability of identified substructures and the isolation of chemical substructures.
