The biologics sector has amassed a wealth of data over the past three decades, in line with bioprocess development and manufacturing guidelines. Precise analysis of these data is expected to reveal behavioural patterns in cell populations that can be used to predict how future culture processes might behave. Historical bioprocessing data likely comprise experiments that were conducted with different cell lines, to produce different products, and possibly years apart; this leads to inter-batch variability, and to missing data points arising from human- and instrument-associated technical oversights. These unavoidable complications necessitate a pre-processing step prior to data mining. This study investigated the efficiency of mean imputation and multivariate regression for filling in the missing information in historical bio-manufacturing datasets, and evaluated their performance using symbolic regression models and Bayesian non-parametric models in subsequent data processing. Mean substitution was shown to be a simple and efficient imputation method for relatively smooth, non-dynamical datasets, whereas regression imputation was effective in dynamical datasets with less than 30% missing data, and maintained the existing standard deviation and shape of the distribution. The nature of the missing information, whether Missing Completely At Random, Missing At Random or Missing Not At Random, emerged as the key feature for selecting the imputation method.
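The two imputation strategies named in the abstract can be illustrated with a minimal sketch. This is not the study's implementation: the study used multivariate regression, whereas the sketch below assumes a single predictor (culture time) and an ordinary least-squares line. The function names `mean_impute` and `regression_impute` and the sample readings are hypothetical, chosen only for illustration.

```python
from statistics import mean


def mean_impute(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def regression_impute(x, y):
    """Fill None entries of y using a least-squares line fitted on complete (x, y) pairs.

    Simplification (assumption): one predictor only, not the multivariate
    regression described in the abstract.
    """
    pairs = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    x_bar = mean(xi for xi, _ in pairs)
    y_bar = mean(yi for _, yi in pairs)
    sxx = sum((xi - x_bar) ** 2 for xi, _ in pairs)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in pairs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return [intercept + slope * xi if yi is None else yi for xi, yi in zip(x, y)]


# Hypothetical data: culture time in hours and a measured variable with two gaps.
time_h = [0, 24, 48, 72, 96, 120]
density = [1.0, 2.0, None, 4.0, None, 6.0]

mean_filled = mean_impute(density)
reg_filled = regression_impute(time_h, density)

print(mean_filled)
print(reg_filled)
```

In this toy series the variable rises steadily with time, so mean substitution places the same value in both gaps, while the regression fill follows the trend. That is consistent with the abstract's distinction between smooth, non-dynamical datasets and dynamical ones, but it is only an illustration and says nothing about the 30% threshold or the missingness mechanisms reported in the study.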


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6430751
DOI: http://dx.doi.org/10.1007/s00449-018-02059-5

