Background: In bioprocess development, data analysis needs to (1) give an overview of existing data sets, (2) identify the primary control parameters, (3) determine a useful control direction, and (4) support the planning of future experiments. When multiple data sets are integrated, these needs cannot be properly addressed by regression models that assume a linear input-output relationship or a unimodal response function. Regularized regression and random forests, on the other hand, have several properties that may prove important in this context. For example, they can handle a small number of samples relative to the number of variables, perform feature selection, and support the visualization of response surfaces so that prediction results can be presented in an illustrative way.
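The feature-selection property mentioned above can be illustrated with a minimal sketch of the Lasso solved by cyclic coordinate descent. This is not the implementation used in the study: the function names, the toy data, and the penalty value `alpha=0.5` are assumptions chosen for illustration, and the data are assumed to be centered so that no intercept is needed.

```python
def soft_threshold(rho, alpha):
    """Shrink rho toward zero by alpha; values in [-alpha, alpha] become exactly 0."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_sweeps=100):
    """Minimize (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1.

    X is a list of rows (no intercept column); y is a list of responses.
    Hypothetical helper for illustration only.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    for _ in range(n_sweeps):
        for j in range(p):
            col = [row[j] for row in X]
            z = sum(v * v for v in col) / n
            if z == 0.0:
                continue  # constant-zero column carries no information
            # Covariance of feature j with the residual after adding
            # feature j's current contribution back in.
            rho = sum(v * (r + w[j] * v) for v, r in zip(col, residual)) / n
            new_wj = soft_threshold(rho, alpha) / z
            delta = new_wj - w[j]
            residual = [r - delta * v for v, r in zip(col, residual)]
            w[j] = new_wj
    return w


# Toy data (assumed): the response depends strongly on the first variable
# and only weakly on the second.
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [2.1, -1.9, 1.9, -2.1]
weights = lasso_coordinate_descent(X, y, alpha=0.5)
```

In this toy case the weak second variable receives a coefficient of exactly zero while the strong first variable is kept with a shrunken coefficient, which is the sense in which the L1 penalty performs feature selection.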
Results: In this work, the applicability of regularized regression (Lasso) and random forests (RF) to bioprocess data mining was examined, and their performance was benchmarked against multiple linear regression. As an example, we used data from a culture media optimization study for microbial hydrogen production. All three methods were able to provide a significant model when the five variables of the culture media optimization were included linearly in the model. However, multiple linear regression failed when the pairwise products and squares of the variables were also included. In this case, modeling was still successful with Lasso (correlation between the observed and predicted yield was 0.69) and RF (0.91).
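To make the expanded model concrete, the sketch below builds the linear terms, squares, and pairwise products for five variables and computes a correlation between observed and predicted values. The helper names and the example concentrations are hypothetical, and the use of the Pearson coefficient is an assumption, since the abstract does not state which correlation measure was used.

```python
from itertools import combinations
from math import sqrt


def expand_quadratic(row):
    """Return linear terms, then squares, then all pairwise products."""
    squares = [v * v for v in row]
    products = [a * b for a, b in combinations(row, 2)]
    return list(row) + squares + products


def pearson(observed, predicted):
    """Pearson correlation between two equally long sequences."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / sqrt(var_o * var_p)


# Hypothetical settings of the five culture medium variables.
medium = [1.0, 2.0, 3.0, 4.0, 5.0]
features = expand_quadratic(medium)  # 5 linear + 5 squares + 10 products
```

Five variables expand to 20 model terms. With a limited number of experiments, an unregularized least-squares fit over that many terms can become unstable, which is one plausible reading of why multiple linear regression failed here while Lasso and RF did not.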
Conclusion: We found that both regularized regression and random forests were able to produce feasible models, and that the latter was efficient in capturing the non-linearity in the data. In this kind of bioprocess data mining task, both methods outperform multiple linear regression.
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3750505 | PMC |
http://dx.doi.org/10.1186/1752-0509-7-S1-S5 | DOI Listing |
Nutr J
January 2025
Division of Clinical Epidemiology, Department of Medicine Solna, Karolinska Institutet, Eugeniahemmet T2:02, Stockholm, SE-171 76, Sweden.
Background: mHealth, i.e. mobile-health, strategies may be used as a complement to regular care to support healthy dietary habits in primary care patients.
BMC Infect Dis
January 2025
Department of Epidemiology, School of Public Health, Shahroud University of Medical Sciences, Shahroud, Iran.
Introduction: Cutaneous leishmaniasis (CL) is a zoonotic infection that is endemic in more than 100 countries in Asia, Africa, Europe and America. It is estimated that nearly 20 thousand new cases are reported in Iran annually. This study aimed to investigate the impact of floods on the incidence of leishmaniasis in Golestan province (northeast of Iran) over nine years, from 2015 to 2023.
J Cancer Surviv
January 2025
School of Pharmacy, Faculty of Medicine, The Chinese University of Hong Kong, Hong Kong SAR, China.
Purpose: This study aimed to evaluate the prevalence and predictors of frailty and the association between frailty and neurocognitive impairments among Chinese survivors of childhood cancer.
Methods: A total of 185 survivors of childhood cancer were recruited from a long-term follow-up clinic in Hong Kong (response rate: 94.4%; 48.
Arch Gerontol Geriatr
December 2024
Department of Neurology, The First Center of the PLA General Hospital, Beijing 100853, PR China.
Background: The association between tea consumption, especially different types, and cognitive function has not been adequately explored. This study aimed to investigate the associations of tea consumption, including status, frequency, and type, with cognitive function, considering selection bias.
Methods: We used data from the Chinese Longitudinal Healthy Longevity Survey (CLHLS) in 2018 (N = 8498).
Environ Sci Pollut Res Int
January 2025
Department of Water Engineering, Shahid Bahonar University of Kerman, Kerman, Iran.
Groundwater resources constitute one of the primary sources of freshwater in semi-arid and arid climates. Monitoring groundwater quality is an essential component of environmental management. In this study, a comprehensive comparison was conducted to analyze the performance of nine ensemble and regular machine learning (ML) methods in predicting two water quality parameters, total dissolved solids (TDS) and pH, in an area with semi-arid climate conditions.