Machine learning and related approaches in transcriptomics.

Biochem Biophys Res Commun

School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW, 2052, Australia.

Published: September 2024

Data acquisition used to be the bottleneck in the transcriptomic analytical pipeline. However, recent developments in transcriptome profiling technologies have increased researchers' ability to obtain data, shifting the focus to data analysis. Incorporating machine learning (ML) into traditional analytical methods makes it possible to handle larger volumes of complex data more efficiently. Many bioinformaticians, especially those unfamiliar with ML in the study of human transcriptomics and complex biological systems, face a significant barrier: limited awareness of how ML is currently used in this field. To address this gap, this review introduces these readers to the general types of ML, followed by a comprehensive range of more specific techniques, demonstrated through examples of their incorporation into analytical pipelines for human transcriptome investigations. Important computational aspects such as data pre-processing, task formulation, results (performance of ML models), and validation methods are covered. For practical relevance, the review focuses strongly on studies published within the last five years, almost exclusively examining human transcriptomes, with outcomes compared with standard non-ML tools.

Source
http://dx.doi.org/10.1016/j.bbrc.2024.150225

Similar Publications

A large set of antimalarial molecules (N ~ 15k) was employed from ChEMBL to build a robust random forest (RF) model for the prediction of antiplasmodial activity. Rather than depending on high throughput screening (HTS) data, molecules tested at multiple doses against blood stages of Plasmodium falciparum were used for model development. The open-access and code-free KNIME platform was used to develop a workflow to train the model on 80% of data (N ~ 12k).

Background: Treatment effects are heterogeneous across patients because of differences in their microbiomes, which in turn implies that the treatment effect can be enhanced by manipulating the patient's microbiome profile. Consequently, the coadministration of microbiome-based dietary supplements/therapeutics along with the primary treatment has been the subject of intensive investigation. However, this first requires understanding which microbes help (or hinder) the treatment in curing the patient's disease.

Background: Postoperative fever (POF) is a common occurrence in patients undergoing major surgery, presenting challenges and burdens for both patients and surgeons. This study aims to examine the incidence, identify risk factors, and establish a machine learning-based predictive model for POF following surgery for oral cancer.

Methods: A total of 727 consecutive patients undergoing radical resection of oral cancer were retrospectively investigated.

Background: Intrahepatic cholestasis of pregnancy (ICP) is a liver disorder that occurs in the second and third trimesters of pregnancy and is associated with a significant risk of fetal complications, including premature birth and fetal death. In clinical practice, the diagnosis of ICP is predominantly based on the presence of pruritus in pregnant women and elevated serum total bile acid. However, this approach may result in missed or delayed diagnoses.

Background: Creatinine-based estimated glomerular filtration rate (eGFR) equations are widely used in clinical practice but have inherent limitations. On the other hand, measuring GFR is time-consuming and not available in routine clinical practice. We developed and validated machine learning models to assess the trustworthiness (i.
