Data from biomedical experiments have provided machine learning researchers with an important source of motivation for developing and evaluating new algorithms. The publication of gene expression data derived from microarrays has initiated a new wave of algorithmic development. Microarray data analysis is particularly challenging because a large number of measurements (typically on the order of thousands) are reported for relatively few samples (typically on the order of dozens). Many data sets are now available on the web. It is important that machine learning researchers understand how the data are obtained and which assumptions are necessary in the analysis. Microarray data have the potential to significantly influence machine learning research, not just as a rich and realistic source of cases for testing new algorithms, as the UCI Machine Learning Repository has been in past decades, but also as a main motivation for their development. In this article, we briefly review the biology underlying microarrays, the process of obtaining gene expression measurements, and the rationale behind the common types of analyses involved in a microarray experiment. We outline the main challenges and reiterate critical considerations regarding the construction of supervised learning models that use this type of data. The goal of this article is to familiarize machine learning researchers with data originating from gene expression microarrays.

Source: http://dx.doi.org/10.1016/j.jbi.2004.07.002

