Semi-supervised Learning for Phenotyping Tasks.

Dmitriy Dligach Timothy Miller Guergana K Savova

AMIA Annu Symp Proc

Boston Children's Hospital and Harvard Medical School, Boston, MA.

Published: February 2018

Supervised learning is the dominant approach to automatic electronic health records-based phenotyping, but it is expensive due to the cost of manual chart review. Semi-supervised learning takes advantage of both scarce labeled and plentiful unlabeled data. In this work, we study a family of semi-supervised learning algorithms based on Expectation Maximization (EM) in the context of several phenotyping tasks. We first experiment with the basic EM algorithm. When the modeling assumptions are violated, basic EM leads to inaccurate parameter estimation. Augmented EM attenuates this shortcoming by introducing a weighting factor that downweights the unlabeled data. Cross-validation does not always lead to the best setting of the weighting factor, and other heuristic methods may be preferred. We show that accurate phenotyping models can be trained with only a few hundred labeled (and a large number of unlabeled) examples, potentially providing substantial savings in the amount of manual chart review required.
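The weighting idea in the abstract can be illustrated with a minimal sketch. The abstract does not name the base classifier or the exact update rules, so everything here is an assumption: a multinomial Naive Bayes model, add-alpha smoothing, and the hypothetical names `fit_nb`, `posteriors`, `weighted_em`, and `lam` (the weighting factor). Setting `lam=1.0` corresponds to treating unlabeled examples like labeled ones (basic EM), while `lam=0.0` ignores them and reduces to purely supervised training.

```python
import numpy as np

def fit_nb(X, R, weights, alpha=1.0):
    """M-step: weighted multinomial Naive Bayes estimates (assumed model).
    X: (n, d) feature counts, R: (n, k) class responsibilities, weights: (n,)."""
    W = R * weights[:, None]                    # soft counts scaled per example
    class_mass = W.sum(axis=0) + alpha          # smoothed class totals
    log_prior = np.log(class_mass / class_mass.sum())
    feat = W.T @ X + alpha                      # (k, d) smoothed feature counts
    log_lik = np.log(feat / feat.sum(axis=1, keepdims=True))
    return log_prior, log_lik

def posteriors(X, log_prior, log_lik):
    """E-step: P(class | x) under the current parameters."""
    joint = X @ log_lik.T + log_prior           # (n, k) unnormalized log joint
    joint = joint - joint.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(joint)
    return p / p.sum(axis=1, keepdims=True)

def weighted_em(X_lab, y_lab, X_unl, n_classes=2, lam=0.1, n_iter=20):
    """EM in which each unlabeled example contributes with weight lam."""
    R_lab = np.eye(n_classes)[y_lab]            # labeled responsibilities stay fixed
    params = fit_nb(X_lab, R_lab, np.ones(len(X_lab)))  # start from labeled data only
    X_all = np.vstack([X_lab, X_unl])
    w = np.concatenate([np.ones(len(X_lab)), np.full(len(X_unl), lam)])
    for _ in range(n_iter):
        R_unl = posteriors(X_unl, *params)                       # E-step
        params = fit_nb(X_all, np.vstack([R_lab, R_unl]), w)     # M-step
    return params

if __name__ == "__main__":
    # Synthetic counts purely for illustration; not data from the study.
    rng = np.random.default_rng(0)
    X_lab = rng.poisson(2.0, size=(10, 6)).astype(float)
    y_lab = np.array([0, 1] * 5)
    X_unl = rng.poisson(2.0, size=(200, 6)).astype(float)
    log_prior, log_lik = weighted_em(X_lab, y_lab, X_unl, lam=0.1)
    print(posteriors(X_unl[:3], log_prior, log_lik))
```

In a sketch like this, `lam` would be chosen by cross-validation on the labeled set or by a heuristic, which is the trade-off the abstract refers to.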

Source (PMC): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4765699

Publication Analysis

Top Keywords

semi-supervised learning

phenotyping tasks

manual chart

chart review

unlabeled data

weighting factor

phenotyping

learning phenotyping

tasks supervised

supervised learning

Similar Publications

Comparison of Different Machine Learning Methodologies for Predicting the Non-Specific Treatment Response in Placebo Controlled Major Depressive Disorder Clinical Trials.

Clin Transl Sci

January 2025

Pharmacometrica, La Fouillade, France.

Roberto Gomeni Françoise Bressolle-Gomeni

The placebo effect is a serious confounder in the assessment of treatment effect, to the extent that it has become increasingly difficult to develop antidepressant medications that outperform placebo. The treatment effect in randomized, placebo-controlled trials is usually estimated by the mean baseline-adjusted difference in treatment response between the active and placebo arms and is a function of treatment-specific and non-specific effects. The non-specific treatment effect varies from subject to subject, conditional on the individual propensity to respond to placebo.

A semi-supervised deep neuro-fuzzy iterative learning system for automatic segmentation of hippocampus brain MRI.

Math Biosci Eng

December 2024

Department of Electronics and Communication Engineering, Akshaya College of Engineering and Technology, Coimbatore, Tamil Nadu, India.

M Nisha T Kannan K Sivasankari

The hippocampus is a small yet intricate seahorse-shaped structure located deep within the brain's medial temporal lobe. It is a crucial component of the limbic system, which is responsible for regulating emotions, memory, and spatial navigation. This research focuses on automatic segmentation of the hippocampus from Magnetic Resonance (MR) images of the human head with high accuracy and low false-positive and false-negative rates.

Comparative analysis of regression algorithms for drug response prediction using GDSC dataset.

BMC Res Notes

January 2025

Department of Computer Engineering, Chungbuk National University, Chungdae-ro 1, Cheongju, 28644, Republic of Korea.

Soojung Ha Juho Park Kyuri Jo

Background: Drug response prediction can infer the relationship between an individual's genetic profile and a drug, which can be used to guide the choice of treatment for an individual patient. Drug response has recently been predicted using machine learning. However, high-throughput sequencing data produce thousands of features per patient.

Supervised machine learning statistical models for visual outcome prediction in macular hole surgery: a single-surgeon, standardized surgery study.

Int J Retina Vitreous

January 2025

Department of Retina and Vitreous, Narayana Nethralaya, #121/C, 1st R Block, Chord Road, Rajaji Nagar, Bengaluru, 560010, India.

Kanika Godani Vishma Prabhu Priyanka Gandhi Ayushi Choudhary Shubham Darade

Purpose: To evaluate the predictive accuracy of various machine learning (ML) statistical models in forecasting postoperative visual acuity (VA) outcomes following macular hole (MH) surgery using preoperative optical coherence tomography (OCT) parameters.

Methods: This retrospective study included 158 eyes (151 patients) with full-thickness MHs treated between 2017 and 2023 by the same surgeon and using the same intraoperative surgical technique. Data from electronic medical records and OCT scans were extracted, with OCT-derived qualitative and quantitative MH characteristics recorded.

Classification Prediction of Hydrocephalus After Intercerebral Haemorrhage Based on Machine Learning Approach.

Neuroinformatics

January 2025

Department of Mathematics and Statistics, Changsha University of Science and Technology, Changsha, 410114, China.

Enwen Zhu Zhuojun Zou Jianxian Li Jipan Chen Ao Chen

To construct a clinical classification prediction model for hydrocephalus after intercerebral haemorrhage (ICH) and guide clinical treatment decisions, this paper retrospectively analyses the clinical data of 844 inpatients with ICH admitted to Yueyang People's Hospital from May 2019 to October 2022, of whom 95 developed hydrocephalus after ICH and 749 did not. The following indicators were compared between the two groups of patients: gender, age, Glasgow Coma Scale (GCS) score, whether the amount of bleeding was greater than 30 ml, whether the bleeding broke into the ventricle, modified Graeb score (MGS), modified Rankin Scale (MRS) score, whether surgery was performed, red blood cells, white blood cells, and platelets. After variable screening, six variables were selected: GCS score, MGS, MRS score, whether the bleeding volume was greater than 30 ml, whether the bleeding broke into the ventricle, and whether surgery was performed. These variables were modelled and analysed using a logistic regression model and a support vector machine model.
