The imputation of missing values (IMV) in tabular electronic health records data is crucial to enable machine learning for patient-specific predictive modeling. While IMV methods have long been developed in biostatistics and, more recently, in machine learning, deep learning-based solutions have shown limited success in learning tabular data. This paper proposes a novel attention-based missing value imputation framework that learns to reconstruct data with missing values by leveraging between-feature (self-attention) or between-sample attention. We adopt data manipulation methods used in contrastive learning to improve the generalization of the trained imputation model. The proposed self-attention imputation method outperforms state-of-the-art statistical and machine learning-based (decision-tree) imputation methods, reducing the normalized root mean squared error by 18.4% to 74.7% on five tabular data sets and by 52.6% to 82.6% on two electronic health records data sets. The proposed attention-based missing value imputation method shows superior performance across a wide range of missingness (10% to 50%) when the values are missing completely at random.
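The evaluation setting described in the abstract (values removed completely at random at rates from 10% to 50%, then scored by normalized root mean squared error) can be illustrated with a minimal sketch. It is not the authors' code: the helper names `mcar_mask`, `mean_impute`, and `nrmse` are hypothetical, the column-mean imputer is only a stand-in for any imputation method, the data are synthetic, and dividing the RMSE by the standard deviation of the true values at the masked cells is an assumed normalization, since the abstract does not state which one is used.

```python
import numpy as np

def mcar_mask(shape, missing_rate, rng):
    """Return a boolean mask that is True where a value is removed.

    Each cell is dropped independently with probability `missing_rate`,
    which is what "missing completely at random" (MCAR) means.
    """
    return rng.random(shape) < missing_rate

def mean_impute(x_missing):
    """Stand-in imputer: fill each NaN with the observed mean of its column."""
    col_means = np.nanmean(x_missing, axis=0)
    return np.where(np.isnan(x_missing), col_means, x_missing)

def nrmse(x_true, x_imputed, mask):
    """RMSE over the masked cells only, divided by the standard deviation
    of the true values at those cells (an assumed normalization)."""
    err = x_true[mask] - x_imputed[mask]
    return np.sqrt(np.mean(err ** 2)) / np.std(x_true[mask])

# Synthetic table: 500 samples, 8 numeric features (illustrative data only).
rng = np.random.default_rng(0)
x_true = rng.normal(size=(500, 8))

for rate in (0.1, 0.3, 0.5):
    mask = mcar_mask(x_true.shape, rate, rng)
    x_missing = np.where(mask, np.nan, x_true)  # hide the masked cells
    x_hat = mean_impute(x_missing)              # swap in any imputer here
    print(f"missing rate {rate:.0%}: NRMSE = {nrmse(x_true, x_hat, mask):.3f}")
```

Scoring only the masked cells keeps the metric from being diluted by observed values that any imputer reproduces exactly. A relative reduction such as those quoted above would then be computed as one minus the ratio of the proposed method's NRMSE to the baseline's.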
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11463999 | PMC |
http://dx.doi.org/10.1109/ichi61247.2024.00030 | DOI Listing |
Database (Oxford)
January 2025
College of Big Data, Yunnan Agricultural University, 452 Fengyuan Road, Panlong District, Kunming, Yunnan 650201, China.
Lanping black-boned (LPBB) sheep are a unique and rare ruminant species, characterized by black pigmentation in the skin and internal organs. Thus far, LPBB sheep are the only known animals with heritable melanin characteristics besides the black-boned chicken, and the only mammals known to contain a large amount of melanin in the body. LPBB sheep have therefore attracted substantial research attention due to their potential contribution to medicine.
In cybersecurity, anomaly detection in tabular data is essential for ensuring information security. While traditional machine learning and deep learning methods have shown some success, they continue to face significant challenges in terms of generalization. To address these limitations, this paper presents an innovative method for tabular data anomaly detection based on large language models, called "Tabular Anomaly Detection via Guided Prompts" (TAD-GP).
Pulm Ther
January 2025
US Medical Affairs, GSK, ATC Fowler Building, 410 Blackwell Street, Durham, NC, 27701, USA.
Introduction: Escalation to single- or multiple-inhaler triple therapy (SITT; MITT) is a recommended option for patients with asthma who remain uncontrolled by medium-dose inhaled corticosteroid/long-acting β-agonist; however, characterization of elderly users of triple therapy is limited. This real-world cohort study describes demographics and clinical characteristics of elderly patients with asthma with and without comorbid chronic obstructive pulmonary disease (COPD) who are new users of triple therapy, and asthma treatment patterns preceding triple therapy initiation.
Methods: This retrospective cohort study used administrative claims data from the Optum Clinformatics Data Mart database.
JMIR Res Protoc
January 2025
Department of Environmental and Prevention Sciences, University of Ferrara, Ferrara, Italy.
Background: Workers may be exposed to different infectious agents, putting them at risk of developing occupational diseases. This can occur in many ways, through deliberate use of specific microorganisms or through potential exposure from close contact with biological material. Infection prevention and control measures against biohazards can reduce the risk of infection among workers.
Proc (IEEE Conf Multimed Inf Process Retr)
August 2024
Department of Computer Science, University of Kentucky, Lexington, KY, USA.
Despite the prevalence of images and texts in machine learning, tabular data remains widely used across various domains. Existing deep learning models, such as convolutional neural networks and transformers, perform well but demand extensive preprocessing and tuning, limiting accessibility and scalability. This work introduces an innovative approach based on a structured state-space model (SSM), MambaTab, for tabular data.