Publications by authors named "Vaibhav Rajan"

Explainable Artificial Intelligence (XAI) techniques generate explanations for predictions from AI models. These explanations can be evaluated for (i) faithfulness to the prediction, i.e.

A Synthetic Lethal (SL) interaction is a functional relationship between two genes or functional entities where the loss of either entity is viable but the loss of both is lethal. Such pairs can be used to develop targeted anticancer therapies with fewer side effects and reduced overtreatment. However, finding clinically relevant SL interactions remains challenging.

Clustering is a fundamental tool for exploratory data analysis, and is ubiquitous across scientific disciplines. Gaussian Mixture Model (GMM) is a popular probabilistic and interpretable model for clustering. In many practical settings, the true data distribution, which is unknown, may be non-Gaussian and may be contaminated by noise or outliers.
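As a point of reference, the standard GMM is typically fitted with expectation-maximization (EM). Below is a minimal 1-D NumPy sketch of vanilla EM, shown for illustration only; the contaminated, non-Gaussian setting described above is exactly where this baseline breaks down.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy 1-D data from two well-separated Gaussian clusters.
x = np.concatenate([rng.normal(0.0, 0.5, 200), rng.normal(5.0, 0.5, 200)])

# Standard EM for a 2-component GMM (illustrative; real data may be
# non-Gaussian or contaminated, which is the setting addressed above).
mu = np.array([x.min(), x.max()])        # spread-out initialisation
sigma = np.array([1.0, 1.0])
weights = np.array([0.5, 0.5])

for _ in range(50):
    # E-step: posterior responsibility of each component for each point.
    dens = weights * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) \
           / (sigma * np.sqrt(2 * np.pi))
    resp = dens / dens.sum(axis=1, keepdims=True)
    # M-step: re-estimate parameters from the responsibilities.
    nk = resp.sum(axis=0)
    mu = (resp * x[:, None]).sum(axis=0) / nk
    sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
    weights = nk / len(x)
```

On clean Gaussian data like this, EM recovers the component means; a few outliers or heavy-tailed clusters can pull the estimates far off, which motivates robust variants.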

Risk models play a crucial role in disease prevention, particularly in intensive care units (ICUs). Diseases often have complex manifestations with heterogeneous subpopulations, or subtypes, that exhibit distinct clinical characteristics. Risk models that explicitly model subtypes have high predictive accuracy and facilitate subtype-specific personalization.

Single-cell data integration methods aim to integrate cells across data batches and modalities. Integration tasks can be categorized as horizontal, vertical, diagonal, or mosaic integration, of which mosaic integration is the most general and challenging case, with few methods developed for it. We propose scMoMaT, a method that can integrate single-cell multi-omics data under the mosaic integration scenario using matrix tri-factorization. During integration, scMoMaT can also uncover cluster-specific bio-markers across modalities.
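Matrix tri-factorization decomposes a data matrix into three factors, X ≈ U S Vᵀ. A minimal sketch, using a truncated SVD as a stand-in for the factors (scMoMaT instead learns such factors jointly across multiple matrices, with its own objective):

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy "cells x features" matrix constructed to have rank 2.
X = rng.random((30, 2)) @ rng.random((2, 20))

# A rank-2 tri-factorization X ~ U @ S @ Vt. Here a truncated SVD supplies
# the three factors; this is only an illustration of the decomposition's
# shape, not the method's actual training procedure.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
r = 2
U, S, Vt = U[:, :r], np.diag(s[:r]), Vt[:r, :]

rel_err = np.linalg.norm(U @ S @ Vt - X) / np.linalg.norm(X)
# rel_err is ~0 here because X was built to have rank exactly 2.
```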

Motivation: In many biomedical studies, there arises the need to integrate data from multiple directly or indirectly related sources. Collective matrix factorization (CMF) and its variants are models designed to collectively learn from arbitrary collections of matrices. The latent factors learnt are rich integrative representations that can be used in downstream tasks, such as clustering or relation prediction with standard machine-learning models.
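To make the collective-factorization idea concrete, here is a small sketch in which two matrices sharing a row entity are factorized with a common row factor, fitted by alternating least squares. This is a simple stand-in for the optimization schemes used by CMF variants; all names and sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, p, r = 40, 25, 15, 3

# Two toy relation matrices sharing the same row entity (e.g. genes):
# X1 could be gene x sample, X2 gene x annotation. Both are exactly rank r.
U0 = rng.random((n, r)); V0 = rng.random((m, r)); W0 = rng.random((p, r))
X1, X2 = U0 @ V0.T, U0 @ W0.T

# Collective factorization: X1 ~ U V^T and X2 ~ U W^T with a SHARED U,
# fitted by alternating least squares (an illustrative scheme).
U = rng.random((n, r))
for _ in range(50):
    V = np.linalg.lstsq(U, X1, rcond=None)[0].T   # solve U V^T ~ X1
    W = np.linalg.lstsq(U, X2, rcond=None)[0].T   # solve U W^T ~ X2
    C = np.vstack([V, W])                          # stacked column factors
    U = np.linalg.lstsq(C, np.hstack([X1, X2]).T, rcond=None)[0].T

err = np.linalg.norm(U @ V.T - X1) + np.linalg.norm(U @ W.T - X2)
```

The shared factor U is the integrative representation: each row embeds one entity using evidence from both matrices, and can feed downstream clustering or relation-prediction models.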

Background: Patient representation learning aims to learn features, also called representations, from input sources automatically, often in an unsupervised manner, for use in predictive models. This obviates the need for cumbersome, time- and resource-intensive manual feature engineering, especially from unstructured data such as text, images, or graphs. Most previous techniques have used neural network-based autoencoders to learn patient representations, primarily from clinical notes in electronic medical records (EMRs).

Study of pairwise genetic interactions, such as mutually exclusive mutations, has led to understanding of underlying mechanisms in cancer. Investigation of various combinatorial motifs within networks of such interactions can lead to deeper insights into the mutational landscape of cancer and inform therapy development. One such motif, called the Between-Pathway Model (BPM), represents redundant or compensatory pathways that can be therapeutically exploited.

Background: Adverse drug events (ADEs) are unintended side effects of drugs that cause substantial clinical and economic burdens globally. Not all ADEs are discovered during clinical trials; therefore, postmarketing surveillance, called pharmacovigilance, is routinely conducted to find unknown ADEs. A wealth of information, which facilitates ADE discovery, lies in the growing body of biomedical literature.

Motivation: The study of the evolutionary history of biological networks enables deep functional understanding of various bio-molecular processes. Network growth models, such as the Duplication-Mutation with Complementarity (DMC) model, provide a principled approach to characterizing the evolution of protein-protein interactions (PPIs) based on duplication and divergence. Current methods for model-based ancestral network reconstruction primarily use greedy heuristics and yield sub-optimal solutions.
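A minimal sketch of one DMC growth step, under the usual formulation: duplicate a random node, then for each inherited edge delete it from either the parent or the copy with probability q_mod, and connect the pair with probability q_con. Parameter names and the seed network are illustrative.

```python
import random

def dmc_step(adj, q_mod=0.4, q_con=0.1, rng=random):
    """One Duplication-Mutation with Complementarity (DMC) growth step
    on an undirected graph stored as a dict of neighbour sets."""
    u = rng.choice(sorted(adj))          # node to duplicate
    v = max(adj) + 1                     # fresh node id for the copy
    adj[v] = set(adj[u])                 # copy inherits all neighbours
    for w in adj[v]:
        adj[w].add(v)
    for w in list(adj[u]):
        if rng.random() < q_mod:         # mutation: drop edge on one side
            loser = rng.choice([u, v])
            adj[loser].discard(w)
            adj[w].discard(loser)
    if rng.random() < q_con:             # complementarity: link the pair
        adj[u].add(v); adj[v].add(u)
    return adj

rng = random.Random(0)
adj = {0: {1}, 1: {0}}                   # seed network: a single edge
for _ in range(20):
    dmc_step(adj, rng=rng)
n_nodes = len(adj)                       # 2 seed nodes + 20 duplications
```

Running such forward simulations is the easy direction; the reconstruction problem discussed above works backwards, inferring ancestral networks from the present-day PPI network.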

Motivation: Metagenomics studies have provided key insights into the composition and structure of microbial communities found in different environments. Among the techniques used to analyse metagenomic data, binning is considered a crucial step to characterize the different species of micro-organisms present. The use of short-read data in most binning tools poses several limitations, such as insufficient species-specific signal, and the emergence of long-read sequencing technologies offers us opportunities to surmount them.

Motivation: A synthetic lethal (SL) interaction is a relationship between two functional entities where the loss of either one of the entities is viable but the loss of both entities is lethal to the cell. Such pairs can be used as drug targets in targeted anticancer therapies, and so, many methods have been developed to identify potential candidate SL pairs. However, these methods use only a subset of available data from multiple platforms, at genomic, epigenomic and transcriptomic levels; and hence are limited in their ability to learn from complex associations in heterogeneous data sources.

Motivation: The identification of sub-populations of patients with similar characteristics, called patient subtyping, is important for realizing the goals of precision medicine. Accurate subtyping is crucial for tailoring therapeutic strategies that can potentially lead to reduced mortality and morbidity. Model-based clustering, such as Gaussian mixture models, provides a principled and interpretable methodology that is widely used to identify subtypes.

Background: Fitness devices have spurred the development of apps that aim to motivate users, through interventions, to increase their physical activity (PA). Personalization in the interventions is essential as the target users are diverse with respect to their activity levels, requirements, preferences, and behavior.

Objective: This review aimed to (1) identify different kinds of personalization in interventions for promoting PA among any type of user group, (2) identify user models used for providing personalization, and (3) identify gaps in the current literature and suggest future research directions.

An Acute Hypotensive Episode (AHE) is the sudden onset of a sustained period of low blood pressure and is among the most critical conditions in Intensive Care Units (ICUs). Without timely medical care, it can lead to irreversible organ damage and death. By identifying patients at risk of AHE early, adequate medical intervention can save lives and improve patient outcomes.

Clinical time series, comprising repeated clinical measurements, provide valuable information about the trajectory of a patient's condition. Linear dynamical systems (LDS) are used extensively in science and engineering for modeling time series data. The observation and state variables in an LDS are assumed to be uniformly sampled in time at a fixed sampling rate.
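A minimal simulation of such an LDS with a fixed sampling rate (the assumption that irregularly sampled clinical series violate); the matrices here are toy values, not fitted parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
# A toy linear dynamical system: the latent state x evolves linearly,
# observations y are a noisy linear readout, one sample per time step.
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])              # state transition (stable)
C = np.array([[1.0, 0.0]])              # observation matrix

T = 100
x = np.zeros(2)
ys = []
for _ in range(T):
    x = A @ x + rng.normal(0, 0.1, size=2)   # state noise
    y = C @ x + rng.normal(0, 0.05, size=1)  # observation noise
    ys.append(y[0])
ys = np.array(ys)                        # uniformly sampled trajectory
```

In this standard formulation every step advances time by the same amount; handling measurements taken at arbitrary, irregular times requires modifying the model rather than the simulation loop.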

Stroke is a major cause of mortality and long-term disability in the world. Predictive outcome models in stroke are valuable for personalized treatment, rehabilitation planning and in controlled clinical trials. We design a new multi-class classification model to predict outcome in the short-term, the putative therapeutic window for several treatments.

Background: Evolution of cancer cells is characterized by large-scale and rapid changes in the chromosomal landscape. The fluorescence in situ hybridization (FISH) technique provides a way to measure the copy numbers of preselected genes in a group of cells and has been found to be a reliable source of data to model the evolution of tumor cells. Chowdhury et al.

Postoperative Acute Respiratory Failure (ARF) is a serious complication in critical care, affecting patient morbidity and mortality. In this paper we investigate a novel approach to predicting ARF in critically ill patients. We study the use of two disparate sources of information: semi-structured text contained in nursing notes and investigative reports, which are regularly recorded, and the respiration rate, a physiological signal that is continuously monitored during a patient's ICU stay.

Inaccurate inference of positional homologies in multiple sequence alignments and systematic errors introduced by alignment heuristics obfuscate phylogenetic inference. Alignment masking, the elimination of phylogenetically uninformative or misleading sites from an alignment before phylogenetic analysis, is a common practice in phylogenetic analysis. Although masking is often done manually, automated methods are necessary to handle the much larger data sets being prepared today.
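One simple automated masking rule, shown only as an illustration (not the method evaluated in the article): drop alignment columns whose gap fraction exceeds a threshold.

```python
# Illustrative alignment masking: remove columns of a multiple sequence
# alignment in which more than max_gap_frac of the sequences have a gap.
def mask_alignment(seqs, max_gap_frac=0.5):
    n_cols = len(seqs[0])
    keep = [i for i in range(n_cols)
            if sum(s[i] == '-' for s in seqs) / len(seqs) <= max_gap_frac]
    return [''.join(s[i] for i in keep) for s in seqs]

aln = ["AC-GT",
       "AC-GA",
       "ACTG-"]
masked = mask_alignment(aln)   # column 2 (gaps in 2 of 3 rows) is removed
```

Real masking tools use richer criteria (conservation scores, site-wise likelihoods) than a bare gap threshold, but the input/output shape is the same: an alignment in, a column-filtered alignment out.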

TIBA is a tool to reconstruct phylogenetic trees from rearrangement data that consist of ordered lists of synteny blocks (or genes), where each synteny block is shared with all of its homologues in the input genomes. The evolution of these synteny blocks, through rearrangement operations, is modelled by the uniform Double-Cut-and-Join model. Using a true distance estimate under this model and simple distance-based methods, TIBA reconstructs a phylogeny of the input genomes.

Background: Large-scale sequencing of genomes has enabled the inference of phylogenies based on the evolution of genomic architecture, under such events as rearrangements, duplications, and losses. Many evolutionary models and associated algorithms have been designed over the last few years and have found use in comparative genomics and phylogenetic inference. However, the assessment of phylogenies built from such data has not been properly addressed to date.

Comparing two or more phylogenetic trees is a fundamental task in computational biology. The simplest outcome of such a comparison is a pairwise measure of similarity, dissimilarity, or distance. A large number of such measures have been proposed, but so far all suffer from problems ranging from computational cost to lack of robustness; many can be shown to behave unexpectedly under certain plausible inputs.
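One widely used pairwise measure is the Robinson-Foulds distance: the number of bipartitions (splits induced by internal edges) present in one tree but not the other. A small sketch on trees encoded as nested tuples of leaf names (an illustrative encoding, not a standard library API):

```python
# Robinson-Foulds distance between two trees over the same leaf set.
def leaf_set(tree):
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(c) for c in tree))

def splits(tree):
    """Non-trivial bipartitions of the tree, each stored canonically
    as the lexicographically smaller side of the split."""
    all_leaves = leaf_set(tree)
    found = set()
    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(walk(c) for c in node))
        if 1 < len(clade) < len(all_leaves) - 1:
            other = all_leaves - clade
            found.add(min(clade, other, key=sorted))
        return clade
    walk(tree)
    return found

def rf_distance(t1, t2):
    return len(splits(t1) ^ splits(t2))   # symmetric difference size

t1 = ((("A", "B"), "C"), ("D", "E"))
t2 = ((("A", "C"), "B"), ("D", "E"))
d = rf_distance(t1, t2)   # the trees disagree on one split each, so d = 2
```

RF illustrates the robustness complaint above: moving a single leaf can change many bipartitions at once, so small topological perturbations can produce large distance jumps.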

The study of genomic inversions (or reversals) has been a mainstay of computational genomics for nearly 20 years. After the initial breakthrough of Hannenhalli and Pevzner, who gave the first polynomial-time algorithm for sorting signed permutations by inversions, improved algorithms have been designed, culminating in an optimal linear-time algorithm for computing the inversion distance and a subquadratic algorithm for producing a shortest sequence of inversions (also known as sorting by inversions). Remaining open was the question of whether sorting by inversions could be done in O(n log n) time.
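To ground the terminology: an inversion on a signed permutation reverses a segment and flips the signs within it, and sorting by inversions asks for a minimum-length sequence of such operations reaching the identity. A tiny helper, illustrative only; it applies one inversion and does not compute the optimal sequence that the algorithms above produce.

```python
# Apply the inversion of the segment perm[i..j] to a signed permutation:
# the segment is reversed and every element in it changes sign.
def invert(perm, i, j):
    return perm[:i] + [-x for x in reversed(perm[i:j + 1])] + perm[j + 1:]

p = [-2, -1, 3]
p = invert(p, 0, 1)       # a single inversion sorts this permutation
# p is now the identity permutation [1, 2, 3]
```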

Background: The rapidly increasing availability of whole-genome sequences has enabled the study of whole-genome evolution. Evolutionary mechanisms based on genome rearrangements have attracted much attention and given rise to many models; somewhat independently, the mechanisms of gene duplication and loss have seen much work. However, the two are not independent and thus require a unified treatment, which remains missing to date.
