Publications by Tarak N Nandi

Quantized multi-task learning for context-specific representations of gene network dynamics.

Han Chen, Madhavan S Venkatesh, Javier Gómez Ortega, Siddharth V Mahesh, Tarak N Nandi

bioRxiv

August 2024

While often represented as static entities, gene networks are highly context-dependent. Here, we developed a multi-task learning strategy to yield context-specific representations of gene network dynamics. We assembled a corpus comprising ~103 million human single-cell transcriptomes from a broad range of tissues and diseases and performed a two-stage pretraining, first with non-malignant cells to generate a foundational model and then with continual learning on cancer cells to tune the model to the cancer domain.

Diversity and scale: Genetic architecture of 2068 traits in the VA Million Veteran Program.

Anurag Verma, Jennifer E Huffman, Alex Rodriguez, Mitchell Conery, Molei Liu, Tarak Nath Nandi

Science

July 2024

Article Synopsis

Human genetic studies often lack diversity, which limits understanding of disease causes and health disparities.
The Department of Veterans Affairs Million Veteran Program analyzed data from a diverse group of 635,969 veterans, revealing 13,672 genomic risk loci, with significant findings particularly from non-European populations.
The research identified causal variants across 613 traits, showing that genetic similarities exist across populations and emphasizing the importance of including underrepresented groups in genetic research.

Accelerating Genome- and Phenome-Wide Association Studies using GPUs - A case study using data from the Million Veteran Program.

Alex Rodriguez, Youngdae Kim, Tarak Nath Nandi, Karl Keat, Rachit Kumar

bioRxiv

May 2024

The expansion of biobanks has significantly propelled genomic discoveries, yet the sheer scale of data within these repositories poses formidable computational hurdles, particularly in handling the extensive matrix operations required by prevailing statistical frameworks. In this work, we introduce computational optimizations to the SAIGE (Scalable and Accurate Implementation of Generalized Mixed Model) algorithm, notably employing a GPU-based distributed computing approach to tackle these challenges. We applied these optimizations to conduct a large-scale genome-wide association study (GWAS) across 2,068 phenotypes derived from electronic health records of 635,969 diverse participants from the Veterans Affairs (VA) Million Veteran Program (MVP).

Generation of synthetic whole-slide image tiles of tumours from RNA-sequencing data via cascaded diffusion models.

Francisco Carrillo-Perez, Marija Pizurica, Yuanning Zheng, Tarak Nath Nandi, Ravi Madduri

Nat Biomed Eng

March 2024

Training machine-learning models with synthetically generated data can alleviate the problem of data scarcity when acquiring diverse and sufficiently large datasets is costly and challenging. Here we show that cascaded diffusion models can be used to synthesize realistic whole-slide image tiles from latent representations of RNA-sequencing data from human tumours. Alterations in gene expression affected the composition of cell types in the generated synthetic image tiles, which accurately preserved the distribution of cell types and maintained the cell fraction observed in bulk RNA-sequencing data, as we show for lung adenocarcinoma, kidney renal papillary cell carcinoma, cervical squamous cell carcinoma, colon adenocarcinoma and glioblastoma.

Diversity and Scale: Genetic Architecture of 2,068 Traits in the VA Million Veteran Program.

Anurag Verma, Jennifer E Huffman, Alex Rodriguez, Mitchell Conery, Molei Liu, Tarak Nath Nandi

medRxiv

June 2023

Genome-wide association studies (GWAS) have underrepresented individuals from non-European populations, impeding progress in characterizing the genetic architecture and consequences of health and disease traits. To address this, we present a population-stratified phenome-wide GWAS followed by a multi-population meta-analysis for 2,068 traits derived from electronic health records of 635,969 participants in the Million Veteran Program (MVP), a longitudinal cohort study of diverse U.S.

RNA-to-image multi-cancer synthesis using cascaded diffusion models.

Francisco Carrillo-Perez, Marija Pizurica, Yuanning Zheng, Tarak Nath Nandi, Ravi Madduri

bioRxiv

July 2023

Data scarcity presents a significant obstacle in the field of biomedicine, where acquiring diverse and sufficient datasets can be costly and challenging. Synthetic data generation offers a potential solution to this problem by expanding dataset sizes, thereby enabling the training of more robust and generalizable machine learning models. Although previous studies have explored synthetic data generation for cancer diagnosis, they have predominantly focused on single modality settings, such as whole-slide image tiles or RNA-Seq data.

Non-steady wind turbine response to daytime atmospheric turbulence.

Tarak N Nandi, Andreas Herrig, James G Brasseur

Philos Trans A Math Phys Eng Sci

April 2017

Relevant to drivetrain bearing fatigue failures, we analyse non-steady wind turbine responses from interactions between energy-dominant daytime atmospheric turbulence eddies and the rotating blades of a GE 1.5 MW wind turbine using a unique dataset from a GE field experiment and computer simulation. Time-resolved local velocity data were collected at the leading and trailing edges of an instrumented blade together with generator power, revolutions per minute, pitch and yaw.
