Publications by authors named "Yatish Jain"

Genomic information is increasingly used to inform medical treatments and manage future disease risks. However, any personal and societal gains must be carefully balanced against the risk to individuals contributing their genomic data. Expanding our understanding of actionable genomic insights requires researchers to access large global datasets to capture the complexity of genomic contribution to diseases.

Genetic data is limited, and generating new datasets is often an expensive, time-consuming process that involves many moving parts to genotype and phenotype individuals. While sharing data is beneficial for quality control and software development, privacy and security are of utmost importance. Generating synthetic data is a practical way to mitigate the cost, time and sensitivities that hamper developers and researchers in producing and validating novel biotechnological solutions to data-intensive problems.

With the advancement of genomic engineering and genetic modification techniques, the uptake of computational tools to design guide RNA has increased drastically. Searching for genomic targets to design guides with maximum on-target activity (efficiency) and minimum off-target activity (specificity) is now an essential part of genome editing experiments. Today, a variety of tools allow users to search for genomic targets and to customize the search parameters to better suit their experiments.

New SARS-CoV-2 variants emerge as part of the virus's adaptation to the human host. Health organizations monitor newly emerging variants with a suspected impact on disease or vaccination efficacy as Variants Being Monitored (VBM), like Delta and Omicron. Genetic changes (SNVs) relative to the Wuhan variant characterize VBMs, with current emphasis on the spike protein and lineage markers.

Next-generation sequencing (NGS) is a powerful tool for detecting and investigating viral pathogens; however, analysis and management of the enormous amounts of data generated by these technologies remain a challenge. Here, we present VPipe (the Viral NGS Analysis Pipeline and Data Management System), an automated bioinformatics pipeline optimized for whole-genome assembly of viral sequences and identification of diverse species. VPipe automates the data quality control, assembly, and contig identification steps typically performed when analyzing NGS data.

Detecting viral and vector integration events is a key step when investigating interactions between viral and host genomes. This is relevant in several fields, including virology, cancer research and gene therapy. For example, investigating integrations of wild-type viruses such as human papillomavirus and hepatitis B virus has proven to be crucial for understanding the role of these integrations in cancer.

Precise genomic modification using prime editing (PE) holds enormous potential for research and clinical applications. In this study, we generated all-in-one prime editing (PEA1) constructs that carry all the components required for PE, along with a selection marker. We tested these constructs (with selection) in HEK293T, K562, HeLa and mouse embryonic stem (ES) cells.

Complex genetic diseases may be modulated by a large number of epistatic interactions affecting a polygenic phenotype. Identifying these interactions is difficult due to computational complexity, especially in the case of higher-order interactions where more than two genomic variants are involved. In this paper, we present BitEpi, a fast and accurate method to test all possible combinations of up to four bi-allelic variants (i.

Being able to link clinical outcomes to SARS-CoV-2 virus strains is a critical component of understanding COVID-19. Here, we discuss how current processes hamper the sustainable data collection needed for meaningful analysis and insights. Following the 'Fast Healthcare Interoperability Resources' (FHIR) implementation guide, we introduce an ontology-based standard questionnaire to overcome these shortcomings and describe patient 'journeys' in coordination with the World Health Organization's recommendations.

Background: Many traits and diseases are thought to be driven by more than one gene (polygenic). Polygenic risk scores (PRS) hence expand on genome-wide association studies by taking multiple genes into account when risk models are built. However, PRS consider only the additive effects of individual genes, not epistatic interactions or the combination of individual and interacting drivers.
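To make the "additive effect" in this abstract concrete, the following is a minimal Python sketch of how an additive score is commonly computed: each variant contributes its weight multiplied by the number of effect alleles carried, and the contributions are summed. It is an illustration under stated assumptions, not the method of the paper: the function name `additive_prs`, the variant IDs and the weights are all hypothetical, and treating a missing genotype as zero effect alleles is a simplifying choice.

```python
from typing import Mapping


def additive_prs(dosages: Mapping[str, int], weights: Mapping[str, float]) -> float:
    """Sum of (effect-allele dosage x per-variant weight) over all weighted variants.

    dosages: effect-allele count per variant (0, 1 or 2) for one individual.
    weights: per-variant effect sizes (hypothetical values in this sketch).
    """
    score = 0.0
    for variant, weight in weights.items():
        # Assumption: a variant missing from the genotype counts as dosage 0.
        dosage = dosages.get(variant, 0)
        if dosage not in (0, 1, 2):
            raise ValueError(f"dosage for {variant} must be 0, 1 or 2, got {dosage}")
        score += weight * dosage
    return score


# Hypothetical variant IDs and weights, for illustration only.
weights = {"variant_a": 0.5, "variant_b": -0.25, "variant_c": 1.0}
person = {"variant_a": 2, "variant_b": 1, "variant_c": 0}

print(additive_prs(person, weights))  # 0.5*2 + (-0.25)*1 + 1.0*0 = 0.75
```

Every term in the sum depends on a single variant. There is no term that depends on two or more variants jointly, which is why a purely additive score of this kind cannot capture epistatic interactions.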
