Motivation: Large-scale genomic sequencing is now widely used to address questions in diverse realms such as biological function, human disease, evolution, ecosystems, and agriculture. Given the quantity and diversity of these data, a robust and scalable solution for data handling and analysis is needed.

Results: We present interactive analytics using a cloud-based columnar database built on Dremel to perform information compression, comprehensive quality control, and biological information retrieval on large volumes of genomic data. We demonstrate that such Big Data computing paradigms can provide orders-of-magnitude faster turnaround for common genomic analyses, transforming long-running batch jobs submitted via a Linux shell into questions that can be asked from a web browser in seconds. Using this method, we assessed a study population of 475 deeply sequenced human genomes for genomic call rate, genotype and allele frequency distributions, variant density across the genome, and pharmacogenomic information.
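To illustrate the kind of question such a query-based workflow answers, the following is a minimal sketch that computes per-variant call rate and alternate allele frequency with a single SQL aggregation. It is not the authors' code or schema: it runs against an in-memory SQLite database rather than BigQuery, and the table name `calls`, its columns, and the toy rows are all hypothetical.

```python
import sqlite3

# Hypothetical toy data: one row per (variant, sample) genotype call.
# allele1/allele2 are 0 (reference), 1 (alternate), or None (no call).
ROWS = [
    ("chr1:1000", "S1", 0, 1),
    ("chr1:1000", "S2", 1, 1),
    ("chr1:1000", "S3", None, None),
    ("chr1:2000", "S1", 0, 0),
    ("chr1:2000", "S2", 0, 1),
    ("chr1:2000", "S3", 0, 0),
]

# Assumed schema, chosen for illustration only.
QUERY = """
SELECT
  variant_id,
  -- call rate: fraction of samples with a non-missing genotype
  AVG(CASE WHEN allele1 IS NOT NULL AND allele2 IS NOT NULL
           THEN 1.0 ELSE 0.0 END) AS call_rate,
  -- alternate allele frequency: alt alleles / called alleles
  SUM(allele1 + allele2) * 1.0 / (2 * COUNT(allele1)) AS alt_allele_freq
FROM calls
GROUP BY variant_id
ORDER BY variant_id
"""


def summarize(rows):
    """Load genotype rows into SQLite and return per-variant summaries."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE calls ("
            "variant_id TEXT, sample_id TEXT, allele1 INTEGER, allele2 INTEGER)"
        )
        conn.executemany("INSERT INTO calls VALUES (?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for variant_id, call_rate, alt_freq in summarize(ROWS):
        print(variant_id, call_rate, alt_freq)
```

The sketch assumes diploid, biallelic calls in which both alleles are either present or missing together. A production query over a real variant table would need to handle multiallelic sites and the storage layout of the actual dataset.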

Availability and Implementation: Our analysis framework is implemented on Google Cloud Platform using BigQuery. Code is available at https://github.com/StanfordBioinformatics/mvp_aaa_codelabs.

Contact: cuiping@stanford.edu or ptsao@stanford.edu.

Supplementary Information: Supplementary data are available at Bioinformatics online.


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5860318
DOI: http://dx.doi.org/10.1093/bioinformatics/btx468
