The central hallmark of Parkinson's disease pathology is the aggregation of the α-synuclein protein, which, in its healthy form, is associated with lipid membranes. Purified monomeric α-synuclein is relatively stable, but its aggregation can be triggered by the presence of lipid vesicles. Despite the central importance of lipids in α-synuclein aggregation, their detailed mechanistic role in this process has not been established to date.
Reducing premature mortality associated with age-related chronic diseases, such as cancer and cardiovascular disease, is an urgent priority. We report early results using genomics in combination with advanced imaging and other clinical testing to proactively screen for age-related chronic disease risk among adults. We enrolled active, symptom-free adults in a study of screening for age-related chronic diseases associated with premature mortality.
Proc Natl Acad Sci U S A
September 2017
Prediction of human physical traits and demographic information from genomic data challenges privacy and data deidentification in personalized medicine. To explore the current capabilities of phenotype-based genomic identification, we applied whole-genome sequencing, detailed phenotyping, and statistical modeling to predict biometric traits in a cohort of 1,061 participants of diverse ancestry. Individually, for a large fraction of the traits, their predictive accuracy beyond ancestry and demographic information is limited.
Proc Natl Acad Sci U S A
July 2017
The HLA gene complex on human chromosome 6 is one of the most polymorphic regions in the human genome and contributes in large part to the diversity of the immune system. Accurate typing of HLA genes with short-read sequencing data has historically been difficult due to the sequence similarity between the polymorphic alleles. Here, we introduce an algorithm, xHLA, that iteratively refines the mapping results at the amino acid level to achieve 99-100% four-digit typing accuracy for both class I and II HLA genes, taking only ∼3 min to process a 30× whole-genome BAM file on a desktop computer.
In this work, we describe the computational ("in silico") mode-of-action analysis of CNS-active drugs, which takes into account both multiple simultaneous hypotheses and sets of protein targets for each mode of action, and which was followed by successful prospective in vitro and in vivo validation. Using sleep-related phenotypic readouts describing both efficacy and side effects for 491 compounds tested in rat, we defined an "optimal" (desirable) sleeping pattern. Compounds were subjected to in silico target prediction (which was experimentally confirmed for 21 out of 28 cases), followed by the utilization of decision trees for deriving polypharmacological bioactivity profiles.
Genetic factors modifying the blood metabolome have been investigated through genome-wide association studies (GWAS) of common genetic variants and through exome sequencing. We conducted a whole-genome sequencing study of common, low-frequency and rare variants to associate genetic variations with blood metabolite levels using comprehensive metabolite profiling in 1,960 adults. We focused the analysis on 644 metabolites with consistent levels across three longitudinal data collections.
We report on the sequencing of 10,545 human genomes at 30×-40× coverage with an emphasis on quality metrics and novel variant and sequence discovery. We find that 84% of an individual human genome can be sequenced confidently. This high-confidence region includes 91.
The increase of publicly available bioactivity data has led to the extensive development and usage of in silico bioactivity prediction algorithms. A particularly popular approach for such analyses is the multiclass Naïve Bayes, whose output is commonly processed by applying empirically-derived likelihood score thresholds. In this work, we describe a systematic way for deriving score cut-offs on a per-protein target basis and compare their performance with global thresholds on a large scale using both 5-fold cross-validation (ChEMBL 14, 189k ligand-protein pairs over 477 protein targets) and external validation (WOMBAT, 63k pairs, 421 targets).
Background: An in silico mechanism-of-action analysis protocol was developed, comprising molecule bioactivity profiling, annotation of predicted targets with pathways and calculation of enrichment factors to highlight targets and pathways more likely to be implicated in the studied phenotype.
Results: The method was applied to a cytotoxicity phenotypic endpoint, with enriched targets/pathways found to be statistically significant when compared with 100 random datasets. Application to a smaller apoptotic set (10 molecules) did not yield statistically relevant results, suggesting that the protocol requires modification, such as analysis of the most frequently predicted targets/annotated pathways.
Integrating gene expression profiles with certain proteins can improve our understanding of the fundamental mechanisms in protein-ligand binding. This paper spotlights the integration of gene expression data and target prediction scores, providing insight into mechanism of action (MoA). Compounds are clustered based upon the similarity of their predicted protein targets and each cluster is linked to gene sets using Linear Models for Microarray Data.
The simultaneous increase of computational power and the availability of chemical and biological data have contributed to the recent popularity of in silico bioactivity prediction algorithms. Such methods are commonly used to infer the 'Mechanism of Action' of small molecules and they can also be employed in cases where full bioactivity profiles have not been established experimentally. However, protein target predictions by themselves do not necessarily capture information about the effect of a compound on a biological system, and hence merging their output with a systems biology approach can help to better understand the complex network modulation which leads to a particular phenotype.
Neurofibrillary tangles, one of the hallmarks of Alzheimer disease (AD), are composed of paired helical filaments of abnormally hyperphosphorylated tau. The accumulation of these proteinaceous aggregates in AD correlates with synaptic loss and severity of dementia. Identifying the kinases involved in the pathological phosphorylation of tau may identify novel targets for AD.
The ChEMBLSpace graphical explorer enables the identification of compounds from the ChEMBL database that exhibit a desirable polypharmacology profile. This profile can be predefined or created iteratively, and the tool can be extended to other data sources.
A continuing problem in protein-ligand docking is the correct relative ranking of active molecules versus inactives. Using the ChemScore scoring function as implemented in the GOLD docking software, we have investigated the effect of scaling hydrogen bond, metal-ligand, and lipophilic interactions based on the buriedness of the interaction. Buriedness was measured using the receptor density, the number of protein heavy atoms within 8.
The use of structure-based virtual screening to predict small-molecule binding in a target active site is an increasingly popular approach in drug discovery programs. As the number of structures of protein-ligand complexes in public and proprietary databases grows, it is important to incorporate prior structural knowledge of ligand binding into virtual screening experiments. The structural interaction fingerprint (SIFt) approach aims to capture a 1D representation of the interactions between ligand and protein either in complexes of known structure or in docked poses.
Domains within the multienzyme polyketide synthases are linked by noncatalytic sequences of variable length and unknown function. Recently, the crystal structure was reported of a portion of the linker between the acyltransferase (AT) and ketoreductase (KR) domains from module 1 of the erythromycin synthase (6-deoxyerythronolide B synthase), as a pseudodimer with the adjacent KR. On the basis of this structure, the homologous linker region between the dehydratase (DH) and enoyl reductase (ER) domains in fully reducing modules has been proposed to occupy a position on the periphery of the polyketide synthase complex, as in porcine fatty acid synthase.
A procedure for analyzing and classifying publicly available crystal structures has been developed. It has been used to identify high-resolution protein-ligand complexes that can be assessed by reconstructing the electron density for the ligand using the deposited structure factors. The complexes have been clustered according to the protein sequences, and clusters have been discarded if they do not represent proteins thought to be of direct interest to the pharmaceutical or agrochemical industry.
Impressive progress in genome sequencing, protein expression and high-throughput crystallography and NMR has radically transformed the opportunities to use protein three-dimensional structures to accelerate drug discovery, but the quantity and complexity of the data have ensured a central place for informatics. Structural biology and bioinformatics have assisted in lead optimization and target identification where they have well established roles; they can now contribute to lead discovery, exploiting high-throughput methods of structure determination that provide powerful approaches to screening of fragment binding.
DNA-dependent protein kinase (DNA-PK) is part of the eukaryotic DNA double strand break repair pathway and as such is crucial for maintenance of genomic stability, as well as for V(D)J (variable-diversity-joining) recombination. The catalytic subunit of DNA-PK (DNA-PKcs) belongs to the phosphatidylinositol-3 (PI-3) kinase-like kinase (PIKK) superfamily and is comprised of approximately 4100 amino acids. We have used a novel repeat detection method to analyse this enormous protein and have identified two different types of helical repeat motifs in the N-terminal region of the sequence, as well as other previously unreported features in this repeat region.
Cellular life depends upon the preservation and transmission of genetic material. Double-stranded DNA breaks (DSBs) cause catastrophic gene loss in cell division and must be promptly and accurately repaired. In eukaryotes, DSBs may be repaired by non-homologous end-joining (NHEJ), single strand annealing or homologous recombination (HR).