A novel approach is developed to address the challenge of annotating with phenotypic effects those exome variants for which relevant empirical data are lacking or minimal. The predictive annotation method is implemented as a stacked ensemble of supervised base-learners, including distributed random forest and gradient boosting machines. Ensemble models were trained and cross-validated on evidence-based categorical variant effect annotations from the ClinVar database, and were applied to 84 million non-synonymous single nucleotide variants (SNVs).
Background: Adoptive immunotherapy offers great potential for treating many types of cancer but its clinical application is hampered by cross-reactive T cell responses in healthy human tissues, representing serious safety risks for patients. We previously developed a computational tool called Expitope for assessing cross-reactivity (CR) of antigens based on tissue-specific gene expression. However, transcript abundance only indirectly indicates protein expression.
Motivation: Cross-reactivity (CR) or invocation of autoimmune side effects in various tissues has important safety implications in adoptive immunotherapy directed against selected antigens. The ability to predict CR (on-target and off-target toxicities) may help in the early selection of safer therapeutically relevant target antigens.
Results: We developed a methodology for the calculation of quantitative CR for any defined peptide epitope.
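A quantitative CR score of this kind could be sketched as follows. This is an illustration only, not the published Expitope method: the Hamming-distance matching rule, the scoring rule (summing expression over proteins that contain a near match), the function names, and the toy proteome and expression values are all assumptions made for the example.

```python
from typing import Dict, List


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def near_matches(epitope: str, protein: str, max_mismatches: int) -> List[str]:
    """All windows of the protein within max_mismatches of the epitope."""
    k = len(epitope)
    return [
        protein[i:i + k]
        for i in range(len(protein) - k + 1)
        if hamming(epitope, protein[i:i + k]) <= max_mismatches
    ]


def cross_reactivity(
    epitope: str,
    proteome: Dict[str, str],
    expression: Dict[str, Dict[str, float]],
    max_mismatches: int = 1,
) -> Dict[str, float]:
    """Per-tissue score: summed expression of proteins with a near match.

    Assumed scoring rule for illustration; each protein counts once.
    """
    scores: Dict[str, float] = {}
    for name, sequence in proteome.items():
        if not near_matches(epitope, sequence, max_mismatches):
            continue
        for tissue, level in expression.get(name, {}).items():
            scores[tissue] = scores.get(tissue, 0.0) + level
    return scores


# Hypothetical toy data (not real sequences or expression levels).
proteome = {
    "protA": "MKTAYIAKQR",    # contains the epitope exactly
    "protB": "GGSLYIAKQWPP",  # contains a one-mismatch variant
    "protC": "MSDNELVVVV",    # unrelated
}
expression = {
    "protA": {"heart": 5.0, "liver": 1.0},
    "protB": {"heart": 2.0},
    "protC": {"liver": 9.0},
}

scores = cross_reactivity("AYIAKQ", proteome, expression, max_mismatches=1)
exact_only = cross_reactivity("AYIAKQ", proteome, expression, max_mismatches=0)
```

Comparing `scores` with `exact_only` shows how tolerating mismatches widens the set of tissues and proteins that contribute to the score.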
BMC Bioinformatics
February 2014
Background: Simple peak-picking algorithms, such as those based on lineshape fitting, perform well when peaks are completely resolved in multidimensional NMR spectra, but often produce wrong intensities and frequencies for overlapping peak clusters. For example, NOESY-type spectra have considerable overlaps leading to significant peak-picking intensity errors, which can result in erroneous structural restraints. Precise frequencies are critical for unambiguous resonance assignments.
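The overlap problem can be reproduced with a toy one-dimensional spectrum. The sketch below is not the algorithm from the paper: it assumes two unit-height Lorentzian lines of equal width and a naive local-maximum picker, and all names are hypothetical. It shows that well-separated lines are picked individually, while closely spaced lines merge into a single maximum with the wrong position and intensity.

```python
def lorentzian(x: float, center: float, half_width: float = 1.0) -> float:
    """Unit-height Lorentzian lineshape."""
    return 1.0 / (1.0 + ((x - center) / half_width) ** 2)


def pick_peaks(xs, ys):
    """Naive peak picker: strict local maxima of the sampled spectrum."""
    return [
        (xs[i], ys[i])
        for i in range(1, len(ys) - 1)
        if ys[i] > ys[i - 1] and ys[i] > ys[i + 1]
    ]


def two_peak_spectrum(separation: float):
    """Sum of two unit-height Lorentzians centered at +/- separation / 2."""
    xs = [i / 100 for i in range(-1000, 1001)]  # frequency axis, step 0.01
    ys = [
        lorentzian(x, -separation / 2) + lorentzian(x, separation / 2)
        for x in xs
    ]
    return xs, ys


# Lines four half-widths apart are resolved; one half-width apart they merge.
resolved = pick_peaks(*two_peak_spectrum(4.0))
overlapped = pick_peaks(*two_peak_spectrum(1.0))

for label, peaks in (("resolved", resolved), ("overlapped", overlapped)):
    print(label, [(round(x, 2), round(y, 3)) for x, y in peaks])
```

In the overlapped case the single picked maximum sits midway between the two true line positions and is taller than either line, which is the kind of frequency and intensity error described above.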
Adequate digital resolution and signal sensitivity are two critical factors for protein structure determinations by solution NMR spectroscopy. The prime objective for obtaining high digital resolution is to resolve peak overlap, especially in NOESY spectra with thousands of signals where the signal analysis needs to be performed on a large scale. Achieving maximum digital resolution is usually limited by the practically available measurement time.
Comput Biol Chem
October 2013
The quality of protein structures obtained by different experimental and ab-initio calculation methods varies considerably. The methods have been evolving over time by improving both experimental designs and computational techniques, and since the primary aim of these developments is the procurement of reliable and high-quality data, better techniques resulted on average in an evolution toward higher quality structures in the Protein Data Bank (PDB). Each method leaves a specific quantitative and qualitative "trace" in the PDB entry.
Faster than death: NMR techniques that make use of nonlinear sampling and hyperdimensional processing enable the recording of complete NMR data sets for the automated assignment of the backbone and side-chain resonances of short-lived protein samples of cell lysates.
Modified ubiquitin sequences, each completed with a His tag and a TEV cleavage site, were designed to enhance the expression of protein/peptide targets. With this new system we have been able to characterize several peptide-protein interactions by ITC and by NMR and CD spectroscopic methods, including the interactions of LIR domains with autophagy modifiers.
The protocols currently used for protein structure determination by nuclear magnetic resonance (NMR) depend on the determination of a large number of upper distance limits for proton-proton pairs. Typically, this task is performed manually by an experienced researcher rather than automatically by using a specific computer program. To assess whether it is indeed possible to generate in a fully automated manner NMR structures adequate for deposition in the Protein Data Bank, we gathered 10 experimental data sets with unassigned nuclear Overhauser effect spectroscopy (NOESY) peak lists for various proteins of unknown structure, computed structures for each of them using different, fully automatic programs, and compared the results to each other and to the manually solved reference structures that were not available at the time the data were provided.
Large-scale initiatives for obtaining spatial protein structures by experimental or computational means have accentuated the need for the critical assessment of protein structure determination and prediction methods. These include blind test projects such as the critical assessment of protein structure prediction (CASP) and the critical assessment of protein structure determination by nuclear magnetic resonance (CASD-NMR). An important aim is to establish structure validation criteria that can reliably assess the accuracy of a new protein structure.
Prog Nucl Magn Reson Spectrosc
October 2011
The necessity to acquire large multidimensional datasets, a basis for assignment of NMR resonances, results in long data acquisition times during which substantial degradation of a protein sample might occur. Here we propose a method, applicable to such proteins, for automatic assignment of backbone resonances by direct inspection of multidimensional NMR spectra. In order to establish an optimal balance between completeness of resonance assignment and losses of cross-peaks due to dynamic processes/degradation of protein, assignment of backbone resonances is set as a steering criterion for dynamically controlled targeted nonlinear NMR data acquisition.
An approach is described for joint interleaved recording, real-time processing, and analysis of NMR data sets. The method employs multidimensional decomposition to find common information in a set of conventional triple-resonance spectra recorded in the nonlinear sampling mode, and builds a model of hyperdimensional (HD) spectrum. While preserving sensitivity per unit of measurement time and allowing for maximal spectral resolution, the approach reduces data collection time on average by 2 orders of magnitude compared to the conventional method.
A target-oriented approach for the acquisition of information in biomolecular NMR spectroscopy is being developed. This approach combines concurrent data accumulation, processing, and monitoring of spectral quality. Real-time estimation of parameters allows acquisition to be stopped when results are complete and have a specified precision.
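The stopping idea can be illustrated with a minimal loop that keeps acquiring until an estimated parameter reaches a requested precision. This is a sketch under stated assumptions, not the authors' procedure: the parameter is a simple mean, precision is taken as the standard error of that mean, and `measure`, `target_sem`, and the toy readout alternating between 9.0 and 11.0 are hypothetical.

```python
import itertools
import math
import statistics
from typing import Callable, Tuple


def acquire_until_precise(
    measure: Callable[[], float],
    target_sem: float,
    min_scans: int = 4,
    max_scans: int = 10_000,
) -> Tuple[float, int]:
    """Collect measurements until the standard error of the mean is small.

    Returns the running estimate and the number of scans actually used.
    """
    values = []
    while len(values) < max_scans:
        values.append(measure())  # one more "scan"
        if len(values) >= min_scans:
            sem = statistics.stdev(values) / math.sqrt(len(values))
            if sem <= target_sem:
                break  # requested precision reached; stop acquiring
    return statistics.mean(values), len(values)


# Toy stand-in for an instrument readout: alternates between 9.0 and 11.0.
readout = itertools.cycle([9.0, 11.0])
estimate, n_scans = acquire_until_precise(lambda: next(readout), target_sem=0.25)
print(f"estimate={estimate:.3f} after {n_scans} scans")
```

Tightening `target_sem` increases the number of scans roughly with the inverse square of the requested precision, so acquisition time is spent only where the target precision demands it.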
We introduce the recursive multidimensional decomposition (R-MDD) method to speed recording of high-resolution NMR spectra. The measurement time is logarithmically dependent on the sizes of indirect spectral dimensions. R-MDD has the sensitivity and resolution advantages of optimized nonuniform acquisition schemes and is applicable to all types of biomolecular spectra.
Highly resolved multi-dimensional NOE data are essential for rapid and accurate determination of spatial protein structures such as in structural genomics projects. Four-dimensional spectra contain almost no spectral overlap inherently present in lower dimensionality spectra and are highly amenable to application of automated routines for spectral resonance location and assignment. However, a high resolution 4D data set using conventional uniform sampling usually requires unacceptably long measurement time.
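The cost of uniform sampling follows from simple counting: the number of increments to record is the product of the sizes of the indirect dimensions. The sketch below compares that count with a sparse schedule. It assumes plain uniform-random selection of grid points, which is a stand-in and not the optimized nonuniform scheme of the paper; the grid size of 32 points per indirect dimension and the 1% sampling fraction are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple


def uniform_points(dims: Sequence[int]) -> int:
    """Increments needed to sample the full indirect-dimension grid."""
    return math.prod(dims)


def random_schedule(
    dims: Sequence[int], fraction: float, seed: int = 0
) -> List[Tuple[int, ...]]:
    """Pick a random subset of grid points, returned as index tuples."""
    rng = random.Random(seed)
    total = math.prod(dims)
    n_keep = max(1, round(total * fraction))
    schedule = []
    for flat_index in sorted(rng.sample(range(total), n_keep)):
        # Unravel the flat index into one index per indirect dimension.
        point = []
        remainder = flat_index
        for size in reversed(dims):
            remainder, index = divmod(remainder, size)
            point.append(index)
        schedule.append(tuple(reversed(point)))
    return schedule


dims = (32, 32, 32)  # three indirect dimensions of a 4D experiment
full = uniform_points(dims)
schedule = random_schedule(dims, fraction=0.01)
print(f"uniform grid: {full} increments; sparse schedule: {len(schedule)}")
```

Since measurement time scales with the number of recorded increments, a schedule covering 1% of the grid corresponds to roughly a hundredfold shorter experiment, at the price of needing a reconstruction method for the unsampled points.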
The effect of hydrogen/deuterium exchange on protein hydrogen bond coupling constants (h3)J(NC') has been investigated in the small globular protein ubiquitin. The couplings across deuterated or protonated hydrogen bonds were measured by a long-range quantitative HA(CACO)NCO experiment. The analysis is combined with a determination of the H(N)/D(N) isotope effect on the amide group (1)J(NC') couplings and the (15)N and (13)C' chemical shifts.
Based on the (1)H-(15)N NMR spectroscopy data, the three-dimensional structure and internal dynamic properties of ribosomal protein L7 from Escherichia coli were derived. The structure of L7 dimer in solution can be described as a set of three distinct domains, tumbling rather independently and linked via flexible hinge regions. The dimeric N-terminal domain (residues 1-32) consists of two antiparallel alpha-alpha-hairpins forming a symmetrical four-helical bundle, whereas the two identical C-terminal domains (residues 52-120) adopt a compact alpha/beta-fold.
Helix formation of an S-peptide analog, comprising the first 20 residues of Ribonuclease A and two additional N-terminal residues, was studied by measuring hydrogen bond (H-bond) (h3)J(NC') scalar couplings as a function of 2,2,2-trifluoroethanol (TFE) concentration. The (h3)J(NC') couplings give direct evidence for the closing of individual backbone N-H···O=C H-bonds during the TFE-induced formation of secondary structure. Whereas no (h3)J(NC') correlations could be detected without TFE, alpha-helical (i, i+4) H-bond correlations were observed for the amides of residues A5 to M15 in the presence of TFE.
We previously reported that a helical trigger segment within the GCN4 leucine zipper monomer is indispensable for the formation of its parallel two-stranded coiled coil. Here, we demonstrate that the intrinsic secondary structure of the trigger site is largely stabilized by an intrahelical salt bridge. Removal of this surface salt bridge by a single amino acid mutation induced only minor changes in the backbone structure of the GCN4 leucine zipper dimer as verified by nuclear magnetic resonance.
Native state hydrogen exchange of cold shock protein A (CspA) has been characterized as a function of the denaturant urea and of the stabilizing agent trimethylamine N-oxide (TMAO). The structure of CspA has five strands of beta-sheet. Strands beta1-beta4 have strongly protected amide protons that, based on experiments as a function of urea, exchange through a simple all-or-none global unfolding mechanism.
The acid-denatured form of the protein LysN aggregates reversibly at pH 2.0. The strength of self-association increases with increasing Cl(-) anion concentration.
The structure of LysN contains an OB-fold motif composed of a structurally conserved five-stranded beta-barrel capped by a poorly conserved alpha-helix between strands beta3 and beta4. Two additional alpha-helices, unique to the LysN structure, flank the N terminus of the OB-fold. The stability of LysN to unfolding has been investigated with NMR native state hydrogen exchange measurements as a function of guanidinium hydrochloride concentration, and equilibrium unfolding transitions monitored by ellipticity at 222 nm and fluorescence at 350 nm.
We present a new algorithm for characterizing protein spatial structure based on the molecular hydrophobicity potential approach. The method is illustrated by an analysis of the three-dimensional structures of barnase and the barnase-barstar complex. The approach identifies amino acid residues situated in an unfavorable environment (these residues may be "active" for binding) and quantitatively maps hydrophobic, hydrophilic, and unfavorable hydrophobic-hydrophilic intra- and intermolecular contacts involving backbone and side-chain segments of amino acid residues.
A 600 MHz 1H NMR study of toxin OSK1, a blocker of small-conductance Ca2+-activated K+ channels, is presented. The unambiguous sequential assignment of all the protons of the toxin was obtained using TOCSY, DQF-COSY, and NOESY experiments at pH 3.0 (10, 30, and 45 °C) in aqueous solution.