An algebraic geometry approach to protein structure determination from NMR data.

Proc IEEE Comput Syst Bioinform Conf

Dartmouth Computer Science Department, Hanover, NH 03755, USA.

Published: May 2007

Our paper describes the first provably efficient algorithm for determining protein structures de novo, solely from experimental data. We show how the global nature of a certain kind of NMR data provides quantifiable complexity-theoretic benefits, allowing us to classify our algorithm as running in polynomial time. While our algorithm uses NMR data as input, it is the first polynomial-time algorithm to compute high-resolution structures de novo using any experimentally recorded data, from either NMR spectroscopy or X-ray crystallography. Improved algorithms for protein structure determination are needed because the process is currently expensive and time-consuming. For example, an area of intense research in NMR methodology is automated assignment of nuclear Overhauser effect (NOE) restraints, in which structure determination sits in a tight inner loop (cycle) of assignment and refinement. These algorithms are very time-consuming and typically require a large cluster. Thus, algorithms for protein structure determination that are known to run in polynomial time and provide guarantees on solution accuracy are likely to have great impact in the long term. Methods stemming from a technique called "distance geometry embedding" do come with provable guarantees, but the NP-hardness of these problem formulations implies that, in the worst case, these techniques cannot run in polynomial time (unless P = NP). We avoid the NP-hardness through (a) some mild assumptions about the protein being studied, (b) the use of residual dipolar couplings (RDCs) instead of a dense network of NOEs, and (c) novel algorithms and proofs that exploit the biophysical geometry of (a) and (b), drawing on a variety of computer science, computational geometry, and computational algebra techniques.
Our algorithm uses RDC data, which give global restraints on the orientation of internuclear bond vectors, in conjunction with very sparse NOE data to determine protein structures in polynomial time. An implementation of our algorithm has been applied to six different real biological NMR data sets recorded for three proteins. Our algorithm is combinatorially precise, runs in polynomial time, and uses much less NMR data to produce results that are as good as or better than those of previous approaches in terms of both the accuracy of the computed structure and the running time. In practice, approaches such as restrained molecular dynamics and simulated annealing, which lack both combinatorial precision and guarantees on running time and solution quality, are commonly used. Our results show that by using a different "slice" of the data, one can obtain an algorithm that runs in polynomial time and has guarantees on solution quality. We believe that our techniques can be extended and generalized to other structure-determination problems, such as computing side-chain conformations and the structure of nucleic acids from experimental data.
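
As a rough illustration of why RDCs act as global orientational restraints, the sketch below evaluates the standard RDC model r = D_max * v^T S v, where v is a unit internuclear bond vector and S is a symmetric, traceless alignment (Saupe) tensor. This is only the textbook forward model, not the algorithm described in the abstract; the function name `rdc`, the scaling `d_max`, and the tensor values are hypothetical choices made for the example.

```python
import math

def rdc(bond_vector, saupe, d_max=1.0):
    """Return d_max * v^T S v, with v the normalized bond vector."""
    norm = math.sqrt(sum(c * c for c in bond_vector))
    v = [c / norm for c in bond_vector]
    return d_max * sum(
        v[i] * saupe[i][j] * v[j] for i in range(3) for j in range(3)
    )

# Hypothetical alignment tensor, already diagonal and traceless
# (Sxx + Syy + Szz = 0); real tensors are estimated from data.
S = [
    [-0.5, 0.0, 0.0],
    [0.0, -0.5, 0.0],
    [0.0, 0.0, 1.0],
]

print(rdc((0, 0, 1), S))   # bond along the z axis of the tensor frame
print(rdc((0, 0, -1), S))  # inverted bond: same value (quadratic in v)
print(rdc((2, 0, 0), S))   # length is ignored; only direction matters
```

The value depends only on the direction of the bond relative to a single molecule-wide frame, not on where the bond sits in the chain, which is the sense in which the restraint is global. Because the expression is quadratic in v, a bond vector and its inverse are indistinguishable from one RDC alone.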

Source
http://dx.doi.org/10.1109/csb.2005.11
