Correlation-Based Feature Selection to Identify Functional Dynamics in Proteins.

J Chem Theory Comput

Biomolecular Dynamics, Institute of Physics, Albert-Ludwigs-Universität, 79104 Freiburg, Germany.

Published: August 2022

To interpret molecular dynamics simulations of biomolecular systems, systematic dimensionality reduction methods are commonly employed. These include, among others, principal component analysis (PCA) and time-lagged independent component analysis (TICA), which aim to maximize the variance and the time scale of the first components, respectively. A crucial first step of such an analysis is the identification of suitable and relevant input coordinates (the so-called features), such as backbone dihedral angles and interresidue distances. As typically only a small subset of those coordinates is involved in a specific biomolecular process, it is important to discard the remaining uncorrelated motions or weakly correlated noise coordinates. Otherwise, because such coordinates may exhibit large amplitudes or long time scales, they will be erroneously considered important by PCA and TICA, respectively. To discriminate collective motions underlying functional dynamics from uncorrelated motions, the correlation matrix of the input coordinates is block-diagonalized by a clustering method. This strategy avoids possible bias due to presumed functional observables and conformational states or variation principles that maximize variance or time scales. Considering several linear and nonlinear correlation measures and various clustering algorithms, it is shown that the combination of linear correlation and the Leiden community detection algorithm yields excellent results for all considered model systems. These include the functional motion of T4 lysozyme to demonstrate the successful identification of collective motion, as well as the folding of the villin headpiece to highlight the physical interpretation of the correlated motions in terms of a functional mechanism.
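The idea of block-diagonalizing a correlation matrix can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic data, the function names (`pearson`, `correlation_matrix`, `cluster_features`), and the 0.5 threshold are assumptions made for illustration, and a simple connected-components grouping stands in for the Leiden community detection algorithm named in the abstract.

```python
import math
import random


def pearson(x, y):
    """Linear (Pearson) correlation coefficient of two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(features):
    """Absolute pairwise correlation; `features` is a list of time series."""
    n = len(features)
    corr = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            corr[i][j] = corr[j][i] = abs(pearson(features[i], features[j]))
    return corr


def cluster_features(corr, threshold=0.5):
    """Group features into connected components of the graph whose edges
    join pairs with |r| >= threshold (a simple stand-in, NOT Leiden)."""
    n = len(corr)
    labels = [-1] * n
    n_clusters = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = n_clusters
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and corr[i][j] >= threshold:
                    labels[j] = n_clusters
                    stack.append(j)
        n_clusters += 1
    clusters = [[] for _ in range(n_clusters)]
    for index, label in enumerate(labels):
        clusters[label].append(index)
    return clusters


# Synthetic "trajectory" (assumed data): two hidden collective motions drive
# three features each; two further features are large-amplitude pure noise.
rng = random.Random(0)
n_frames = 2000
motion_a = [rng.gauss(0, 1) for _ in range(n_frames)]
motion_b = [rng.gauss(0, 1) for _ in range(n_frames)]

features = []
for k in range(6):
    driver = motion_a if k % 2 == 0 else motion_b  # interleave the two groups
    features.append([v + 0.3 * rng.gauss(0, 1) for v in driver])
for _ in range(2):
    features.append([5.0 * rng.gauss(0, 1) for _ in range(n_frames)])

corr = correlation_matrix(features)
clusters = cluster_features(corr, threshold=0.5)

# Clusters with more than one member are candidate collective motions;
# singletons are treated as uncorrelated coordinates to discard.
collective = [c for c in clusters if len(c) > 1]
print("clusters:  ", clusters)
print("collective:", collective)
```

Reordering the features by cluster membership would bring the matrix `corr` into approximately block-diagonal form. Note that the two noise features have the largest variance in this toy data set, so a variance-maximizing method would rank them first, whereas the correlation-based grouping isolates them as singletons.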


Source
http://dx.doi.org/10.1021/acs.jctc.2c00337

