Of course, these studies were limited and involved just a few proteins, because getting experimental data about dynamics is costly and time-consuming. For comparisons across large numbers of different proteins, computational approaches may therefore be of great value. Previously, other groups have made use of short molecular dynamics simulations or normal mode analysis. Raimondi et al. continue in this vein, combining normal-mode analysis of single structures with principal component analysis of a large set of structures from the Ras superfamily of proteins.
The Ras superfamily encompasses several families of related proteins that share a common fold and nucleotide-dependent activity. When GTP is bound to them, they are active and propagate a particular signal. Over time, the GTP gets hydrolyzed to GDP and the signal turns off. This catalytic process is pretty inefficient, but it can be enhanced by the action of a GTPase Activating Protein (GAP). The exchange of GDP for GTP can likewise be enhanced by the action of a Guanine nucleotide Exchange Factor (GEF). The GTP/GDP state manifests primarily in the positioning of two loops, termed the switch regions (SwI and SwII). This mechanism allows for several different modes of control, so the Ras architecture has been repurposed many times throughout evolution for a variety of different roles.
Because the different members of the superfamily play key roles in their respective pathways, there are many structures available, often in several different states (GTP-bound, GDP-bound, GEF-bound, etc.). Raimondi et al. aligned these structures using the common features of the Ras fold and used PCA to identify flexibility across this evolutionary ensemble. The goal of PCA is to take a dataset with many potentially correlated variables (in this case, the coordinates of the backbone Cα atoms in each aligned structure) and identify a small set of new variables that explain as much of the variance as possible. Here, the principal components (PCs) are expected to describe the structural variability of the fold.
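To make the PCA step concrete, here is a minimal sketch in Python with NumPy. It assumes the structures have already been superimposed and trimmed to a common set of Cα atoms; the function name `pca_on_structures` and the array layout are my own illustrative choices, not anything taken from the paper.

```python
import numpy as np

def pca_on_structures(coords):
    """PCA of an ensemble of pre-aligned structures.

    coords: array of shape (n_structures, n_atoms, 3) holding the
    Calpha coordinates of each structure (assumed already superimposed).
    Returns (components, scores, explained_fraction).
    """
    coords = np.asarray(coords, dtype=float)
    n_structures = coords.shape[0]
    # Flatten each structure into one row of 3 * n_atoms variables.
    X = coords.reshape(n_structures, -1)
    # Subtract the mean structure so the PCs describe deviations from it.
    X_centered = X - X.mean(axis=0)
    # The right singular vectors of the centered data are the PCs.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    explained_fraction = S**2 / np.sum(S**2)
    scores = U * S  # projection of every structure onto every PC
    return Vt, scores, explained_fraction
```

Each row of the returned `components` is one PC (a displacement direction for every Cα atom), and each column of `scores` places the structures along that PC.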
The first PC, which is expected to explain the largest amount of the variability, can separate the structures by their families. That is, the displacement along PC1 can distinguish a Rho family domain from an Arf family domain. The authors call this variability function-independent, because this principal component doesn’t seem to make any meaningful distinction between the GTP/active and GDP/inactive states. That appears to be a property of the second PC, which for some families does a very good job of separating the GTP-bound from the GDP-bound forms (for others there appears to be more mixing). According to this analysis, function-dependent variability appears to be confined to one half of the protein, while function-independent variability seems to be distributed across the whole fold.
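One simple way to put a number on "this PC separates these two groups" is to compare the gap between the group means to the spread within the groups. The sketch below does that; the function `separation_along_component` and the toy scores and labels are hypothetical illustrations, not the statistic or the data used by the authors.

```python
import numpy as np

def separation_along_component(scores, labels):
    """Gap between two group means along one PC, in units of pooled std."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    groups = np.unique(labels)
    if groups.size != 2:
        raise ValueError("expected exactly two distinct labels")
    a = scores[labels == groups[0]]
    b = scores[labels == groups[1]]
    # Pooled variance weights each group by its degrees of freedom.
    pooled_var = ((a.size - 1) * a.var(ddof=1) + (b.size - 1) * b.var(ddof=1)) / (
        a.size + b.size - 2
    )
    return abs(a.mean() - b.mean()) / np.sqrt(pooled_var)

# Made-up PC1 scores for six structures (illustration only).
pc1_scores = [-2.1, -1.9, -2.0, 2.0, 2.1, 1.9]
family = ["Rho", "Rho", "Rho", "Arf", "Arf", "Arf"]
state = ["GTP", "GDP", "GTP", "GDP", "GTP", "GDP"]

family_separation = separation_along_component(pc1_scores, family)
state_separation = separation_along_component(pc1_scores, state)
```

In this toy case the family labels are cleanly split along the component while the nucleotide states are mixed, which is the pattern described for PC1.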
The authors also performed normal mode analysis on individual proteins from the Ras superfamily using an elastic network model. In this kind of simulation the protein is modeled as a group of Cα “nodes” connected by spring-like harmonic potentials representing covalent and non-covalent interactions. Although any one of these “bonds” can be stretched, compressed, and moved, such deformations exert a force on other bonds connected to the nodes involved, which tends to damp most motions. Certain collective deformations will be favored as a result, and these can be calculated as “normal modes” that probably reflect slow fluctuations of the fold.
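For readers who want to see what such a calculation looks like, here is a minimal sketch of one common flavor of elastic network model, the anisotropic network model. The distance cutoff of 15 Å and the uniform spring constant are assumptions I picked for illustration; I am not claiming these are the parameters or the exact model variant used in the paper.

```python
import numpy as np

def anm_modes(coords, cutoff=15.0, gamma=1.0):
    """Normal modes of a simple elastic network built on Calpha positions.

    coords: array of shape (n_atoms, 3).
    cutoff: nodes closer than this distance are joined by a spring (assumed value).
    gamma:  uniform spring constant (assumed value).
    Returns (eigenvalues, eigenvectors) with the six rigid-body modes removed,
    sorted from the softest (slowest) mode upward.
    """
    coords = np.asarray(coords, dtype=float)
    n = coords.shape[0]
    hessian = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            d = coords[j] - coords[i]
            dist2 = d @ d
            if dist2 > cutoff**2:
                continue  # too far apart: no spring between these nodes
            # Off-diagonal 3x3 block for the spring joining nodes i and j.
            block = -gamma * np.outer(d, d) / dist2
            hessian[3*i:3*i+3, 3*j:3*j+3] = block
            hessian[3*j:3*j+3, 3*i:3*i+3] = block
            # Diagonal blocks accumulate the negative of the off-diagonal ones.
            hessian[3*i:3*i+3, 3*i:3*i+3] -= block
            hessian[3*j:3*j+3, 3*j:3*j+3] -= block
    eigvals, eigvecs = np.linalg.eigh(hessian)
    # For a well-connected network the first six eigenvalues are ~0
    # (three translations, three rotations); drop them.
    return eigvals[6:], eigvecs[:, 6:]
```

The columns of the returned eigenvector matrix are the normal modes. Those with the smallest eigenvalues cost the least energy to excite, which is why they are taken as a proxy for the slow, collective fluctuations of the fold.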
The deformations detected by ENM for all individual proteins overlapped significantly with the second PC identified in the evolutionary analysis. That is, the conformational variability of a conserved domain over evolutionary time is correlated with the conformational fluctuations of a single domain on a biological time scale. This makes sense, especially in this case, because the switch regions are areas of significant conformational variability, and are connected with the conserved catalytic function of these proteins. The fact that PC1 doesn’t line up with the low-frequency normal modes probably means that the conformational transitions between different family members cannot be mimicked by ordinary thermal motion, i.e. the fold cannot change this way without the aid of mutations.
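Comparisons like this are often scored by the overlap between two displacement vectors, which is just the absolute value of their normalized dot product. The sketch below shows that measure and its cumulative form over several modes; it assumes the normal modes and the PC are defined over the same Cα atoms in the same order and that the modes are orthonormal, and it is meant as an illustration rather than the authors' exact procedure.

```python
import numpy as np

def overlap(mode, pc):
    """Absolute cosine between a normal mode and a principal component."""
    mode = np.ravel(np.asarray(mode, dtype=float))
    pc = np.ravel(np.asarray(pc, dtype=float))
    return abs(mode @ pc) / (np.linalg.norm(mode) * np.linalg.norm(pc))

def cumulative_overlap(modes, pc):
    """How much of a PC is captured by a set of orthonormal modes.

    modes: array of shape (3 * n_atoms, k), one mode per column.
    Returns a value between 0 (nothing captured) and 1 (fully captured).
    """
    modes = np.asarray(modes, dtype=float)
    squared = [overlap(modes[:, k], pc) ** 2 for k in range(modes.shape[1])]
    return float(np.sqrt(np.sum(squared)))

# Toy example: a "PC" lying entirely in the plane of the first two modes.
toy_modes = np.eye(6)
toy_pc = np.array([1.0, 1.0, 0.0, 0.0, 0.0, 0.0])
single = overlap(toy_modes[:, 0], toy_pc)
first_two = cumulative_overlap(toy_modes[:, :2], toy_pc)
others = cumulative_overlap(toy_modes[:, 2:4], toy_pc)
```

A cumulative overlap near 1 for a handful of low-frequency modes would mean those modes span the direction of the PC, while a value near 0 would mean the PC describes a change the soft modes cannot reproduce.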
Although the results in these studies might seem rather pedestrian and expected, I find them quite encouraging. We’re not particularly good at predicting structure from sequence yet, and our understanding of protein dynamics is even more primitive. What these studies indicate is that it should be possible to predict the conformational fluctuations of a given protein or domain using our knowledge of a related, homologous protein. This could have positive consequences for fields such as rational drug design and protein design, which have met with limited success in part, perhaps, because they do not sufficiently account for a protein’s structural fluctuations.
(1) Raimondi, F., Orozco, M., & Fanelli, F. (2010). Deciphering the Deformation Modes Associated with Function Retention and Specialization in Members of the Ras Superfamily. Structure, 18 (3), 402-414 DOI: 10.1016/j.str.2009.12.015
(2) Law, A., Fuentes, E., & Lee, A. (2009). Conservation of Side-Chain Dynamics Within a Protein Family. Journal of the American Chemical Society, 131 (18), 6322-6323 DOI: 10.1021/ja809915a