Nov 27 2009
It surprises me how often I hear students, postdocs, and even professors talk about determining the structure of a protein. A singular structure has the advantage of being relatively easy to interpret, but the cost of this is often the loss of functional data. It’s easy to understand how this terminology emerges from the discipline of crystallography, which after all only works when the protein molecules adopt only a small number of conformations. Yet even when it comes to NMR, a technique that should be very sensitive to the fact of structural multiplicity, the language of researchers and the structural tools available to them are too often oriented towards the idea of a singular structure. But any representation of a protein as a single conformation is a simplification — every protein exists in multiple structural states.

Trivially, we are aware that a given polypeptide chain can adopt a number of different conformations: the “folded state” of any given polypeptide chain covers only a tiny sliver of the possible conformational space. A protein that is “unfolded” occupies not a single, well-defined state but a vast multiplicity of states, and this kind of statement is not controversial because we tend to imagine unfoldedness as a messy, chaotic jumble of conformations. The reality is less cut-and-dried: although unfolded proteins may have no regular structure, many still have a propensity to form particular secondary structures or interactions. Denatured proteins have a complex and varied energy landscape, not an array of possible structures that all have roughly equivalent energy. The flipside of the popular view is that a protein’s native state draws down to a sharp energy well, and this conception is also misguided.

The most dramatic counterexamples to the idea of a neat, punctate energy well come from proteins that adopt several different folds in the native state. One relevant case is lymphotactin, which freely interconverts between an α/β monomer and an all-β dimer under physiological conditions. Lymphotactin may be unusual, but the principal message from that study ought to be heeded in studies of other proteins, particularly when the protein in question has functional conformational diversity. Consider α-synuclein, a protein implicated in Parkinson’s disease. In the presence of some detergent micelles this protein is known to take on an α-helical hairpin structure, with two helices lying on the charged surface of the lipid headgroups. In solution, however, it seems to take on a number of different forms, and may interact with true lipid bilayers in a completely different way than it interacts with micelles. For proteins that interconvert between several different physiologically relevant folds, one is never pursuing the structure, but rather a structure.

Of course, we don’t expect most proteins or domains to regularly adopt alternate overall folds. However, reorientation of domains or monomers is a relatively common behavior, and one that poses a sticky challenge for structural biologists because incidental properties of a particular arrangement may bias our experiments towards observing it. A minor member of the ensemble, if it has favorable packing geometry, may exclusively populate a crystal. Similarly, NMR experiments to determine domain arrangement via residual dipolar couplings must always be undertaken with an eye to ensuring that interactions with the aligning media do not bias the results. No single structure of adenylate kinase can instruct us about its catalytic cycle, and structures of the unbound state do not capture the reality that the protein continues to open and close in the absence of ligand. Single structures do not capture motions of domains or monomers relative to each other, and that often means an incomplete understanding of function.

Domain motions are also an overly dramatic example, because simpler rearrangements of the backbone take place in many proteins, even when regular secondary structures are evident. Fluctuations of the main chain play a functional role in several proteins — as, for instance, in the flaps of the HIV protease. Additionally, rearrangements of the backbone have a significant role in signaling, as in NtrC, which I’ll talk about more in two weeks. Proteins where the main chain rearranges in response to ligand binding or post-translational modification generally cannot be described by a single structure.

Even if the backbone is rigid, every protein will have flexibility in the side chains of its amino acids. One of course expects to see this kind of behavior in side chains on the surface of a protein, where it is usually dismissed as irrelevant. However, we also know that side chains can rotate and move in the core of a protein, and that on some protein surfaces they can undergo coherent rearrangements. I’ll talk a bit more about the functional relevance of side-chain motions next Thursday. For now, suffice it to say that side-chain rotations cannot be so easily ignored and sometimes have functional effects. Structural studies that do not capture these rotations may be missing something important.

My point here is not that single structures are stupid or useless. A structure can be very informative about a protein’s function, and often has great power to explain the effects of mutations and ligands. However, we should not mislead ourselves into thinking that any single structure will have all the answers, or indeed any of them. Every protein is a constantly interconverting ensemble of structures, and there are many layers of structural diversity within that ensemble, reaching from whole fold rearrangements to “mere” side-chain adjustments. Determining the structure of a protein is not a coherent goal for a research program. The successful structural biology study will characterize the conformation and energy of key, functionally relevant members of the protein’s structural ensemble and identify the pathways between them.

 Posted by at 9:00 PM
Aug 22 2007

I had a conversation this week with Annette about the structural ensemble versus individual structures that I’m still trying to coalesce into a fully-formed idea. The kernel of it is this: there is a dichotomy between the way we know that proteins act and the way we talk about their action. Proteins give rise to phenomenological effects as ensembles, but we discuss their states as individuals.

Consider a signaling molecule, say a member of a MAP kinase cascade (picture at right). A given protein can exist in either an inactive (A) or active (B) form. When active, the kinase phosphorylates some downstream target, otherwise it just sits in your cytoplasm taking a nap. Typically in this kind of system the active form of the kinase is also the phosphorylated form (red B). It’s typical to say something like, “The kinase is activated by phosphorylation.” At the same time we know from some of Dorothee’s work with Dave Wemmer that certain bacterial proteins that get phosphorylated already sample their active conformations even before they are modified (blue B).

Even for systems where this kind of sampling hasn’t been directly demonstrated it’s reasonable to assume it takes place. After all, the active and inactive structures have the same amino acids to work with. Unless the phosphate group itself is a lynchpin of the new structure (perhaps by bridging two structural elements), then the active structure must be one the unmodified kinase can adopt. Naturally, we expect this structure to be higher in energy than the inactive state (so blue B is higher on our energy diagram than blue A), and that phosphorylation decreases the energy of the active state so that it is subsequently preferred (so red B is lower than blue A).

The implication of this is that, unless the unmodified active structure is much higher in energy than the inactive structure, some proportion of our kinase is active even when not phosphorylated. Perhaps this is as low as 1-2%, a fraction that’s difficult to detect directly. Still, because enzymes are so efficient, this quantity may be significant. Or, for a single protein, we could say that it adopts an active form without phosphorylation 1-2% of the time. But we tend to talk about phosphorylation and other post-translational modifications as if they were switches, with phrases like “protein X is turned on by phosphorylation”. The reality, though, is that the switch is less a matter of turning a protein on than of turning it on more.
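To put rough numbers on this, here is a minimal sketch that treats the unmodified kinase as a simple two-state equilibrium between the inactive form A and the active form B, with populations set by the Boltzmann distribution. The two-state model, the temperature of 25 °C, and the function names are assumptions made for illustration; nothing here comes from a measured system.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def active_fraction(delta_g_kcal, temp_k=298.15):
    """Fraction of molecules in state B for a two-state A <-> B equilibrium.

    delta_g_kcal is G(B) - G(A); a positive value means the active state
    is higher in energy than the inactive state.
    """
    k_eq = math.exp(-delta_g_kcal / (R_KCAL * temp_k))  # [B]/[A]
    return k_eq / (1.0 + k_eq)


def delta_g_for_fraction(fraction, temp_k=298.15):
    """Energy gap G(B) - G(A) that would give the requested active fraction."""
    return -R_KCAL * temp_k * math.log(fraction / (1.0 - fraction))


if __name__ == "__main__":
    # Hypothetical energy gaps, chosen only to show the trend.
    for dg in (0.0, 1.0, 2.5, 5.0):
        pct = 100.0 * active_fraction(dg)
        print(f"dG = {dg:3.1f} kcal/mol -> {pct:7.3f}% active")
    for frac in (0.01, 0.02):
        print(f"{100 * frac:.0f}% active needs dG = "
              f"{delta_g_for_fraction(frac):.2f} kcal/mol")
```

Under these assumptions, a 1-2% active population corresponds to a gap of roughly 2.3 to 2.7 kcal/mol, which is small on the scale of typical protein energetics. That is the sense in which phosphorylation only has to shift the balance rather than create a new structure.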

This points to a reality far less clean and orderly than typically depicted in block schematics. Inappropriately active (i.e. active without modification) members of the various protein ensembles must give rise to a considerable amount of noise in biological information processes. The system must therefore have some way to distinguish the signal from the noise that’s more than just the binary on/off typically depicted and discussed. These filters could take several forms. For instance, the kinase of our kinase may mediate the interaction between our kinase and its target, though in this case inappropriate activation of the MAPKK could still give rise to signaling noise. Alternatively, the phosphate could mediate the kinase-target interaction. Or the cell could simply have an inefficient signaling system, so that multiple nearly simultaneous signaling events are necessary to activate a response.

Is this point important? Maybe and maybe not. Most of our experiments can only access the behavior of ensembles, so the ensemble nature of protein action is not likely to lead us astray. But as single-molecule studies become more popular it may be important to keep the ensemble perspective in mind so as not to be confused by their results. Moreover, a conceptually accurate picture of cellular signaling and regulation will require us to keep this feature in mind.

 Posted by at 9:01 PM
Aug 13 2007

Another feature of the Protein Society symposium was a continued emphasis on trying to understand natively-disordered proteins and, by extension, the denatured state of natively ordered proteins. Because these two fields are closely related and use the same techniques, it seems to me best to lump them together for now. A couple of interesting points came up that I wanted to get down here for my own memory’s sake.

One point, and one that became a recurring theme in several talks at the symposium, was averaging bias. The first real discussion of this came from a really good talk by Michele Vendruscolo on the study of the natively-disordered 131-deletion mutant of staphylococcal nuclease. Some models that Dave Shortle had produced of the disordered state on the basis of paramagnetic relaxation enhancement had predicted ensembles that were too small with respect to the known radius of gyration. Michele pointed out that the PRE is an ensemble measurement, and many different ensembles can give rise to the same PRE. Additionally, the PRE is biased because below a certain threshold the effect is invisible. This means that the measurement ends up being biased towards closer approaches. Essentially his point was that the normal distribution cannot be assumed for the ensemble average of distance measurements in the denatured state (and it’s probably a questionable assumption in the native state as well).

Kevin Plaxco gave a talk later on that really hammered this point home. He did a series of SAXS experiments to determine the radius of gyration for a ton of proteins, including several that had shown residual structure in NMR experiments. His results indicated that the experimentally determined radius of gyration matched that predicted for a random coil for all these proteins. As he pointed out, though, the Rg is totally insensitive to local structure, whereas because of anomalous averaging much of the NMR data is hypersensitive to local structure. This means that both results can be right: any given protein can have some percentage of its structure intact, and as long as it’s a different piece for each protein and not too much, the ensemble can retain a random-coil-like Rg. If tertiary interactions are preserved this becomes a slightly more difficult proposition to swallow, though.

Still, his work, and several other talks and posters presented during the symposium, made an excellent point. We simply cannot rely on the assumption of a normal distribution when we are analyzing NMR data from systems with so many degrees of freedom.
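The averaging bias is easy to see with a toy calculation. The sketch below assumes the PRE scales with the ensemble average of r^-6 (the usual fast-exchange picture) and compares the single distance that such an average would imply against the plain arithmetic mean. The ensemble itself, 95% extended at 40 Å and 5% compact at 10 Å, is entirely made up for illustration.

```python
def pre_apparent_distance(distances):
    """Single distance that reproduces the ensemble-averaged r^-6 signal."""
    mean_inv6 = sum(r ** -6 for r in distances) / len(distances)
    return mean_inv6 ** (-1.0 / 6.0)


def arithmetic_mean(distances):
    """Plain population-weighted mean distance."""
    return sum(distances) / len(distances)


# Hypothetical ensemble (distances in angstroms): mostly extended,
# with a small compact subpopulation. Not based on real data.
ensemble = [40.0] * 95 + [10.0] * 5

if __name__ == "__main__":
    print(f"arithmetic mean distance : {arithmetic_mean(ensemble):.1f} A")
    print(f"PRE-apparent distance    : {pre_apparent_distance(ensemble):.1f} A")
```

In this toy case the 5% compact population drags the apparent distance down to roughly 16 Å even though the mean is 38.5 Å, so a model fit to that one number would come out far too compact. Many other mixtures would give the same apparent distance, which is the degeneracy Michele was describing.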

Another thread that showed up repeatedly was the ongoing attempt to understand exactly how disordered states interact and are regulated, especially by post-translational modifications such as phosphorylation. Most disordered regions have multiple binding partners, with affinity enhanced for a particular partner by a particular modification. In the simplest model for these interactions, the modification itself and some of the surrounding primary sequence are recognized. However, there’s an increasing amount of evidence, including a nice talk by a postdoc from Julie Forman-Kay’s group, that post-translational modifications alter the structural characteristics of the disordered state itself. The Forman-Kay talk suggested that phosphorylation induced a condensation of the protein by attenuating a surplus of positive charge.

This could conceivably be taken further. Consider a bit of sequence like DKRSDKA, which could conceivably take the form of a β-strand if it weren’t for that concentration of positive charge on one side. A phosphate group on the serine could conceivably stabilize this structure and preorganize it for binding to a ligand, thus increasing affinity by reducing the energetic cost of binding.

It might even be possible to tune things more specifically. Take a sequence like GRDSSKAKSR. If you put this on a helical wheel you’ll see a huge blast of positive charge on one side, but also a pair of serines. Phosphorylate S5 and S9 and you could stabilize the helix. At the same time, this would make a β-strand conformation less likely because such a strand would have negative charges on one side and positive charges on the other. By contrast, if you phosphorylate S4 you’d do nothing to relieve the unfavorable charge concentration on the helix, but the positive charge concentration on the strand would be attenuated (see cartoon). In this way phosphorylation might be used as a kind of conformational switch to preorganize the same sequence in different ways and thus reach different downstream effectors. We know that conformational rearrangements of the kind that lymphotactin undergoes give rise to different signals and protein behaviors. The role of differential preorganization in disordered proteins hasn’t been extensively studied yet, but may be equally important.
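For anyone who wants to check the geometry without drawing the wheel, here is a rough sketch. It places each residue of GRDSSKAKSR at 100° per residue around an ideal α-helix and on alternating faces of an ideal β-strand, then tallies the charge on each strand face with and without phosphoserines. The idealized geometry, the simple integer charges, and the -2 charge assigned to phosphoserine near neutral pH are all simplifying assumptions, and the function names are mine.

```python
SEQ = "GRDSSKAKSR"
BASE_CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}
PHOSPHO_CHARGE = -2  # assumption: phosphoserine is roughly -2 near neutral pH


def helix_angle(position, degrees_per_residue=100.0):
    """Angle on a helical wheel for a 1-based position (ideal alpha-helix)."""
    return ((position - 1) * degrees_per_residue) % 360.0


def strand_face(position):
    """Side chains alternate faces along an ideal beta-strand."""
    return "odd" if position % 2 else "even"


def residue_charge(seq, position, phospho=()):
    """Integer charge of one residue, treating listed positions as phosphorylated."""
    if position in phospho:
        return PHOSPHO_CHARGE
    return BASE_CHARGE.get(seq[position - 1], 0)


def strand_face_charges(seq, phospho=()):
    """Net charge on each face of the strand."""
    totals = {"odd": 0, "even": 0}
    for pos in range(1, len(seq) + 1):
        totals[strand_face(pos)] += residue_charge(seq, pos, phospho)
    return totals


if __name__ == "__main__":
    for pos, aa in enumerate(SEQ, start=1):
        print(f"{aa}{pos:<2d} helix angle {helix_angle(pos):5.1f}  "
              f"strand face {strand_face(pos)}")
    for sites in ((), (4,), (5, 9)):
        print(f"phospho {sites}: strand faces {strand_face_charges(SEQ, sites)}")
```

With these assumptions, S5 and S9 land at 40° and 80° on the wheel, right next to R2 and K6, while S4 sits at 300°, away from that cluster. On the strand, S4 shares a face with all four basic residues, whereas S5 and S9 share the opposite face with D3. This is the arrangement the argument above relies on.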

It’s increasingly clear that disordered regions are a major factor in cellular signaling. I’m not having much luck with the one I’m working on now, but I’m excited to see where the next few years lead this field.

 Posted by at 5:11 PM