Jan 04 2012
 

One of the most serious challenges facing medical science today is the development of drug resistance by bacteria and viruses. Almost as quickly as we can develop drugs that attack the machinery of infectious disease, evolution, aided in some cases by careless use, defeats our efforts. In some cases this is because the specific target of a drug changes in response to the challenge, as has happened in the evolution of resistance to rimantadine in influenza. Bacteria have an additional mechanism to attack our medicines, however, in the form of multidrug resistance genes. The proteins these genes encode can recognize an array of toxic molecules, often using general properties, and expel them from the cell. As such, every single one of these genes can take out multiple medicines.

One of these multidrug resistance exporters is EmrE, a member of the small multidrug resistance (SMR) family of genes. EmrE is a proton-drug antiporter that pushes positively-charged polyaromatic molecules out of the cell while letting two protons in. The import of the protons provides the energy to expel the toxins against a concentration gradient. Today in Nature, a research group led by my friend Katie Henzler-Wildman published new details of EmrE’s mechanism and topology (1). Using NMR and fluorescence techniques, we show that EmrE does, or at least can, operate as an antiparallel, asymmetric dimer that exposes a single active site to alternating sides of the membrane by simultaneously switching the conformation of the monomers.

This was a long and difficult project, in which I played a small role. Unfortunately, multidrug resistance exporters like EmrE tend to be integrated into the bacterial membrane, which makes them challenging subjects for biophysical studies. In order to investigate proteins like this, we must reconstitute them in lipid environments that suitably mimic their natural setting, while maintaining sufficient purity and concentration to perform our experiments. The controversy over the effect of rimantadine on influenza’s M2 channel provides just one example of the difficulty of reliably recreating a membrane environment.

EmrE has also been embroiled in a controversy between structural biologists and biochemists. Although the minimal functional unit is agreed to be a dimer, biochemical studies have indicated that the dimer is symmetric, and that the proteins have a parallel orientation in the membrane. That is, each EmrE protein has the same shape and is pointing the same way. Relatively low-resolution data from crystallography and electron microscopy, however, have suggested that the protein units are asymmetric and antiparallel. But these studies were performed in lipid environments where the protein may not have been active, and at cryogenic temperatures far from physiological relevance. One would like to get a look at EmrE in a state where it is active and at a somewhat more reasonable temperature.

Caught in the Act

Solution NMR provides one way to achieve this goal. A protein can be embedded in a small bit of membrane, and allowed to tumble freely in an aqueous environment, allowing sufficient signal for us to make some kinds of measurements. Historically, micelles have been used for this, but multiple lines of evidence now suggest that they may produce artifacts due to the unnatural local curvature. Consequently, Katie and her student Emma worked out a system for observing EmrE in bicelles, which are small, flat-ish discs of lipids that still tumble freely enough to allow solution-state NMR measurements. They also established that EmrE in this bicelle preparation could still bind to tetraphenylphosphonium (TPP+), one of its ligands.

The NMR spectrum, however, was perplexing. As you can see in the HSQC to the right, the peaks in the spectrum are fairly spread out. That’s unusual for a protein composed entirely of α-helices, but because electron currents from the aromatic rings of TPP+ induce significant changes in chemical shift it’s still reasonable. What is more troubling, and perhaps less obvious, is that there are twice as many peaks in this spectrum as you would expect.

An HSQC of a 15N-labeled protein, in principle, shows one peak for every N-H in the protein, such that you get a two-dimensional spectrum showing the chemical shift of the nitrogen on one axis and its bonded proton on the other. This means there should be one peak per amino acid, except prolines. In addition, peaks usually appear for tryptophan indoles (they are at bottom left in the spectrum) and, depending on your setup, glutamine and asparagine side-chains. Other side-chains usually exchange with water too quickly to be seen. EmrE has about 100 residues, and the spectrum has about 200 peaks. This indicates that there are two different structures of EmrE in the sample.
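To make the peak-counting argument concrete, here is a minimal sketch that estimates how many HSQC peaks a single conformation should give. The function name and the toy sequence are made up for illustration (this is not the EmrE sequence), and the count deliberately ignores the N-terminus and the glutamine and asparagine side chains.

```python
def expected_hsqc_peaks(sequence: str, include_trp_indoles: bool = True) -> int:
    """Rough number of 15N-1H HSQC peaks expected for ONE conformation."""
    seq = sequence.upper()
    # Every residue except proline contributes a backbone amide N-H.
    backbone = sum(1 for aa in seq if aa != "P")
    # Each tryptophan side chain adds one indole N-H peak.
    indoles = seq.count("W") if include_trp_indoles else 0
    return backbone + indoles


toy_sequence = "MAPWKLG"  # hypothetical 7-residue peptide, NOT EmrE
one_state = expected_hsqc_peaks(toy_sequence)
two_states = 2 * one_state  # two slowly exchanging structures double the count
print(one_state, two_states)
```

Seeing roughly twice the single-conformation count, as with EmrE, is the signature of two distinct structures in slow exchange.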

We decided to ask whether EmrE switched between these two structures and how fast. Observing two peaks per residue, of roughly equal intensity, told us that if EmrE did change its structure, it was doing so slowly, at a rate of 10 times a second or less. So we used an experiment called ZZ-exchange, which is similar to the HSQC but includes a relatively long pause between determining the 15N chemical shift and the 1H chemical shift. If a significant proportion of the sample changes conformation during the pause, you will see a spectrum that includes all the HSQC peaks, as well as cross-peaks that have the 15N chemical shift of one structure and the 1H chemical shift of the other, producing a little rectangle of peaks. This ended up being tricky because the bicelle distorts the signals, but Katie, Greg DeKoster, and I managed to come up with a setup that got around this problem on the 800 at Brandeis, starting from a pulse sequence written by Art Palmer.

As you can see above, we observed cross peaks, shown in blue to differentiate them from the HSQC peaks. By varying the pause between chemical shift determinations, we were able to fit a rate of about 5 /s, which roughly correlates to a fluorescence fluctuation observed in previous experiments. In addition, experiments using paramagnetic relaxation enhancement agents show that the two states have different accessibility to water, suggesting that they represent a change in which side of the active site is open. This supports the conclusion that what we are seeing here is the fundamental conformational change inherent to EmrE’s function: the opening of the binding site to one side of the membrane, then the other.
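For readers who want to see how a rate falls out of the pause length, here is a simplified sketch of two-state ZZ-exchange. It assumes equal populations, identical R1 relaxation in both states, and equal forward and reverse rates k (so kex = 2k). The R1 value is invented, and treating the 5 /s figure as k rather than kex is my assumption for illustration. The real fits were more involved than this.

```python
import math


def zz_intensities(t, k, r1):
    """Auto- and cross-peak intensities after a mixing delay t (seconds).

    Assumes symmetric two-state exchange: equal populations, the same R1
    in both states, and forward rate k equal to the reverse rate.
    """
    decay = math.exp(-r1 * t)
    auto = 0.5 * (1.0 + math.exp(-2.0 * k * t)) * decay
    cross = 0.5 * (1.0 - math.exp(-2.0 * k * t)) * decay
    return auto, cross


def rate_from_ratio(t, auto, cross):
    """Invert cross/auto = tanh(k * t) to recover k from one delay."""
    return math.atanh(cross / auto) / t


k_true = 5.0  # per second, illustrative
r1 = 1.0      # per second, hypothetical relaxation rate
for delay in (0.05, 0.1, 0.2, 0.4):
    auto, cross = zz_intensities(delay, k_true, r1)
    k_fit = rate_from_ratio(delay, auto, cross)
    print(f"t={delay:.2f} s  auto={auto:.3f}  cross={cross:.3f}  k={k_fit:.2f}")
```

The cross peaks grow with the delay while everything decays with R1, which is why the pause has to be long enough for exchange but short enough to keep signal.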

The ABBA Model?

So now we have evidence for two distinct structures that interconvert during the export process. Two models can be consistent with these data, shown in the figure below. The most obvious possibility, consistent with almost all of the biochemical data, is that the structures represent two states of a symmetric, parallel dimer converting from an AA state to a BB state. Alternatively, consistent with the crystallographic data, one could have an asymmetric, antiparallel dimer that exchanges from an AB state to a BA state.

Top: Symmetric, parallel AA-BB model. Bottom: Asymmetric antiparallel AB-BA model.

The NMR data support the second model in two ways. The first is that peaks for the two states have almost equal intensity, which can only be the case if both states are almost equal in free energy. This happens automatically in the case of the asymmetric dimer, because each dimer contains one of each conformation, making the exchange to the alternate state energy-neutral. In the case of a symmetric dimer, it requires that each individual conformation have the same energy, which is unlikely, but not impossible. Also, in the NMR data, regions of the protein that show the largest difference in chemical shift between the two states also show the most significant conformational differences in the crystal structure of the asymmetric dimer. Unfortunately, these lines of evidence are not enough to be sure about what we’re seeing.
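The link between peak intensity and free energy is just the Boltzmann relationship, which can be sketched as follows. The populations below are illustrative, not measured values, and the temperature is an assumed default.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def delta_g(pop_a, pop_b, temp_k=298.15):
    """Free energy of state B relative to state A, in kcal/mol.

    Uses dG = R*T*ln(pA/pB); populations are taken as proportional to
    peak intensities, which is an approximation.
    """
    return R_KCAL * temp_k * math.log(pop_a / pop_b)


print(delta_g(0.5, 0.5))  # equal peaks: the two states have the same energy
print(delta_g(0.9, 0.1))  # a 9:1 split needs a bit more than 1 kcal/mol
```

Even a modest energy difference between the A and B conformations would skew the intensities noticeably, which is why near-equal peaks are easy to explain for AB/BA exchange and require a coincidence for AA/BB.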

Flash in the Pan

To get a better idea of EmrE’s topology, Katie and her team performed a number of FRET and crosslinking experiments to establish the relative orientation of dimers in the membrane. In bulk FRET experiments, they fluorescently labeled EmrE that was in liposomes, as shown in Figure 3. In the first experiment, EmrE was exposed to one label while in liposomes, then broken out into bicelles and exposed to another. For antiparallel proteins, excitation of the green dye should result in fluorescent output from the red dye, and this is exactly what was observed. Also, EmrE in liposomes was exposed to both dyes simultaneously, which should result in an observation of FRET for parallel but not antiparallel dimers. Some FRET was observed, but it wasn’t clear whether this was due to dye leaking into the liposomes.

Katie answered this question using single-molecule FRET. EmrE dimers with a single cysteine mutated into them were labeled with fluorescent dye and then examined on a slide to determine the efficiency of energy transfer. Because there is only one labeling site, a high-efficiency transfer would imply that both fluorophores were on the same side of the membrane, and thus a parallel topology. However, the observed efficiency suggested a distance of 50 Å between the fluorophores, more consistent with an antiparallel topology where the labeling sites are separated by the membrane.
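The relationship between transfer efficiency and distance used in this argument is the standard Förster equation, sketched below. The Förster radius R0 here is a hypothetical placeholder, since the real value depends on the dye pair used in the paper.

```python
def fret_efficiency(distance, r0):
    """Foerster transfer efficiency for a donor-acceptor separation."""
    return 1.0 / (1.0 + (distance / r0) ** 6)


def distance_from_efficiency(efficiency, r0):
    """Invert the Foerster equation to estimate the separation."""
    return r0 * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


R0 = 55.0  # angstroms; hypothetical value, not taken from the paper
for r in (20.0, 50.0, 80.0):
    print(f"{r:.0f} A -> E = {fret_efficiency(r, R0):.3f}")
```

Because of the sixth-power dependence, dyes sitting side by side on the same face of the membrane would give an efficiency near one, while dyes separated by the bilayer give an intermediate value.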

In one final experiment, Katie used a molecule that covalently links a lysine side chain to a cysteine side chain. There is only one lysine in EmrE, and Katie created a mutant that has a single cysteine on the opposite side of the membrane. This distance is too great for the linker molecule to bridge, so in a parallel dimer no cross-linking should be observed. Instead, the experiment resulted in nearly complete cross-linking, supporting an antiparallel topology.

How an Antiporter Works

Cumulatively, these results strongly support the model shown below, where EmrE swaps two protons for a drug molecule using a conformational exchange between energetically-equivalent asymmetric, antiparallel dimer states that are open to different sides of the membrane. In this model, there is a single binding site, consistent with the biochemical data, in the context of an antiparallel, asymmetric dimer, consistent with previous structural data. Because EmrE binds TPP+ with high affinity under our conditions, and because the cysteine mutations made for the FRET experiments did not significantly change the NMR spectra, we can be confident that these experiments plausibly reproduce normal protein behaviors. However, some mutational studies indicate that EmrE functions as a parallel dimer in vivo, and further experiments are necessary to either reconcile these observations or determine where the errors originate.

Exchange between identical antiparallel, asymmetric structures allows EmrE to exchange two protons for one molecule of toxin.

In terms of the implications for fighting drug resistance in bacteria, this is an early step on a long road. EmrE is not the only drug exporter in bacteria, nor is it the most critical. It is also too soon to say whether the particular mechanism outlined here is general to the SMR family or a peculiarity of this single protein. However, these results give us confidence that the crystal structures are reliable (Katie’s group is currently working on improving them), and that we can cleanly measure exchange rates to determine what effect drug candidates are having. The goal would be to develop accessory drugs that attack the exporters while a primary drug attacks the bacterium’s basic functions. A great deal more work is necessary before we reach that point, but this is one strategy that may allow us to defeat drug resistance, or at least prolong the usefulness of our current antibiotic arsenal.

(1) Morrison, E., DeKoster, G., Dutta, S., Vafabakhsh, R., Clarkson, M.W., Bahl, A., Kern, D., Ha, T., & Henzler-Wildman, K. (2011). Antiparallel EmrE exports drugs by exchanging between asymmetric structures. Nature, 481 (7379), 45-50. DOI: 10.1038/nature10703

Sep 20 2011
 

One of the goals of computational biology is to predict the complete high-order structure of a protein from its amino acid sequence. Often reasonably good structures can be produced by modeling a new protein according to an already-known structure of a homologous protein, one with a similar sequence and presumably a similar structure. However, these structures can be inaccurate, and obviously this method will not work if no homologous structure is known.

Foldit is an online game developed by the research team of Dr. David Baker that attempts to address this problem by combining an automated structure prediction program called ROSETTA with input from human players who manually remodel structures to improve them. Even though most of the players have little or no advanced biochemical knowledge, Foldit has already had some striking results improving on computational models. An upcoming paper in Nature Structural & Molecular Biology (1) (PDF also available directly from the Baker lab) details some interesting new successes from the Foldit players.

Contrary to some reports, the Foldit players did not solve any mystery directly related to HIV, although their work may prove helpful in developing new drugs for AIDS. What the Foldit players actually did was to outperform many protein structure prediction algorithms in the CASP9 contest, and to play a key role in helping solve the structure of an unusual protease from a simian retrovirus.

M-PMV Protease

If you don’t recognize Mason-Pfizer Monkey Virus (M-PMV) as a cause of AIDS in humans, that’s because it isn’t. It causes acquired immune deficiency in macaques, however, and it has an unusual protease that may tell us useful things.

A crystal structure of an inactive mutant of HIV-1 protease in complex with its substrate. The protease monomers are in dark green and cyan, the substrate is represented as purple bonds.

Retroviruses like HIV often produce proteins in a fused form rather than as individual folded units. In order to be functional, the various proteins must be snipped out of these long polyprotein strands, so the virus includes a protease (protein-cutting enzyme) to do this. In most retroviruses, this protease is dimeric: it is composed of two protein molecules with identical sequences and similar, symmetric structures. The long-known structure of HIV protease, seen on the right (learn more about HIV protease or explore this structure at the Protein Data Bank) is an example of this architecture.

People infected with HIV often take protease inhibitors to interfere with viral replication. These drugs attack the active site, where the chemical reaction that cuts the protein strand takes place, but it has been theorized that viral proteases could also be attacked by splitting up the dimers into single proteins, or monomers. The problem is, the free monomer structures aren’t known.

This is where the M-PMV protease comes in. Although it is homologous to the dimeric proteases, M-PMV protease is a monomer in the absence of its cutting target. If we knew this protein’s structure, we could perhaps design drugs that would stabilize other proteases in their monomer form, rendering them inactive. An attempt to determine the structure using magnetic resonance data (NMR) produced models that seemed poorly folded and had bad ROSETTA energy scores. And, although the protein formed crystals, X-ray crystallography could not solve its structure either, despite a decade of effort.

An X-ray diffraction pattern.

The reason for this has to do with how X-ray crystallography works. If you fire a beam of X-rays at a crystal of a protein, some of the rays will be deflected by electrons within it and you will observe a pattern of diffracted dots similar to the one at left, kindly provided by my colleague Young-Jin Cho. The intensities and locations of these dots depend on the structure and arrangement of the molecules within the crystal. X-ray crystallographers can use the diffraction patterns to calculate the electron density of the protein and fit the molecular bonds into it (below, also courtesy of Young-Jin). However, the electron density cannot be calculated from the diffraction pattern unless the phases of the diffracted X-rays are also known. Unfortunately, there is no way to calculate the phases from the dots.

An electron density model (wireframe) with the chemical bonds of the peptide backbone (heavy lines) fitted into it.
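The phase problem described above can be illustrated with a toy one-dimensional "crystal". This is only a sketch: the density values are invented, and a real structure solution works in three dimensions with measured intensities. It shows that the density comes back exactly when the phases are kept, and comes back wrong when only the amplitudes (what the dots give you) are used.

```python
import cmath


def dft(values):
    """Discrete Fourier transform: a stand-in for the structure factors."""
    n = len(values)
    return [sum(v * cmath.exp(-2j * cmath.pi * h * x / n)
                for x, v in enumerate(values))
            for h in range(n)]


def idft(coeffs):
    """Inverse transform back to (real) density."""
    n = len(coeffs)
    return [sum(f * cmath.exp(2j * cmath.pi * h * x / n)
                for h, f in enumerate(coeffs)).real / n
            for x in range(n)]


density = [0.0, 3.0, 1.0, 0.0, 0.0, 2.0, 0.0, 0.0]  # made-up 1D density
factors = dft(density)

with_phases = idft(factors)                       # amplitudes AND phases
amplitudes_only = idft([abs(f) for f in factors])  # phases thrown away

print([round(v, 2) for v in with_phases])
print([round(v, 2) for v in amplitudes_only])
```

The second map has the right amplitudes and the wrong shape, which is the situation a crystallographer is in before a good set of starting phases is found.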

There are many ways to solve this problem, but not all of them work in every system. One widely-applicable approach is called “molecular replacement”. In this method, a protein with a structure similar to that of the one being studied is used to guess the phases. If this guess is close enough, the structure factors can be refined from there. In the case of M-PMV protease, however, the dimeric homologues could not be used for replacement, and an attempt to use the NMR structure to calculate the phases also failed.

Then the Foldit players went to work. Starting from the NMR structure, Foldit players made a variety of refinements. A player called spvincent made some improvements using the new alignment tool, which a player called grabhorn improved further by rearranging the amino acid side chains in the core of the molecule. A player named mimi added the final touch by rearranging a critical loop.

Going from mimi’s structure (several others also proved suitable), the crystallographers were able to solve the phase problem by molecular replacement and finally determine the protease’s structure. None of the Foldit results were exactly right, so it’s inaccurate to say that the players solved the structure. However, their models were very close to the right answer, and provided the critical data that allowed the crystal structure to be solved. Once the paper is published, you’ll be able to find that structure at the PDB under the accession code 3SQF.

We can’t know right now whether this structure will enable the design of new drugs, but the Foldit players were the key to giving us a better chance of using it for this purpose. What may be even more exciting is the possibility that Foldit could be used in other structural studies to come up with improved starting models for molecular replacement. As with any method of predicting protein structures, however, the gold standard is CASP, so the Foldit teams participated in CASP9.

CASP9

The Critical Assessment of protein Structure Prediction is a long-running biennial test of computer algorithms to calculate a protein’s structure from its sequence. This experiment in prediction has a fairly simple setup.

1) Structural biologists give unpublished structures to the CASP organizers.

2) The sequences belonging to these structures are given to computational biologists.

3) After a set period, the computational predictions are compared to the known structural results.

The Baker group generated starting structures using ROSETTA, then handed the five lowest-energy results off to the Foldit players. For proteins that had known homologues, the results were disappointing. Foldit players did well, but they overused Foldit’s ROSETTA-based minimization routine, which tended to distort conserved loops.

An energy landscape showing an incorrect move towards a false minimum and a correct, more difficult move towards a true minimum.

The nature of this problem became even more clear when the Baker group handed the Foldit players ROSETTA results for proteins that had no known homologues. In that case they noticed that players were using the minimization routine to “tunnel” to nearby, incorrect minima. You can get a feel for what that means by looking at the figure to the left.

In this energy landscape diagram, the blue line represents every possible structure of a pretend protein laid out in a line, with similar structures near each other and the higher-energy (worse) structures placed higher on the Y axis. From a relatively high-energy initial structure, Foldit players tended to use minimization to draw it ever-downward towards the nearest minimum-energy structure (red arrow). Overuse of the computer algorithm discouraged them from pulling the structure past a disfavored state that would then start to collapse towards the true, global minimum energy (green arrow).
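The same trap can be reproduced with a toy calculation. The sketch below uses an invented one-dimensional energy function with two wells and plain gradient descent as a stand-in for a minimizer. Neither is meant to resemble ROSETTA's actual energy function or minimization routine.

```python
def energy(x):
    """Made-up double-well landscape: shallow well near x = 0.93,
    deeper (global) well near x = -1.06, with a hump in between."""
    return x**4 - 2.0 * x**2 + 0.5 * x


def gradient(x):
    return 4.0 * x**3 - 4.0 * x + 0.5


def minimize(x, step=0.01, n_steps=5000):
    """Plain gradient descent: always slides downhill from where it starts."""
    for _ in range(n_steps):
        x -= step * gradient(x)
    return x


trapped = minimize(1.5)    # starts on the right: settles in the false minimum
escaped = minimize(-0.5)   # starts past the hump: reaches the global minimum

print(f"trapped: x={trapped:.3f}, E={energy(trapped):.3f}")
print(f"escaped: x={escaped:.3f}, E={energy(escaped):.3f}")
```

The minimizer never climbs the hump on its own. Something, in Foldit's case a human player, has to move the structure past it first.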

The Foldit players still had some successes — for instance, they were able to recognize one structure ROSETTA didn’t like very much as a near-native structure. The Void Crushers team successfully optimized this structure, producing the best score for that particular target, and one of the highest scores of the CASP test. If the initial ROSETTA structures had too low of a starting energy, though, the players wouldn’t perturb them enough to get over humps in the landscape.

Thus, Baker’s group tried a new strategy. Taking the parts of one structure that they knew (from the CASP organizers) had a correct structure, they aligned the sequence with those parts and then took a hammer to the rest, pushing loops and structural elements out of alignment. This encouraged the players to be more daring in their remodeling of regions where the predictions had been poor, while preserving the good features of the structure. Again, the Void Crushers won special mention, producing the best-scoring structure of target TR624 in the whole competition.

Man over machine?

Does this prove that gamers know more about folding proteins than computers do? Some of them might, but Foldit doesn’t really use human expertise. Rather, the game uses human intelligence to identify when the ROSETTA program has gone down the wrong path and figure out how to push it over the hump. When the human intelligences aren’t daring enough, or trust the system too much, as in the case of the CASP results, Foldit doesn’t do any better than completely automated structural methods. When the human players are encouraged to challenge the computational results, however, the results can be striking. As Baker’s group are clearly aware, further development of the program needs to be oriented towards encouraging players to go further afield from the initial ROSETTA predictions. This will likely mean many more failed attempts by players, but also more significant successes like these.

Disclaimer: I am currently collaborating with David Baker’s group on a research project involving ROSETTA (but not Foldit).

1) Khatib, F., DiMaio, F., Cooper, S., Kazmierczyk, M., Gilski, M., Krzywda, S., Zabranska, H., Pichova, I., Thompson, J., Popović, Z., Jaskolski, M., & Baker, D. (2011). Crystal structure of a monomeric retroviral protease solved by protein folding game players. Nature Structural & Molecular Biology DOI: 10.1038/nsmb.2119

Sep 06 2011
 

Over the last two decades, multiple kinds of NMR experiments have repeatedly shown that protein structures are quite variable, frequently shifting to minor conformations. The most striking evidence in this line has come from hydrogen-exchange experiments, which have demonstrated that virtually all proteins undergo excursions to partially-folded states at equilibrium. As R2 relaxation-dispersion experiments have become more widely used, excursions to alternative folded states have repeatedly been detected. The challenge now is to find ways to characterize these low-population states. Advanced crystallographic techniques have proven useful in determining some of these alternative structures. However, proteins are not always amenable to crystallography, and the minor state in the crystal may not correspond exactly to the minor state in solution. Therefore there is an ongoing effort to define these states by NMR. Lewis Kay’s group in Toronto is in the forefront of this effort, and recently reported the solution structure of a minor state of a T4 lysozyme mutant (1).

Lysozyme is an extremely common enzyme because it has the useful property of degrading the peptidoglycan that makes up bacterial cell walls. This makes it a natural antibiotic against gram-positive bacteria, and as a result it is found in many secretions and fluids, including saliva and egg whites. Because it is plentiful it has been widely studied, with many mutants made and characterized for their activity and stability. Lysozyme also crystallizes easily — doing this was actually part of my biochemistry lab class back in college. So, many structures of the enzyme and its mutants are available.

T4 lysozyme L99A with benzene bound

One lysozyme mutant that has interesting properties is the L99A mutant of the lysozyme from the T4 bacteriophage. This mutation creates a cavity in the upper part of the protein that is known to bind hydrophobic ligands such as benzene (right, benzene in purple, PDB code 3DMX). However, crystal structures show this binding pocket to be completely buried, even when empty. This poses the question of how the ligand gets in. Although the structure of L99A is very similar to WT, the Kay lab noticed that the NMR spectra of the mutant contained broadened peaks, indicating the presence of an exchange process between two conformations. Therefore, the Kay lab used R2 relaxation-dispersion to show that the protein sampled a minor state that accounted for 3% of the total protein, with a lifetime of about 1 ms (2). This conformation was presumed to be the binding-competent form of the protein. However, without a structure of this state, they could not confirm that the pocket was accessible. This led to their present attempts to characterize this low-population state using NMR.

As I have mentioned before, R2 relaxation-dispersion experiments can provide three important pieces of information: the populations of the two conformational states (pG, pE for ‘Ground’ and ‘Excited’), the rate of exchange between them (kEX = kGE + kEG), and the difference in chemical shift between the two states at each nucleus (|Δω|). Because the chemical shift is determined by the protein conformation, and because additional experiments can determine the sign of Δω, it should be possible to figure out the structure of the alternate state, given enough relaxation-dispersion data. Therefore, the Kay lab performed a large number of experiments to determine Δω for nearly all of the backbone 15N, 13C, and 1H atoms, as well as many side-chain methyl groups. They then fed this data to the CS-ROSETTA protocol, which can determine a protein structure using chemical shifts alone. While holding the majority of the protein in a single conformation, they allowed CS-ROSETTA to remodel the part of the mutant where they had detected conformational fluctuations.
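As a quick worked example of how the populations and rates relate, the sketch below converts a minor-state population and lifetime into kGE, kEG, and kEX for a two-state system at equilibrium. The inputs are the rounded L99A figures quoted earlier (3%, about 1 ms), so the outputs are only approximate, and the function name is my own.

```python
def two_state_rates(p_excited, excited_lifetime):
    """Return (k_GE, k_EG, k_EX) in s^-1 for a two-state equilibrium.

    p_excited        -- fractional population of the minor (E) state
    excited_lifetime -- mean lifetime of the E state in seconds
    """
    k_eg = 1.0 / excited_lifetime           # E -> G: inverse of the E lifetime
    p_ground = 1.0 - p_excited
    k_ge = k_eg * p_excited / p_ground      # detailed balance: pG*kGE = pE*kEG
    k_ex = k_ge + k_eg                      # kEX = kGE + kEG
    return k_ge, k_eg, k_ex


k_ge, k_eg, k_ex = two_state_rates(p_excited=0.03, excited_lifetime=1e-3)
print(f"kGE = {k_ge:.1f} /s, kEG = {k_eg:.1f} /s, kEX = {k_ex:.1f} /s")
```

A sparsely populated state that lasts a millisecond is visited only a few tens of times per second by any given molecule, which is part of why it is so hard to study directly.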

Lysozyme minor state/major state overlay
Major state (green) and 5 lowest-energy conformers of the minor state (Excited) ensemble (blue)

Using this method, they were able to produce a structure of the transiently-populated minor state of the mutant protein, which I show to the left in comparison to the major conformation (PDB codes 2LCB and 3DMV, respectively, aligned using residues 10-100, 150-160). The most dramatic change is that two of the helices have been fused into one. As you can see, the new helix clashes with the usual position of phenylalanine 114 (pale green, because of the overlap it’s hard to see), which has in turn shifted so that it occupies part of the cavity where benzene binds (pale blue). This suggests, contra the Kay group’s earlier work, that the minor state is also incapable of binding to benzene.

This is a difficult prediction to test in the L99A system because the minor state (E) lives for such a short time that it’s difficult to tell whether anything binds to it or not. Therefore, Bouvignies et al. made a double-mutant protein with the L99A mutation and an additional G113A mutation that was predicted to stabilize the long helix observed in the minor form. This turned out to be the case: the E structure was enriched in the double mutant. In addition, the interconversion rate was slow enough that at low temperature distinct peaks could be observed for each conformation, as well as cross-peaks indicating exchange between them (I discussed this kind of experiment in my previous posts about cyclophilin). Under these conditions, the minor form is sufficiently populous and long-lived to determine whether ligands bind to it.

The Kay group did this by adding an equimolar amount of benzene to the reaction and observing whether there were exchange peaks. If you examine their figure 3c, it’s clear that exchange occurs between all three possible states: (G)round, (E)xcited, and (B)ound. This might seem to contradict their hypothesis. However, the E→B exchange peaks have very low intensity and take significantly longer to reach a maximum than the other exchange peaks. Therefore, this exchange peak may represent a low-frequency E→G→B event rather than direct exchange between the E and B states. Fits of the exchange curves seem to substantiate this interpretation, as the fit tended towards a value of zero for kEB and the χ² jumped up significantly when kEB was fixed to a very low number.
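To see why an E→B cross-peak can appear, late and weak, even when kEB is exactly zero, here is a crude three-state simulation. All of the rate constants are hypothetical round numbers chosen for illustration, relaxation is ignored, and the integration is simple Euler stepping, so this is a cartoon of the argument rather than a reproduction of the paper's fit.

```python
def evolve_from_excited(k, t_end, dt=1e-5):
    """Populations (G, E, B) at time t_end, starting with everything in E."""
    g, e, b = 0.0, 1.0, 0.0
    for _ in range(int(round(t_end / dt))):
        dg = (k["EG"] * e + k["BG"] * b - (k["GE"] + k["GB"]) * g) * dt
        de = (k["GE"] * g + k["BE"] * b - (k["EG"] + k["EB"]) * e) * dt
        db = (k["GB"] * g + k["EB"] * e - (k["BG"] + k["BE"]) * b) * dt
        g, e, b = g + dg, e + de, b + db
    return g, e, b


# Hypothetical rates in s^-1; direct E<->B exchange is switched off.
rates = {"GE": 30.0, "EG": 36.0, "GB": 100.0, "BG": 100.0, "EB": 0.0, "BE": 0.0}

for t in (0.002, 0.005, 0.02):
    g, e, b = evolve_from_excited(rates, t)
    print(f"t={t*1000:.0f} ms  E->G={g:.4f}  E->B={b:.4f}")
```

Magnetization that starts on E reaches G immediately but reaches B only after passing through G, so the E→B signal lags behind and stays small at short delays.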

My only concern with this result is that the kEG rate changes from ~31 to ~36 s⁻¹ when benzene is added (kGE remains the same). It’s possible that the presence of benzene really does accelerate this process, or that the errors are underestimated. The model might also be janky in some hidden way, but my back-of-the-envelope check of the parameters suggests that the results are consistent with what is known about benzene binding to the L99A mutant, e.g. various ways of calculating the KD from these data produce a value of approximately 1 mM, matching earlier results.

If the E state does not represent a binding-competent state, that means the protein must be exchanging to yet another, still-undetected state. According to Bouvignies et al., the E structure they determined can account for all of the observed chemical exchange. If the alternative state that is capable of admitting benzene to the hydrophobic pocket cannot be detected by relaxation-dispersion experiments, it must constitute a very small fraction of the overall protein population (< 1%) and undergo very fast exchange. In principle, the existence of such a process can be detected using experiments designed to measure the intrinsic R2 of a residue, and also should be detectable using 1H experiments directed towards the methyl groups (the side chains likely represent the best bet for explaining the phenomenon). It does not appear that those experiments have been done yet, but I’m certain they’re underway.

Bouvignies et al. made a third construct incorporating the R119P mutation to stabilize the E state even further. This succeeded, producing a protein that spent most of its time in the E state and occasionally sampled the G state. The paper contains no data as to whether benzene detectably binds this mutant, although that strikes me as an obvious experiment to try. Presumably the obligate route through a high-energy intermediate would slow the kinetics of binding relative to the single mutant. If the penalty for adopting the G fold in this mutant is high enough, it might also significantly reduce the affinity.

The findings in this paper are not of any immediate practical use. The L99A mutant is a biophysical curiosity, not a disease target, and most of these techniques have been presented before, at least individually. However, this does serve as a very nice example of the advanced NMR methods that allow the determination of minor states, and of the surprising findings that can be derived from them. This paper should serve as a model approach to this sort of question, which may find broad applicability in the study of signaling, ligand binding, and protein evolution.


Disclaimer: I am currently collaborating with David Baker’s lab on a research project using ROSETTA.

1) Bouvignies G, Vallurupalli P, Hansen D, Correia B, Lange O, Bah A, Vernon R, Dahlquist FW, Baker D, & Kay LE (2011). Solution structure of a minor and transiently formed state of a T4 lysozyme mutant. Nature, 477 (7362), 111-114 DOI: 10.1038/nature10349

2) Mulder FA, Mittermaier A, Hon B, Dahlquist FW, & Kay LE (2001). Studying excited states of proteins by NMR spectroscopy. Nature structural biology, 8 (11), 932-5 PMID: 11685237

Jun 21, 2011
 
As I have mentioned before on this blog, the use of tools like CS-ROSETTA holds the promise of determining protein structures using only the chemical shifts of their backbone atoms. In addition to potentially making NOEs and RDCs redundant, this technology allows biologists to determine the conformations of minor members of the structural ensemble, which are very difficult to obtain using conventional approaches in population-dominated techniques like NMR and X-ray crystallography. There are two limitations here, however. First, we only gain insight into the backbone, and as we know, the positions of side chains in minor states can be critical for function. In addition, backbone chemical shifts are not always available due to relaxation problems. Both weaknesses could, in principle, be addressed by extracting conformational information from the chemical shifts of methyl groups, which report on side-chain behavior and continue to give good signal even in very large proteins. This is the rationale behind a series of recent papers from the Kay lab [1-3] intended to determine changes in side-chain rotameric state from methyl relaxation-dispersion data.

The roots of this idea have been around for a while, dating back at least to a 1996 paper in J. Biomol. NMR [4]. I’ve reproduced one of MacKenzie et al.’s figures at right, and as you can see, for this protein (a peptide of glycophorin A), the correlation between the Cδ chemical shift and the three-bond coupling 3JCδCα is quite striking. However, the quality of the correlation appeared to be protein-dependent, as the R² for this relationship was significantly lower for staphylococcal nuclease side-chains, possibly because they were positioned in a less homogeneous chemical environment than a lipid bilayer.

A more systematic study was recently performed by Bob London and co-workers from the National Institute of Environmental Health Sciences [5]. They extensively compared side-chain rotameric angles extracted from the PDB to side-chain chemical shift data from the Biological Magnetic Resonance Data Bank to see what correlations emerged. They expected to see that the chemical shifts of the carbons depended on the side-chain dihedral angles due to the “γ-substituent effect”, which is believed to alter chemical shifts due to bond polarization caused by steric interactions. Although there are some complications due to other effects, this prediction turned out to be true, broadly speaking.

The left Thr has χ1=-60° while the right one has χ1=60°. The rotation around the Cα-Cβ bond from N to Oγ defines the dihedral angle.

London et al. found clear correlations between chemical shift and rotameric state for threonine, for instance, which has a true chiral center at Cβ. For χ1 of ± 60° (these angles are also referred to as gauche±), the chemical shift of the methyl carbon was around 22 ppm, while for χ1 of 180° (also called trans) the chemical shifts clustered loosely around 19 ppm. More broadly, London et al. observed that sterically crowded rotamers tended to move aliphatic carbon chemical shifts upfield. Structurally, the difference between these dihedral angles is that in the ±60° positions, Cγ2 has steric interactions with only one heavy atom (i.e. the amide N or carbonyl C), while in the 180° position it interacts with two.

As one might expect given the results of MacKenzie et al., London et al. also found a straightforward relationship in the case of the leucine δ carbons, where the population of rotamers could be determined rather simply using the difference between the δ1 and δ2 chemical shifts. While this only specifically gives the population of the trans rotamer (where Cδ1 is on the opposite side of the Cβ-Cγ bond from Cα), it turns out that, due to unfavorable sterics, population of the gauche- conformation is vanishingly small in the PDB, so that one can assert with some confidence that everything not in trans is in gauche+. Also, London et al. noted that the χ1 and χ2 angles were highly correlated for leucines, so that in principle the entire side-chain conformation could be defined using just the difference in Cδ chemical shifts.
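As a minimal sketch of how such a calculation might look, the function below assumes a linear relation of the form δ(Cδ1) - δ(Cδ2) = intercept + slope × p_trans. The default intercept (-5 ppm) and slope (10 ppm) are assumed values for illustration, not numbers quoted in this post, and should be checked against refs [1] and [5] before use. The function name and example shifts are hypothetical.

```python
def leu_trans_population(delta_cd1_ppm, delta_cd2_ppm,
                         intercept_ppm=-5.0, slope_ppm=10.0):
    """Estimate the trans rotamer population of a leucine side chain.

    Inverts an assumed linear relation between the Cd1 - Cd2 shift
    difference and p_trans; the coefficients are placeholders.
    """
    diff = delta_cd1_ppm - delta_cd2_ppm
    p_trans = (diff - intercept_ppm) / slope_ppm
    # Clamp to the physically meaningful range [0, 1]
    return min(1.0, max(0.0, p_trans))

# Hypothetical shifts: Cd1 at 25.0 ppm, Cd2 at 23.5 ppm
p_trans = leu_trans_population(25.0, 23.5)
# With gauche- taken as negligible, the remainder is assigned to gauche+
p_gauche_plus = 1.0 - p_trans
print(f"p_trans = {p_trans:.2f}, p_gauche+ = {p_gauche_plus:.2f}")
```

Keeping the coefficients as parameters makes it easy to swap in the published values once they have been confirmed.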

Hansen et al. [1] decided to use the chemical shift-rotamer relationship to analyze the minor conformations of leucines in mutants of the Fyn SH3 domain. The G48M mutant is in a rapid equilibrium between folded and unfolded forms, while the A39V/N53P/V55L triple mutant appears to primarily exchange to an intermediate state. Using a combination of CPMG-based relaxation-dispersion experiments and HSQC/HMQC, the Kay lab were able to determine the chemical shifts of the leucine methyls in the alternate state for each mutant, and thus derive populations for the trans rotamer. In the unfolded state, one expects to see ~60-70% population of the trans rotamer. The folded state of Fyn SH3 has several leucines that lie outside this range, but in the minor form of G48M nearly all of them lie within it, consistent with the existing finding that this state is unfolded. In the case of the triple mutant, some leucines move into the unfolded range in the minor state, while others remain outside of it. This is consistent with the assignment of the minor state as a partially-folded intermediate.

In a subsequent paper, Hansen et al. derived a relatively simple method for estimating the population of the gauche- rotamer state for the isoleucine δ carbon and applied it to the same system [2]. The situation for the Ile Cδ is somewhat more complicated than that of leucine. Because it is an isolated methyl group, and the rest of the side chain has a complicated topology, as many as four unique rotamer positions are distinctly populated in the PDB. However, in solution only the trans and gauche- configurations are expected to be significantly populated.

The Fyn SH3 domain has two Ile residues, which by this technique appear to be populated primarily in the gauche- rotamer (I28) and the trans rotamer (I50) respectively. In the intermediate state (results from the unfolded state are not reported) both isoleucines populate the gauche- rotamer to about 20%. The authors interpret this as a non-native interaction in the case of I28 and a slight increase in dynamics in the case of I50. However, it seems that these values could also support a case that both side chains are totally (or almost totally) solvent-exposed in the intermediate state, and thus adopting random-coil configurations.

One might also take issue with the idea that an increase from 0 to 20% of an alternate rotamer population represents a “slight” increase in dynamics. It’s difficult to make any firm statement in this regard because we don’t actually know the rotamer distribution in the folded state: Cδ1 may be entirely in trans, or averaged somehow between all of the non-gauche states. The authors take the folded state to be essentially pure trans, from which one would plausibly expect to observe an order parameter of 0.8 or higher for the methyl group (according to the rough calculations in [6], see reproduced figure on left). Based on the population, the order parameter would decrease to around 0.5 in the intermediate, a fairly large change.

However, this does not undermine the conclusion that the core is relatively well-formed in the intermediate. One perplexing feature of methyl side-chain order parameters is that they correlate poorly with nearly every structural feature one might expect to explain them [7]. Solvent-accessible surface area, packing density, and depth of burial are all rather poor predictors of side-chain dynamics. By contrast, more rudimentary measures, such as methyl distance from the backbone, are relatively robust predictors of dynamics, even though they ignore the higher-order structure of the protein. The upshot of this is that any data obtained about the dynamics of side-chains in minor states will need to be interpreted conservatively.

In the most recent offshoot of this research, Hansen and Kay published a paper correlating the chemical shift of valine Cγ methyls with the rotameric state [3]. Unfortunately, this is not a case where there’s a simple calculation that can accurately spit out the χ1 angle, and because of the β-branched structure of the amino acid, it’s not possible to rule out one of the possible angles a priori. As their Figure 2 shows, the relationship between the chemical shifts and the rotamer is complicated and may also vary with the local secondary structure. Instead of a simple formula, they were able to derive a “surface” reflecting probabilities of particular rotamer arrangements based on the shifts, which can then be analyzed using a program they wrote. They subsequently validated this approach on a very large protein complex, the half-proteasome, by comparing the chemical shift-derived rotameric states to those observed in crystallographic data.

I tested their chemical-shift-based predictions against some of my own data (a web-based version of the program is available at Flemming Hansen’s website) and wasn’t exactly blown away by the results. Of 10 methyls in my protein, the primary rotamer was completely wrong (as determined by experimental 3J measurements) in two cases, and the population of the primary rotamer was dramatically overestimated in another two. However, the two side-chains with incorrect rotamer determinations were both adjacent to tryptophan side-chains, and in those cases the ring currents may have altered the chemical shift enough to interfere with the calculation. Because aromatic rings are likely to be present in the core and may enhance the possibility of observing chemical exchange, this may bear further investigation. Nonetheless, the primary rotamer was usually correctly chosen, and so these calculations can serve as at least a starting point for structural analysis.

These introductory studies are fairly encouraging, and suggest that it should be possible to use CPMG experiments to assess structural features of minor states beyond just the backbone conformation, even in very large systems. This may be especially helpful in analyzing the dynamics of proteins with hydrophobic active or regulatory sites. As hydrophobic surfaces are often involved in protein-protein interactions, an improved understanding of these critical binding events may result.


Disclosure: I have co-authored a paper with Bob London’s group, as well as several (obviously) with Andrew Lee’s.

1. Hansen, D., Neudecker, P., Vallurupalli, P., Mulder, F.A.A., & Kay, L. (2010). “Determination of Leu Side-Chain Conformations in Excited Protein States by NMR Relaxation Dispersion.” Journal of the American Chemical Society, 132 (1), 42-43 DOI: 10.1021/ja909294n

2. Hansen, D.F., Neudecker, P., & Kay, L.E. (2010). “Determination of Isoleucine Side-Chain Conformations in Ground and Excited States of Proteins from Chemical Shifts.” Journal of the American Chemical Society, 132 (22), 7589-7591 DOI: 10.1021/ja102090z

3. Hansen, D.F., & Kay, L.E. (2011). “Determining Valine Side-Chain Rotamer Conformations in Proteins from Methyl 13C Chemical Shifts: Application to the 360 kDa Half-Proteasome.” Journal of the American Chemical Society, 133 (21), 8272-8281 DOI: 10.1021/ja2014532

4. MacKenzie KR, Prestegard JH, & Engelman DM (1996). “Leucine side-chain rotamers in a glycophorin A transmembrane peptide as revealed by three-bond carbon-carbon couplings and 13C chemical shifts.” Journal of Biomolecular NMR, 7 (3), 256-60 PMID: 8785502

5. London, R., Wingad, B., & Mueller, G. (2008). “Dependence of Amino Acid Side Chain 13C Shifts on Dihedral Angle: Application to Conformational Analysis.” Journal of the American Chemical Society, 130 (33), 11097-11105 DOI: 10.1021/ja802729t

6. Hu, H., Hermans, J., & Lee, A. (2005). “Relating side-chain mobility in proteins to rotameric transitions: Insights from molecular dynamics simulations and NMR” Journal of Biomolecular NMR, 32 (2), 151-162 DOI: 10.1007/s10858-005-5366-0

7. Igumenova, T., Frederick, K., & Wand, A. (2006). “Characterization of the Fast Dynamics of Protein Amino Acid Side Chains Using NMR Relaxation in Solution.” Chemical Reviews, 106 (5), 1672-1699 DOI: 10.1021/cr040422h