
However, a series of publications over the past year or so points to steady progress toward determining structures from chemical shifts alone, and a paper from the labs of Ad Bax and David Baker, now in PNAS preprints, shows some of the best progress yet (1). The approach closely resembles the CHESHIRE method described last year by Michele Vendruscolo (2), in that both are based on fragment replacement. The Vendruscolo group’s paper explicitly compares CHESHIRE to David Baker’s ROSETTA program, so it seems only natural for Shen et al. to incorporate refinements based on chemical shift directly into ROSETTA to create CS-ROSETTA.

The standard ROSETTA approach is to break the protein up into small overlapping fragments of several residues each. A library of structures (the PDB) is then searched with these fragments to obtain a set of about 200 potential conformations based on sequence similarity. ROSETTA then attempts to assemble low-energy (stable) structures out of these potential fragment conformations.

CS-ROSETTA uses chemical shift data at two distinct steps. First, chemical shift data are used to select the most appropriate potential conformations from the library, theoretically improving the “building materials” for ROSETTA. In later stages, the consistency between the ROSETTA-predicted structures and the known chemical shifts is used to re-score their energy. That this can significantly improve the ROSETTA output can be seen from the part of Shen et al.’s Figure 2 that I have shamelessly stolen for your benefit.

Shen et al. optimized CS-ROSETTA against 16 known structures. I checked their results against the CHESHIRE results. Five proteins were predicted in both papers, and CS-ROSETTA did a better job in terms of backbone atom RMSD for four of them. On average, CS-ROSETTA produced a 24% reduction in this RMSD relative to CHESHIRE. Shen et al. also tested CS-ROSETTA blindly against nine proteins whose structures had recently been solved by the Northeast Structural Genomics Consortium, with favorable results.

This isn’t the end of the road by a long shot. Backbone RMSDs for these predictions are generally <2 Å, which is easily good enough for picking out the general characteristics of a fold. Identifying subtle features, however, will probably require higher precision and thus more rigorous refinement. Even so, having these predicted conformations in hand may significantly accelerate the assignment and refinement of structures using NOE data. Combining fragment-replacement approaches based on RDC data and chemical shifts may also produce significant improvements.
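For readers who think better in code, the first use of chemical shifts described above (picking fragment conformations from the library) can be sketched roughly as follows. This is only a toy illustration under my own assumptions, not the scoring used by Shen et al.: the `Candidate` class, the `select_fragments` function, the single shift per residue, the simple "mismatch plus weighted shift deviation" score, and all of the numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Candidate:
    """Hypothetical library fragment: a sequence plus one shift (ppm) per residue."""
    source: str
    sequence: str
    shifts: Sequence[float]


def overlapping_windows(sequence: str, size: int) -> List[Tuple[int, str]]:
    """Split a sequence into every overlapping window of `size` residues."""
    return [(i, sequence[i:i + size]) for i in range(len(sequence) - size + 1)]


def sequence_mismatch(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def shift_rmsd(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square deviation between two equal-length shift lists."""
    total = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return (total / len(observed)) ** 0.5


def select_fragments(query_seq, query_shifts, library, keep=200, shift_weight=1.0):
    """Rank candidates by sequence mismatch plus weighted shift deviation (assumed score)."""
    def score(c: Candidate) -> float:
        return sequence_mismatch(query_seq, c.sequence) + shift_weight * shift_rmsd(query_shifts, c.shifts)
    return sorted(library, key=score)[:keep]


# Invented toy data, for illustration only.
windows = overlapping_windows("AVKLG", 3)
library = [
    Candidate("a", "AVK", [52.4, 62.1, 56.6]),  # same sequence, similar shifts
    Candidate("b", "AVK", [55.0, 65.0, 59.0]),  # same sequence, very different shifts
    Candidate("c", "GVK", [52.5, 62.0, 56.5]),  # one mismatch, identical shifts
]
best = select_fragments("AVK", [52.5, 62.0, 56.5], library, keep=2)
print([c.source for c in best])
```

The point of the toy data is that candidate "b" matches the query sequence perfectly but is demoted because its shifts disagree with the measured ones, which is the sense in which shift data might improve the "building materials".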
There were other limitations. Shen et al. were not able to converge structures for every protein attempted. CS-ROSETTA is presently limited to proteins smaller than many routinely solved by NMR, and proteins with unusual or complicated topologies may not be solvable using this approach. And, of course, the presence of cofactors that substantially alter local chemical shifts will complicate analyses of this kind, if not render them impossible. Obviously, a great deal of work remains to be done before computational approaches will be capable of tackling the large, highly degenerate systems where they would have the most power to resolve problems. However, the excellent results of CHESHIRE and CS-ROSETTA suggest that our ability to derive structures from limited NMR data will improve dramatically in the next few years.

1. Shen, Y., Lange, O., Delaglio, F., Rossi, P., Aramini, J.M., Liu, G., Eletsky, A., Wu, Y., Singarapu, K.K., Lemak, A., Ignatchenko, A., Arrowsmith, C.H., Szyperski, T., Montelione, G.T., Baker, D., Bax, A. (2008). Consistent blind protein structure generation from NMR chemical shift data. Proceedings of the National Academy of Sciences, 105(12), 4685-4690. DOI: 10.1073/pnas.0800256105
2. Cavalli, A., Salvatella, X., Dobson, C.M., Vendruscolo, M. (2007). Protein structure determination from NMR chemical shifts. Proceedings of the National Academy of Sciences, 104(23), 9615-9620. DOI: 10.1073/pnas.0610313104

POSTSCRIPT: You can read another take on this paper at Plausible Accuracy.

Figure (part of Figure 2 from Shen et al.): These are predictions for calbindin (B) and HPr (C), with the ROSETTA predictions on top and the rescored energies on the bottom. As you can see, the calbindin structures do not have a well-defined energy minimum in the ROSETTA prediction, and the HPr structures have three minima, not all of which are close to the actual structure as measured by Cα root mean square deviation (RMSD). Rescoring, however, produces funnel-shaped distributions of energy with respect to RMSD, such that low energies reliably indicate structures close to reality.
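The rescoring idea can be sketched in a few lines as well. Again, this is a toy under stated assumptions rather than the energy function from the paper: the `Model` class, the `rescore` function, the mean-squared-deviation penalty, the weight of 2.0, and every number below are invented for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Model:
    """Hypothetical predicted structure, reduced to the numbers needed here."""
    name: str
    energy: float                      # raw predictor energy (arbitrary units)
    predicted_shifts: Sequence[float]  # shifts back-calculated from the model (ppm)
    ca_rmsd: float                     # C-alpha RMSD to the known structure (angstroms)


def shift_penalty(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean squared deviation between observed and back-calculated shifts."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def rescore(model: Model, observed: Sequence[float], weight: float = 1.0) -> float:
    """Raw energy plus a weighted chemical shift penalty (assumed functional form)."""
    return model.energy + weight * shift_penalty(observed, model.predicted_shifts)


# Invented toy data: one decoy has the lowest raw energy but poor shift agreement.
observed = [52.5, 62.0, 56.5]
models = [
    Model("near_native", -100.0, [52.6, 61.9, 56.4], 1.2),
    Model("decoy_a", -104.0, [55.5, 59.0, 53.5], 9.5),
    Model("decoy_b", -98.0, [54.0, 63.5, 58.0], 6.0),
]

best_raw = min(models, key=lambda m: m.energy)
best_rescored = min(models, key=lambda m: rescore(m, observed, weight=2.0))
print(best_raw.name, best_rescored.name)
```

In this made-up set, the raw energy favors a decoy far from the known structure, while the rescored energy favors the model with the lowest Cα RMSD. That is the behavior a funnel-shaped energy versus RMSD plot is meant to convey.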
3 Responses to “Protein Structure from Chemical Shifts Alone”
When you say "First, chemical shift data are used to select the most appropriate potential conformations from the library", do you mean that low-energy conformations are selected based on their characteristic chemical shifts? That would seem to me to be a better technique than using a potential energy function with all its pitfalls to decide which conformations are "low-energy". Nice blog by the way.
It's obvious you're not a Chemist or a Physicist. Stick to your field, 'Bio basics'.
[...] 15N, 13C, and 1H atoms, as well as many side-chain methyl groups. They then fed this data to the CS-ROSETTA protocol, which can determine a protein structure using chemical shifts alone. While holding the majority of the protein in a single conformation, they allowed CS-ROSETTA to [...]