Present and Future Prospects of Protein Structure Prediction

Sung-Joon Park, George Chikenji, Takatsugu Hirokawa, Kentaro Tomii, Shoji Takada

Department of Chemistry, Faculty of Science, Kobe University
park@proteinsilico.org, http://www.proteinsilico.org/
chikenji@theory.chem.sci.kobe-u.ac.jp, http://theory.chem.sci.kobe-u.ac.jp/~chikenji/

Computational Biology Research Center, Advanced Industrial Science and Technology
t-hirokawa@aist.go.jp, http://www.cbrc.jp/
k-tomii@aist.go.jp, http://www.cbrc.jp/

Department of Chemistry, Faculty of Science, Kobe University
stakada@kobe-u.ac.jp, http://theory.chem.sci.kobe-u.ac.jp/

keywords: protein structure prediction, comparative modeling, fold recognition, ab initio, CASP

1. 40 X 1994 2 CASP (Critical Assessment of Techniques for Protein Structure Prediction) 2004 CASP6

2. 2 1 Cα H NH2 COOH Cα 1 NH2 COOH N Cα β 2 2 20 200 20 200 [Liu 04] 30%
20 4 0512 2005

[ 05b] 2 3 100 3 200 ( 10 95 ) 2 4 1 1990 [Tramontano 03] [Kolodny 02] [Simons 99] CASP 2 CASP 1994 CASP 2 CASP CASP 5 9

Table 1 CASP
CASP     Predictor   Target   Prediction
CASP1        35        33         100
CASP2       152        42         947
CASP3       120        43        3807
CASP4       198        43       11136
CASP5       259        67       28728
CASP6       266        87       41283

48 1 5 12 Proteins: Structure, Function, and Bioinformatics CASP 2004 CASP6 200 87 1

3. 3 1 1 PSI-BLAST[Altschul 97] Comparative Modeling (CM) CM (1) (2) (3)
(4) 3) CM 2 [Chothia 92] Fold Recognition (FR) FR FR [Fiser 00] [Kosinski 03] 3 PSI-BLAST PSI-BLAST IMPALA[Schaffer 99] [Wang 04] 3 2 1 ab initio ab initio ab initio ab initio de novo CM free modeling ab initio
ab initio 1960 1990 1998 CASP3 ab initio [Simons 99] 2 CASP3 Baker Fragment Assembly (FA) [Simons 99] FA FA ( 9 ) ( 20 ) Baker CASP3 4 5 [Simons 99, Bonneau 01, Bradley 03] FA ab initio [Takada 01, Chikenji 03, Fujitsuka 04, 05a] Skolnick Kolinski [Zhang 03] Scheraga [Liwo 99]

4. CASP6 4 1 CASP6 1 CASP6 64 90 2 CM CM FR/H FR/A NF ab initio 2 BLAST CASP6 PSI-BLAST

Table 2 CASP6
CM              FR              NF
easy    hard    H       A
25      18      22      15      10

CM/easy: Comparative Modeling (easy)
CM/hard: Comparative Modeling (hard)
FR/H: Fold Recognition (Homologous)
FR/A: Fold Recognition (Analogous)
NF: New Fold
E-value < 0.01, 5 iterations

FR/H FR/A 2 CASP GDT_TS (Global Distance Test Total Score) [Zemla 03]

$$\mathrm{GDT\_TS} = \frac{\mathrm{GDT\_P}_1 + \mathrm{GDT\_P}_2 + \mathrm{GDT\_P}_4 + \mathrm{GDT\_P}_8}{4} \tag{1}$$

where GDT_P_x is the percentage of Cα atoms in the model that lie within x Å of their positions in the experimental structure.

1 Model 1 1 GDT_TS 10 GDT_TS 1 Root Mean Squared Distance (RMSD) 2Y NF CASP usual suspects [Cozzetto 05] 10 CBRC-3D Chimera Rokko Rokky http://www.proteinsilico.org/rokky/ 1 5
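The GDT_TS score of Eq. (1) can be illustrated with a minimal Python sketch. It assumes the per-residue Cα deviations between model and experimental structure have already been computed under one fixed superposition; the function name `gdt_ts` and its arguments are hypothetical and are not taken from LGA [Zemla 03]. The actual GDT procedure searches over many superpositions to maximize the residue count at each cutoff, so a single-superposition value such as this one can only underestimate the true score.

```python
from typing import Sequence


def gdt_ts(distances: Sequence[float],
           cutoffs: Sequence[float] = (1.0, 2.0, 4.0, 8.0)) -> float:
    """Return GDT_TS (0-100) from per-residue C-alpha deviations in angstroms.

    Assumption: `distances` come from a single, fixed superposition of the
    model onto the experimental structure (no search over superpositions).
    """
    if not distances:
        raise ValueError("at least one residue distance is required")
    n = len(distances)
    # GDT_P_x: percentage of residues whose deviation is within x angstroms.
    percentages = [100.0 * sum(d <= c for d in distances) / n for c in cutoffs]
    # Eq. (1): average of the four percentages.
    return sum(percentages) / len(cutoffs)


if __name__ == "__main__":
    # Four residues: one within 1 A, two within 2 A, three within 4 A and 8 A.
    print(gdt_ts([0.5, 1.5, 3.0, 9.0]))
```

Because each cutoff only counts residues, a single badly placed residue lowers the score by a bounded amount, which is one reason the measure is used alongside RMSD.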
1 Model 1 GDT_TS 10 GDT_TS 1 RMSD 2 T0212 2 21 CASP6 4 2 CM FR/H CASP5 Pcons[Lundstrom 01] 3D-SHOTGUN[Fischer 03] 3D-Jury[Ginalski 03] CASP6 Ginalski 3D-Jury Meta-BASIC[Ginalski 04] FR/H 2 GeneSilico 3 CBRC-3D FORTE[Tomii 04] T0212 2 CBRC-3D 4 3 FR/A NF CASP6 ab initio
Rokko FA FA 3 T0281 FR/A CASP6 ab initio Baker T0281 T0281 PDB 1WHZ 3c 1.52Å X 70 α/β Baker FA T0281 RMSD 2.2Å RMSD 1.59Å 4 T0201 NF 3 CASP6 FR/A NF 1 T0215 FR/A T0215 PDB 2 1X9B 3a 53 All α T0215 Scheraga Samudrala-AB PDB SA (Simulated Annealing) energy-based prediction RMSD 5.0Å FA Baker Rokko T0215 1X9B 2 T0198 FR/A T0198 PDB 1SUM 3b 225 All α T0198 FA Baker T0201 PDB 1S12 3d 94 α/β Kolinski&Bujnicki FRankenstein's monster [Kosinski 03] ab initio CABS [Zhang 03] 0.61Å RMSD 3.5Å β FA β 5 ab initio 100 ab initio All α FA ab initio ab initio All β Baker CASP6 Broken-chain FA 4 β FA T0212 RMSD 6.2Å

2 Protein Data Bank http://www.rcsb.org/pdb/
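The RMSD values quoted for the individual targets can be related to coordinates with a minimal Python sketch. It assumes the model has already been superimposed on the experimental structure and that both are given as equal-length lists of Cα coordinates in Å; the function name `ca_rmsd` and the `Point` alias are hypothetical. The optimal superposition step used in practice is deliberately left out.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def ca_rmsd(model: Sequence[Point], native: Sequence[Point]) -> float:
    """Root mean squared distance between matched C-alpha atoms, in angstroms.

    Assumption: the two coordinate sets are already superimposed and are
    listed in the same residue order.
    """
    if len(model) != len(native) or not model:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared_sum = sum(
        (mx - nx) ** 2 + (my - ny) ** 2 + (mz - nz) ** 2
        for (mx, my, mz), (nx, ny, nz) in zip(model, native)
    )
    return math.sqrt(squared_sum / len(model))


if __name__ == "__main__":
    model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    native = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)]
    print(ca_rmsd(model, native))
```

Since the deviations are squared before averaging, a few poorly predicted residues can dominate the value, unlike the cutoff-based counts in GDT_TS.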
[Chikenji 04] 4 Broken-chain 6. 5. 5 1 60% 1Å [Baker 01] X NMR 30% 60% 1Å 2Å 30% NIH (National Institutes of Health) 2005 X High-accuracy Protein Structure Modeling 3 75 5 2 ab initio ab initio X ab initio ab initio CASP FR/A FR/A PDB ab initio [Bonneau 02] ab initio ab initio [Kuhlman 03] ab initio CASP CASP 10 CASP6 ab initio ab initio

References

[Altschul 97] Altschul, S. F., Madden, T. L., Schaffer, A. A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D. J.: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs, Nucleic Acids Res., Vol. 25, pp. 3389-3402 (1997)
[Baker 01] Baker, D. and Sali, A.: Protein structure prediction and structural genomics, Science, Vol. 294, pp. 93-96 (2001)
[Bonneau 01] Bonneau, R., Tsai, J., Ruczinski, I., Chivian, D., Rohl, C., Strauss, C. E. M., and Baker, D.: Rosetta in CASP4: Progress in ab initio protein structure prediction, Proteins, Vol. 45, pp. 119-126 (2001)
[Bonneau 02] Bonneau, R., Strauss, C. E. M., Rohl, C. A., Chivian, D., Bradley, P., Malmstrom, L., Robertson, T., and Baker, D.: De novo prediction of three-dimensional structures for major protein families, J. Mol. Biol., Vol. 322, pp. 65-78 (2002)
[Bradley 03] Bradley, P., Chivian, D., Meiler, J., Misura, K. M., Rohl, C. A., Schief, W. R., Wedemeyer, W. J., Schueler-Furman, O., Murphy, P., Schonbrun, J., Strauss, C. E., and Baker, D.: Rosetta predictions in CASP5: Successes, failures, and prospects for complete automation, Proteins, Vol. 53, pp. 457-468 (2003)
[Chikenji 03] Chikenji, G., Fujitsuka, Y., and Takada, S.: A reversible fragment assembly method for de novo protein structure prediction, J. Chem. Phys., Vol. 119, pp. 6895-6903 (2003)
[Chikenji 04] Chikenji, G., Fujitsuka, Y., and Takada, S.: Protein folding mechanisms and energy landscape of src SH3 domain studied by a structure prediction toolbox, Chem. Phys., Vol. 307, pp. 99-109 (2004)
[Chothia 92] Chothia, C.: One thousand families for the molecular biologist, Nature, Vol. 357, pp. 543-544 (1992)
[Cozzetto 05] Cozzetto, D., Di Matteo, A., and Tramontano, A.: Ten years of predictions... and counting, FEBS Journal, Vol. 272, pp. 881-882 (2005)
[Fischer 03] Fischer, D.: 3D-SHOTGUN: a novel, cooperative, fold-recognition meta-predictor, Proteins, Vol. 51, pp. 434-441 (2003)
[Fiser 00] Fiser, A., Do, R. K., and Sali, A.: Modeling of loops in protein structures, Protein Sci., Vol. 9, pp. 1753-1773 (2000)
[Fujitsuka 04] Fujitsuka, Y., Takada, S., Luthey-Schulten, Z. A., and Wolynes, P. G.: Optimizing physical energy functions for protein folding, Proteins, Vol. 54, pp. 88-103 (2004)
[Ginalski 03] Ginalski, K., Elofsson, A., Fischer, D., and Rychlewski, L.: 3D-Jury: a simple approach to improve protein structure predictions, Bioinformatics, Vol. 19, pp. 1015-1018 (2003)
[Ginalski 04] Ginalski, K., von Grotthuss, M., Grishin, N. V., and Rychlewski, L.: Detecting distant homology with Meta-BASIC, Nucleic Acids Res., Vol. 32, pp. W576-W581 (2004)
[Kolodny 02] Kolodny, R., Koehl, P., Guibas, L., and Levitt, M.: Small libraries of protein fragments model native protein structures accurately, J. Mol. Biol., Vol. 323, pp. 297-307 (2002)
[Kosinski 03] Kosinski, J., Cymerman, I. A., Feder, M., Kurowski, M. A., Sasin, J. M., and Bujnicki, J. M.: A "FRankenstein's monster" approach to comparative modeling: merging the finest fragments of Fold-Recognition models and iterative model refinement aided by 3D structure evaluation, Proteins, Vol. 53, pp. 369-379 (2003)
[Kuhlman 03] Kuhlman, B., Dantas, G., Ireton, G. C., Varani, G., Stoddard, B. L., and Baker, D.: Design of a novel globular protein fold with atomic-level accuracy, Science, Vol. 302 (2003)
[Liu 04] Liu, X. and Wang, W.: The number of protein folds and their distribution over families in nature, Proteins, Vol. 54, pp. 491-499 (2004)
[Liwo 99] Liwo, A., Lee, J. S., Ripoll, D. R., Pillardy, J., and Scheraga, H. A.: Protein structure prediction by global optimization of a potential energy function, Proc. Natl. Acad. Sci. USA (1999)
[Lundstrom 01] Lundstrom, J., Rychlewski, L., Bujnicki, J., and Elofsson, A.: Pcons: a neural-network-based consensus predictor that improves fold recognition, Protein Sci., Vol. 10, pp. 2354-2362 (2001)
[Schaffer 99] Schaffer, A. A., Wolf, Y. I., Ponting, C. P., Koonin, E. V., Aravind, L., and Altschul, S. F.: IMPALA: matching a protein sequence against a collection of PSI-BLAST-constructed position-specific score matrices, Bioinformatics, Vol. 15, pp. 1000-1011 (1999)
[Simons 99] Simons, K. T., Bonneau, R., Ruczinski, I., and Baker, D.: Ab initio protein structure prediction of CASP III targets using ROSETTA, Proteins, Vol. 37, pp. 171-176 (1999)
[Takada 01] Takada, S.: Protein folding simulation with solvent-induced force field: folding pathway ensemble of three-helix-bundle proteins, Proteins, Vol. 42, pp. 85-98 (2001)
[Tomii 04] Tomii, K. and Akiyama, Y.: FORTE: a profile-profile comparison tool for protein fold recognition, Bioinformatics, Vol. 20, pp. 594-595 (2004)
[Tramontano 03] Tramontano, A. and Morea, V.: Assessment of homology-based predictions in CASP5, Proteins, Vol. 53, pp. 352-368 (2003)
[Wang 04] Wang, G. and Dunbrack, R. L.: Scoring profile-to-profile sequence alignments, Protein Sci., Vol. 13, pp. 1612-1626 (2004)
[Zemla 03] Zemla, A.: LGA: a method for finding 3D similarities in protein structures, Nucleic Acids Res., Vol. 31, pp. 3370-3374 (2003)
[Zhang 03] Zhang, Y., Kolinski, A., and Skolnick, J.: TOUCHSTONE II: a new approach to ab initio protein structure prediction, Biophys. J., Vol. 85, pp. 1145-1164 (2003)
[ 05a],, SICE 32, pp. 289-294 (2005)
[ 05b],, GA 2,, Vol. 43, pp. 898-910 (2005)

1998 2005 2005 4 1997 2002 2002 2005 4 1998 2001 2003 2003 4 1998 1998 2000 University of California Berkeley 2001 1988 1990 1991 1995 1998 2001