Predicting the structure of protein loop regions is an important part of understanding protein function, and structure prediction for long loops remains a difficult problem in bioinformatics. Many loop-structure prediction algorithms have been proposed; LEAP is among the most accurate, but its sampling of initial backbone conformations for long loops still leaves considerable room for improvement. In this paper we combine the protein secondary structure prediction algorithm SPINE X with LEAP to construct new backbone torsion-angle distribution maps (Ramachandran plots), introducing position-specific information about each amino acid in the protein sequence into the initial backbone sampling so that the sampling becomes more targeted. Analysis of a loop test set taken from CASP10 single-chain proteins shows that, for long loops of 10, 11, and 12 residues, the improved algorithm achieves significantly higher prediction accuracy than the original LEAP. This idea of introducing amino-acid position specificity to improve accuracy may be extended to other loop-structure prediction algorithms.
Loop regions are necessary structural elements of protein molecules and play significant roles in protein function, e.g., in signaling and ligand recognition. Unlike the well-defined secondary structures (i.e., helices and sheets), however, loop regions vary in structure, and some of them cannot even be measured by ordinary experimental methods. For these reasons, computer-aided prediction of loop structure has become an active topic in bioinformatics and biophysics, and a variety of algorithms have been developed for this purpose. So far, however, the prediction of long loops remains a challenge. Among the common algorithms, LEAP achieves the highest precision on long-loop prediction. Our investigation of a test data set with LEAP reveals that the final loop structure predicted by LEAP is almost entirely determined by the initial sampling of the loop backbone conformation. If all the backbone conformations in the initial sampling are far from the real (native) conformation, the final predicted structure is also far from the native conformation, and the prediction accuracy cannot be improved appreciably by increasing the computation time alone. In the original LEAP, the initial sampling is based on a coarse distribution of the backbone torsion angles (Ramachandran plot, R-plot) that does not consider the sequence information of the loop region, so the sampling is likely to generate many conformations that are far from the native one. This raises an open question: can the initial sampling be made more targeted toward the native conformation?
In this paper, we propose an approach that introduces position-specific amino-acid sequence information into the initial sampling of the backbone conformation, which may generate more targeted initial decoys. A protein secondary structure prediction algorithm, SPINE X, is used to generate rough but reasonable sequence-dependent estimates of the torsion angles of each amino acid in the loop backbone. We then combine these values with the original R-plot to construct a new R-plot for each amino acid in the loop, and the initial sampling is performed according to the new R-plot. We applied this new algorithm to a test set of loops (generated from single-chain proteins in CASP10) and found that the medians/means of the RMSDs are reduced by about 0.12 Å/0.13 Å, 0.25 Å/0.27 Å, and 0.47 Å/0.27 Å for the loop sets of length 10, 11, and 12, respectively. Compared with the original LEAP algorithm, the probability of making a more accurate prediction is almost doubled when the refined algorithm is used. The logic of our approach is not limited to LEAP and can be extended to other algorithms that also depend significantly on initial sampling.
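The abstract does not say how the predicted torsion angles are combined with the original R-plot, so the following is only a minimal sketch of one plausible scheme, not the authors' procedure. It assumes the R-plot is a binned (phi, psi) histogram and multiplies it by a periodic Gaussian weight centred on the predicted angles for one residue. The function names, the Gaussian form, the width `sigma`, and the 10-degree bins are all hypothetical choices made for illustration.

```python
import numpy as np


def wrapped_diff(a, b):
    """Signed angular difference a - b in degrees, wrapped into [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def position_specific_rplot(background, phi_pred, psi_pred,
                            sigma=30.0, bin_width=10.0):
    """Reweight a background R-plot around predicted (phi, psi) angles.

    background: 2D array of non-negative weights, indexed [phi_bin, psi_bin],
                with bins covering [-180, 180) degrees.
    The Gaussian weighting is an assumption, not the published method.
    """
    n_phi, n_psi = background.shape
    phi_centers = -180.0 + bin_width * (np.arange(n_phi) + 0.5)
    psi_centers = -180.0 + bin_width * (np.arange(n_psi) + 0.5)
    dphi = wrapped_diff(phi_centers, phi_pred)[:, None]
    dpsi = wrapped_diff(psi_centers, psi_pred)[None, :]
    weight = np.exp(-(dphi ** 2 + dpsi ** 2) / (2.0 * sigma ** 2))
    combined = background * weight
    total = combined.sum()
    if total <= 0.0:
        raise ValueError("background R-plot has no probability mass")
    return combined / total  # normalised so that all bins sum to 1


def sample_torsions(rplot, n_samples, rng, bin_width=10.0):
    """Draw (phi, psi) pairs in degrees from a normalised, binned R-plot."""
    flat = rng.choice(rplot.size, size=n_samples, p=rplot.ravel())
    i, j = np.unravel_index(flat, rplot.shape)
    # Place each sample uniformly inside its chosen bin.
    phi = -180.0 + bin_width * (i + rng.random(n_samples))
    psi = -180.0 + bin_width * (j + rng.random(n_samples))
    return np.column_stack([phi, psi])


# Usage with a flat placeholder background and made-up predicted angles.
rng = np.random.default_rng(0)
background = np.ones((36, 36))
rplot = position_specific_rplot(background, phi_pred=-65.0, psi_pred=-45.0)
decoy_angles = sample_torsions(rplot, n_samples=1000, rng=rng)
```

In this sketch a smaller `sigma` trusts the predicted angles more and concentrates the initial decoys around them, while a larger `sigma` falls back toward the background distribution. One such reweighted plot would be built per loop residue.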
Keywords:
- loop structure prediction
- initial conformation of peptide backbone
- position specificity of amino acids
- Ramachandran plot