Publication

Deciphering the preference and predicting the viability of circular permutations in proteins

Last modified
  • 05/15/2025
Authors
  • Wei-Cheng Lo, National Tsing Hua University
  • Tian Dai, Emory University
  • Yen-Yi Liu, National Chiao Tung University
  • Li-Fen Wang, National Chiao Tung University
  • Jenn-Kang Hwang, National Chiao Tung University
  • Ping-Chiang Lyu, National Tsing Hua University
Language
  • English
Date
  • 2012-02-16
Publisher
  • Public Library of Science
Copyright Statement
  • © 2012 Lo et al.
Title of Journal or Parent Work
ISSN
  • 1932-6203
Volume
  • 7
Issue
  • 2
Start Page
  • e31791
End Page
  • e31791
Grant/Funding Information
  • This work was funded by the National Science Council, Taiwan (http://www.nsc.gov.tw) with NSC grant number 99-2745-B-009-001-ASP (Academic Summit Program) to JH; 99-2811-M-007-073 and 99-3112-B-007-003 to PL.
  • The Ministry of Education, Taiwan (http://english.moe.gov.tw/) also funded this study through the Development Plan for World Class University and Research Centers of Excellence (MOE ATU Plan).
Abstract
  • Circular permutation (CP) refers to situations in which the termini of a protein are relocated to other positions in the structure. CP occurs naturally and has been artificially created to study protein function, stability and folding. Recently, CP has been increasingly applied to engineer enzyme structure and function, and to create bifunctional fusion proteins unachievable by tandem fusion. CP is a complicated and expensive technique. An intrinsic difficulty in its application lies in the fact that not every position in a protein is amenable to creating a viable permutant. To examine the preferences of CP and develop CP viability prediction methods, we carried out comprehensive analyses of the sequence, structural, and dynamical properties of known CP sites using a variety of statistics and simulation methods, such as bootstrap aggregating, permutation tests and molecular dynamics simulations. CP particularly favors Gly, Pro, Asp and Asn. Positions preferred by CP lie within coils, loops, turns, and at residues that are exposed to solvent, weakly hydrogen-bonded, environmentally unpacked, or flexible. Disfavored positions include Cys, bulky hydrophobic residues, and residues located within helices or near the protein's core. These results fostered the development of an effective viable CP site prediction system, which combined four machine learning methods, namely artificial neural networks, the support vector machine, a random forest, and a hierarchical feature integration procedure developed in this work. As assessed by using the dihydrofolate reductase dataset as the independent evaluation dataset, this prediction system achieved an AUC of 0.9. Large-scale predictions have been performed for nine thousand representative protein structures; several new potential applications of CP were thus identified. Many unreported preferences of CP are revealed in this study. The developed system is the best CP viability prediction method currently available. This work will facilitate the application of CP in research and biotechnology.
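
The definition of CP in the abstract can be illustrated at the sequence level with a minimal Python sketch. It is not taken from the paper: the function name `circular_permute`, the 1-based `new_start` convention, the optional `linker` argument, and the example sequence are all assumptions made for illustration. The sketch only rearranges residues and says nothing about whether the resulting permutant would fold or be viable.

```python
def circular_permute(sequence: str, new_start: int, linker: str = "") -> str:
    """Return a circular permutant of `sequence`.

    The residue at 1-based position `new_start` becomes the new N-terminus.
    The original C-terminus is joined to the original N-terminus, optionally
    through a `linker` peptide (hypothetical parameter, empty by default).
    """
    if not 1 <= new_start <= len(sequence):
        raise ValueError("new_start must be between 1 and len(sequence)")
    i = new_start - 1  # convert to a 0-based index
    # New order: residues new_start..end, then the linker, then residues 1..new_start-1
    return sequence[i:] + linker + sequence[:i]


# Hypothetical 10-residue sequence, used only to show the rearrangement.
example = "MKTAYIAKQR"
permutant = circular_permute(example, 4, linker="GG")
print(permutant)  # AYIAKQRGGMKT
```

With `new_start=1` and no linker the function returns the sequence unchanged, which corresponds to the native arrangement of the termini.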
Research Categories
  • Biology, Molecular
  • Biology, Biostatistics
