Keywords: loop modeling, GeneralizedKIC
# Notebook setup
import sys
if 'google.colab' in sys.modules:
!pip install pyrosettacolabsetup
import pyrosettacolabsetup
pyrosettacolabsetup.setup()
print ("Notebook is set for PyRosetta use in Colab. Have fun!")
import pyrosetta
pyrosetta.init()
# py3Dmol setup
!pip install py3Dmol
import glob
import logging
logging.basicConfig(level=logging.INFO)
import os
import pyrosetta.distributed
import pyrosetta.distributed.io as io
import pyrosetta.distributed.viewer as viewer
PyRosetta-4 2019 [Rosetta PyRosetta4.conda.mac.python36.Release 2019.38+release.f3cd88e4837a35eba6d19fd9c30ebf5748c77cae 2019-09-18T23:40:11] retrieved from: http://www.pyrosetta.org (C) Copyright Rosetta Commons Member Institutions. Created in JHU by Sergey Lyskov and PyRosetta Team. core.init: {0} Checking for fconfig files in pwd and ./rosetta/flags core.init: {0} Rosetta version: PyRosetta4.conda.mac.python36.Release r232 2019.38+release.f3cd88e4837 f3cd88e4837a35eba6d19fd9c30ebf5748c77cae http://www.pyrosetta.org 2019-09-18T23:40:11 core.init: {0} command: PyRosetta -ex1 -ex2aro -database /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages/pyrosetta/database basic.random.init_random_generator: {0} 'RNG device' seed mode, using '/dev/urandom', seed=301741349 seed_offset=0 real_seed=301741349 thread_index=0 basic.random.init_random_generator: {0} RandomGenerator:init: Normal mode, seed=301741349 RG_type=mt19937 Requirement already satisfied: py3Dmol in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (0.8.0) Requirement already satisfied: jupyter in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from py3Dmol) (1.0.0) Requirement already satisfied: idisplay in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from py3Dmol) (0.1.2) Requirement already satisfied: nbconvert in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from jupyter->py3Dmol) (5.6.1) Requirement already satisfied: qtconsole in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from jupyter->py3Dmol) (4.7.3) Requirement already satisfied: ipykernel in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from jupyter->py3Dmol) (5.1.4) Requirement already satisfied: ipywidgets in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from jupyter->py3Dmol) (7.5.1) Requirement already satisfied: notebook in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from jupyter->py3Dmol) (6.0.3) Requirement already satisfied: jupyter-console in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from jupyter->py3Dmol) (6.1.0) Requirement already satisfied: ipython>=1.0 in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from idisplay->py3Dmol) (7.13.0) Requirement already satisfied: bleach in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from nbconvert->jupyter->py3Dmol) (3.1.4) Requirement already satisfied: defusedxml in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from nbconvert->jupyter->py3Dmol) (0.6.0) Requirement already satisfied: entrypoints>=0.2.2 in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from nbconvert->jupyter->py3Dmol) (0.3) Requirement already satisfied: jinja2>=2.4 in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from nbconvert->jupyter->py3Dmol) (2.11.2) Requirement already satisfied: jupyter-core in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from nbconvert->jupyter->py3Dmol) (4.6.3) Requirement already satisfied: mistune<2,>=0.8.1 in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from nbconvert->jupyter->py3Dmol) (0.8.4) Requirement already satisfied: pandocfilters>=1.4.1 in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from nbconvert->jupyter->py3Dmol) (1.4.2) Requirement already satisfied: traitlets>=4.2 in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from nbconvert->jupyter->py3Dmol) (4.3.3) Requirement already satisfied: pygments in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from nbconvert->jupyter->py3Dmol) (2.6.1) Requirement already satisfied: nbformat>=4.4 in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from nbconvert->jupyter->py3Dmol) (5.0.4) Requirement already satisfied: testpath in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from nbconvert->jupyter->py3Dmol) (0.4.4) Requirement already satisfied: jupyter-client>=4.1 in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from qtconsole->jupyter->py3Dmol) (6.1.3) Requirement already satisfied: pyzmq>=17.1 in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from qtconsole->jupyter->py3Dmol) (18.1.1) Requirement already satisfied: qtpy in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from qtconsole->jupyter->py3Dmol) (1.9.0) Requirement already satisfied: ipython-genutils in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from qtconsole->jupyter->py3Dmol) (0.2.0) Requirement already satisfied: appnope; platform_system == "Darwin" in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from ipykernel->jupyter->py3Dmol) (0.1.0) Requirement already satisfied: tornado>=4.2 in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from ipykernel->jupyter->py3Dmol) (6.0.4) Requirement already satisfied: widgetsnbextension~=3.5.0 in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from ipywidgets->jupyter->py3Dmol) (3.5.1) Requirement already satisfied: Send2Trash in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from notebook->jupyter->py3Dmol) (1.5.0) Requirement already satisfied: prometheus-client in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from notebook->jupyter->py3Dmol) (0.7.1) Requirement already satisfied: terminado>=0.8.1 in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from notebook->jupyter->py3Dmol) (0.8.3) Requirement already satisfied: prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0 in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from jupyter-console->jupyter->py3Dmol) (3.0.4) Requirement already satisfied: backcall in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from ipython>=1.0->idisplay->py3Dmol) (0.1.0) Requirement already satisfied: pickleshare in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from ipython>=1.0->idisplay->py3Dmol) (0.7.5) Requirement already satisfied: decorator in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from ipython>=1.0->idisplay->py3Dmol) (4.4.2) Requirement already satisfied: pexpect; sys_platform != "win32" in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from ipython>=1.0->idisplay->py3Dmol) (4.8.0) Requirement already satisfied: jedi>=0.10 in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from ipython>=1.0->idisplay->py3Dmol) (0.15.2) Requirement already satisfied: setuptools>=18.5 in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from ipython>=1.0->idisplay->py3Dmol) (46.1.3.post20200330) Requirement already satisfied: six>=1.9.0 in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from bleach->nbconvert->jupyter->py3Dmol) (1.14.0) Requirement already satisfied: webencodings in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from bleach->nbconvert->jupyter->py3Dmol) (0.5.1) Requirement already satisfied: MarkupSafe>=0.23 in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from jinja2>=2.4->nbconvert->jupyter->py3Dmol) (1.1.1) Requirement already satisfied: jsonschema!=2.5.0,>=2.4 in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from nbformat>=4.4->nbconvert->jupyter->py3Dmol) (3.2.0) Requirement already satisfied: python-dateutil>=2.1 in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from jupyter-client>=4.1->qtconsole->jupyter->py3Dmol) (2.8.1) Requirement already satisfied: wcwidth in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0->jupyter-console->jupyter->py3Dmol) (0.1.9) Requirement already satisfied: ptyprocess>=0.5 in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from pexpect; sys_platform != "win32"->ipython>=1.0->idisplay->py3Dmol) (0.6.0) Requirement already satisfied: parso>=0.5.2 in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from jedi>=0.10->ipython>=1.0->idisplay->py3Dmol) (0.5.2) Requirement already satisfied: attrs>=17.4.0 in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from jsonschema!=2.5.0,>=2.4->nbformat>=4.4->nbconvert->jupyter->py3Dmol) (19.3.0) Requirement already satisfied: importlib-metadata; python_version < "3.8" in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from jsonschema!=2.5.0,>=2.4->nbformat>=4.4->nbconvert->jupyter->py3Dmol) (1.5.0) Requirement already satisfied: pyrsistent>=0.14.0 in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from jsonschema!=2.5.0,>=2.4->nbformat>=4.4->nbconvert->jupyter->py3Dmol) (0.16.0) Requirement already satisfied: zipp>=0.5 in /Users/jordanwillis/anaconda3/envs/pyrosetta/lib/python3.6/site-packages (from importlib-metadata; python_version < "3.8"->jsonschema!=2.5.0,>=2.4->nbformat>=4.4->nbconvert->jupyter->py3Dmol) (3.1.0)
Kinematic closure algorithms were originally developed for the robotics field to solve the problem of determining the necessary joint angles that would place a robot's hand or foot in a desired place. We have adapted them for use within the Rosetta software suite to sample conformations of chains of atoms with well-defined start and end points.
A molecular kinematic closure problem may be described as follows: given a covalently-contiguous chain of atoms within a molecule, with covalent linkages at the ends of the chain fixing the start and end of the chain, what possible conformations maintain the integrity of bond length, bond angle, and dihedral angle restrictions within the chain? To solve such a problem, we divide the chain of atoms into two segments, and define "pivot points" at the start of the chain (the first pivot), the end of the chain (the last pivot), and the breakpoint between the two segments (the middle pivot).
Having done this, the degrees of freedom within the two segments may be held fixed, randomized, perturbed, or otherwise altered as one sees fit. These degrees of freedom include bond lengths, bond angles, and dihedral angles. Whatever one does to these degrees of freedom, one ends up with two segments that still have well-defined rigid body transforms from the first pivot to the middle pivot (in the first segment), and from the middle pivot to the last pivot (in the second segment). It is then possible to solve a system of equations for the six torsion angles adjacent to the three pivots in order to keep the system closed. The matrix math that gives rise to the solution(s) is extremely fast as compared to alternative loop closure methods (which typically rely on iterative gradient-descent minimization); however, for a given system, this step may yield anywhere from 0 to 16 solutions. It then becomes necessary to choose a solution for downstream molecular design or conformational refinement.
The GeneralizedKIC mover in Rosetta gives a user full control over pre-closure sampling, post-closure filtering, and selection of a closure solution. It also allows closure of chains that do not consist solely of polypeptide backbones: that is, it is fully compatible with closure of atomic chains that run through disulfide bonds, arbitrary side-chain cross-links or cross-linkers, and non-canonical backbones.
GenKic is Extremely flexible. It can deifine entire topologies1,2 or rebuild small loops. Accordingly, we are defining lots of functions to show you how you would configure PyRosetta to accomplish a general GenKic task.
For this exercise, we will be using an NMR structure of an artificial mini-protein designed by Dr. Chris Bahl (PDB ID 2ND2). This mini-protein is a 44-residue 3-helix bundle. For the purposes of this tutorial, we will define two poses. One complete structure, and one structure has been stripped of its amino acid sequence (i.e. it has been mutated to poly-glycine), and the loop connecting the second and third helices has been deleted. This is meant to simulate many common design cases, in which one might arrange secondary structure elements first and build loops later (e.g. in the case of parametric design approaches), as well as certain structure prediction cases, in which one might wish to model loops that are missing in crystal structures. We will rebuild this loop and sample its possible conformations.
from pyrosetta import pose_from_pdb
input_pose = pose_from_pdb('inputs/2ND2_state1_glyonly.pdb')
input_pose_no_loop = pose_from_pdb('inputs/2ND2_state1_glyonly_loop_removed.pdb')
core.chemical.GlobalResidueTypeSet: {0} Finished initializing fa_standard residue type set. Created 980 residue types core.chemical.GlobalResidueTypeSet: {0} Total time to initialize 1.00681 seconds. core.import_pose.import_pose: {0} File 'inputs/2ND2_state1_glyonly.pdb' automatically determined to be of type PDB core.conformation.Conformation: {0} [ WARNING ] missing heavyatom: OXT on residue GLY:CtermProteinFull 44 core.import_pose.import_pose: {0} File 'inputs/2ND2_state1_glyonly_loop_removed.pdb' automatically determined to be of type PDB core.conformation.Conformation: {0} [ WARNING ] missing heavyatom: OXT on residue GLY:CtermProteinFull 28 core.conformation.Conformation: {0} [ WARNING ] missing heavyatom: OXT on residue GLY:CtermProteinFull 39
Let's view the two poses (notice that the second one is missing a loop):|
helix_selector = pyrosetta.rosetta.core.select.residue_selector.SecondaryStructureSelector("H")
loop_selector = pyrosetta.rosetta.core.select.residue_selector.SecondaryStructureSelector("L")
modules = [
viewer.setBackgroundColor(color="black"),
viewer.setStyle(residue_selector=helix_selector, cartoon_color="blue", label=False, radius=0),
viewer.setStyle(residue_selector=loop_selector, cartoon_color="yellow", label=False, radius=0),
viewer.setZoomTo(residue_selector=loop_selector)
]
view = viewer.init(input_pose, window_size=(800, 600), modules=modules).show()
You appear to be running in JupyterLab (or JavaScript failed to load for some other reason). You need to install the 3dmol extension:
jupyter labextension install jupyterlab_3dmol
view = viewer.init(input_pose_no_loop, window_size=(800, 600), modules=modules).show()
You appear to be running in JupyterLab (or JavaScript failed to load for some other reason). You need to install the 3dmol extension:
jupyter labextension install jupyterlab_3dmol
The GeneralizedKIC mover is only capable of sampling conformations of existing geometry. It can neither add amino acid residues to a pose, nor create new bonds between residues. For this reason, we will use PyRosetta to make all the changes to the pose, and then use GenKic to idealize the bond angles and lengths.
The first thing we have to do with PyRosetta, is make a pose that can be applied by GenKic. To do that we will have to manipulate the Pose. Let's first examine the break in the pose:
##The c terminus of one helix
print(input_pose_no_loop.residue(28).name())
#The N terminus of the other helix
print(input_pose_no_loop.residue(29).name())
GLY:CtermProteinFull GLY:NtermProteinFull
The input pose has a break between pose residue 28 and 29. First thing to do is change those terminal glycines to alanines. We do this for two reasons:
pose.join_residue_by_bond
method.def mutate_position(pose,position,mutate):
'''A simple function to mutate an amino acid given a pose number'''
mr = pyrosetta.rosetta.protocols.simple_moves.MutateResidue()
mr.set_target(position)
mr.set_res_name(mutate)
mr.apply(pose)
##Mutate both 28 and 29 to ALA
mutate_position(input_pose_no_loop,28,'ALA')
mutate_position(input_pose_no_loop,29,'ALA')
assert(input_pose_no_loop.residue(28).name() == 'ALA')
assert(input_pose_no_loop.residue(29).name() == 'ALA')
core.conformation.Residue: {0} [ WARNING ] Residue connection id changed when creating a new residue at seqpos 29 core.conformation.Residue: {0} [ WARNING ] ResConnID info stored on the connected residue (residue 30) is now out of date! core.conformation.Residue: {0} [ WARNING ] Connection atom name (in src): C core.chemical.AtomICoor: {0} [ WARNING ] IcoorAtomID::atom_id(): Cannot get atom_id for POLYMER_LOWER of residue ALA 29. Returning BOGUS ID instead. core.conformation.Residue: {0} [ WARNING ] missing an atom: 29 H that depends on a nonexistent polymer connection! core.conformation.Residue: {0} [ WARNING ] --> generating it using idealized coordinates. core.conformation.Residue: {0} [ WARNING ] Residue connection id changed when creating a new residue at seqpos 29 core.conformation.Residue: {0} [ WARNING ] ResConnID info stored on the connected residue (residue 30) is now out of date! core.conformation.Residue: {0} [ WARNING ] Connection atom name (in src): C
#Use the print statements above to check that we successfully mutated the residues:
The next step is to slice the pose into two seperate python pose objects and then we can easily join those objects with any number of arbitrary residues, to grow and shrink a loop as necessary
from pyrosetta import Pose
def slice_pose(p,start,end):
'''
Take a pose object and return from start, end
'''
sliced = Pose()
if end > p.size() or start > p.size():
return "end/start slice is longer than total lenght of pose {} {}".format(start,end)
for i in range(start,end+1):
sliced.append_residue_by_bond(p.residue(i))
return sliced
##Pose object 1 - helix_AB all the way up to residue 28
helix_ab_pose = slice_pose(input_pose_no_loop,1,28)
##Pose object 2 - helix C and the reaminder of the pose
helix_c_pose = slice_pose(input_pose_no_loop,29,input_pose_no_loop.size())
Since we created a new pose, we should apply some PDB info to it. It makes things easier when we want to view or manipulate the pose using PDB numbering in downstream steps
# We're just going to quicky add in pdb info so that our viewing commands work
add_pdb_info_mover = pyrosetta.rosetta.protocols.simple_moves.AddPDBInfoMover()
add_pdb_info_mover.apply(helix_ab_pose)
add_pdb_info_mover.apply(helix_c_pose)
# Here's the second part
view = viewer.init(helix_c_pose, window_size=(800, 600), modules=modules).show()
# Here's the first object
view = viewer.init(helix_ab_pose, window_size=(800, 600), modules=modules).show()
You appear to be running in JupyterLab (or JavaScript failed to load for some other reason). You need to install the 3dmol extension:
jupyter labextension install jupyterlab_3dmol
# Here's the second object
view = viewer.init(helix_c_pose, window_size=(800, 600), modules=modules).show()
You appear to be running in JupyterLab (or JavaScript failed to load for some other reason). You need to install the 3dmol extension:
jupyter labextension install jupyterlab_3dmol
Now that we have two pose objects with their own numbering, we can join them back together with any arbitrary sequence. We will make a function that takes in two pose objects and joins them by a sequence. It will not do anything kinematically to close the bond, but sets us up for GK.
def crudely_connect_w_loop(n_term_pose,c_term_pose,connect_with):
"""
The function will take two poses and join them with a loop
Keep in mind this is just joined as far as the pose is concerned. The bond angles and lenghts will be sub-optimal
"""
one_to_three = {
'A': 'ALA',
'C': 'CYS',
'D': 'ASP',
'E': 'GLU',
'F': 'PHE',
'G': 'GLY',
'H': 'HIS',
'I': 'ILE',
'K': 'LYS',
'L': 'LEU',
'M': 'MET',
'N': 'ASN',
'P': 'PRO',
'Q': 'GLN',
'R': 'ARG',
'S': 'SER',
'T': 'THR',
'V': 'VAL',
'Y': 'TYR',
'W': 'TRP'}
pose_a = Pose()
pose_a.assign(n_term_pose)
pose_b = Pose()
pose_b.assign(c_term_pose)
# Setup CHEMICAL MANAGER TO MAKE NEW RESIDUES
chm = pyrosetta.rosetta.core.chemical.ChemicalManager.get_instance()
rts = chm.residue_type_set('fa_standard')
get_residue_object = lambda x: pyrosetta.rosetta.core.conformation.ResidueFactory.create_residue(
rts.name_map(x))
# Will keep track of indexing of rebuilt loop
rebuilt_loop = []
'''Iterate through string turning each letter into a residue object and then
appending it to the N term pose'''
for one_letter in connect_with:
resi = get_residue_object(one_to_three[one_letter])
pose_a.append_residue_by_bond(resi, True)
pose_a.set_omega(pose_a.total_residue(), 180.)
rebuilt_loop.append(pose_a.total_residue())
##ADD the C term pose to the end of the loop we just appended
for residue_index in range(1, pose_b.total_residue()+1):
pose_a.append_residue_by_bond(
pose_b.residue(residue_index))
print("Joined NTerm and CTerm pose with loop {} at residues {}".format(connect_with,rebuilt_loop))
return pose_a
#Returns a pose that is connected, but sub-optimal geometry
gk_input_pose = crudely_connect_w_loop(helix_ab_pose,helix_c_pose,'AGAAA')
Joined NTerm and CTerm pose with loop AGAAA at residues [29, 30, 31, 32, 33]
We're now ready to add the GeneralizedKIC mover that will close the gap in the loop and sample loop conformations using PyRosetta. We first need to import the GenKic PyRosetta Class. Feel free to browse this library in the additional scripts
directory. It wraps a lot of the manual complexity of setting up a GenKic mover into an object based Python class.
The GenKic class constructor takes residue numbers to be considered in the GK mover. We add all the connecting loops from the previous step 29-33 plus two anchoring residues 28 and 34. H
from additional_scripts.GenKic import GenKic
##All that GenKic needs is the loop residue list
loop_residues = [28,29, 30, 31, 32, 33, 34]
gk_object = GenKic(loop_residues)
core.scoring.etable: {0} Starting energy table calculation core.scoring.etable: {0} smooth_etable: changing atr/rep split to bottom of energy well core.scoring.etable: {0} smooth_etable: spline smoothing lj etables (maxdis = 6) core.scoring.etable: {0} smooth_etable: spline smoothing solvation etables (max_dis = 6) core.scoring.etable: {0} Finished calculating energy tables. basic.io.database: {0} Database file opened: scoring/score_functions/hbonds/ref2015_params/HBPoly1D.csv basic.io.database: {0} Database file opened: scoring/score_functions/hbonds/ref2015_params/HBFadeIntervals.csv basic.io.database: {0} Database file opened: scoring/score_functions/hbonds/ref2015_params/HBEval.csv basic.io.database: {0} Database file opened: scoring/score_functions/hbonds/ref2015_params/DonStrength.csv basic.io.database: {0} Database file opened: scoring/score_functions/hbonds/ref2015_params/AccStrength.csv basic.io.database: {0} Database file opened: scoring/score_functions/rama/fd/all.ramaProb basic.io.database: {0} Database file opened: scoring/score_functions/rama/fd/prepro.ramaProb basic.io.database: {0} Database file opened: scoring/score_functions/omega/omega_ppdep.all.txt basic.io.database: {0} Database file opened: scoring/score_functions/omega/omega_ppdep.gly.txt basic.io.database: {0} Database file opened: scoring/score_functions/omega/omega_ppdep.pro.txt basic.io.database: {0} Database file opened: scoring/score_functions/omega/omega_ppdep.valile.txt basic.io.database: {0} Database file opened: scoring/score_functions/P_AA_pp/P_AA basic.io.database: {0} Database file opened: scoring/score_functions/P_AA_pp/P_AA_n core.scoring.P_AA: {0} shapovalov_lib::shap_p_aa_pp_smooth_level of 1( aka low_smooth ) got activated. basic.io.database: {0} Database file opened: scoring/score_functions/P_AA_pp/shapovalov/10deg/kappa131/a20.prop
The user has many, many options to control loop sampling, filtering, and solution selection. Typically, the first option that one wants to set is the number of attempts that the mover will make to find a closed solution. The default is 100000, but can be specified with a class member function
##Let's set the closure attempt to 500000
gk_object.set_closure_attempts(500000)
Each attempt could yield anywhere from 0 to 16 solutions. By default, every solution found will be stored until we've made the specified number of attempts (in our case 500000). This could be far too many solutions, though. It makes more sense to stop looking for solutions after we've found a small number. That number could be as low as 1 (i.e. GeneralizedKIC stops as soon as a solution is found), but for our purposes, let's set that number at 10. Once the KicMover has 10 closures that meet your criteria (defined below), then it will stop
gk_object.set_min_solutions(10)
Even if we had set this numer at 1, a single attempt might have yielded up to 16 solutions. We always need to tell GeneralizedKIC how to pick a single solution from among the solutions. Here, we'll choose our solution by energy -- but there is an important caveat. In this example, we are sampling backbone conformations, with no consideration of side-chains, we should use a scoring function that consists primarily of backbone-only terms to pick the best solution.. Let's set up a backbone-only scoring function using weights from the ref2015 scoring function.
from pyrosetta import ScoreFunction
def get_bb_only_sfxn():
scorefxn = ScoreFunction()
scorefxn.set_weight(pyrosetta.rosetta.core.scoring.fa_atr, 1) # full-atom attractive score
scorefxn.set_weight(pyrosetta.rosetta.core.scoring.fa_rep, 0.55) # full-atom repulsive score
scorefxn.set_weight(pyrosetta.rosetta.core.scoring.hbond_sr_bb, 1) # short-range hbonding
scorefxn.set_weight(pyrosetta.rosetta.core.scoring.hbond_lr_bb, 1) # long-range hbonding
scorefxn.set_weight(pyrosetta.rosetta.core.scoring.rama_prepro, 0.45) # ramachandran score
scorefxn.set_weight(pyrosetta.rosetta.core.scoring.omega, 0.4) # omega torsion score
scorefxn.set_weight(pyrosetta.rosetta.core.scoring.p_aa_pp, 0.625)
return scorefxn
##Grab BB Only SFXN
bb_only_sfxn = get_bb_only_sfxn()
##Pass it to GK
gk_object.set_scorefxn(bb_only_sfxn)
Let's tell GeneralizedKIC to use this scoring function to pick a solution by adding this selector. Once we have 10 solutions, it will return the lowest energy solution defined by our score function
gk_object.set_selector_type('lowest_energy_selector')
Perturbers allow a user to alter degrees of freedom in the two segments between the pivots. They can:
Perturbers are applied in the order in which they are defined, and can override or modify the effect of previous perturbers. One could, for example, set a particular torsion value to 180, then allow small perturbations around that value, through successive application of a setting and a perturbing perturber.
We want to use perturbers to do three things:
Set all mainchain omega values to 180 degrees.
Randomize phi and psi values for all amino acids in the loop, biased by each amino acid's Ramachandran map.
Set the bond length, bond angles, and torsion angle of the currently-broken bond between residues 33 and 34 to ideal values for a peptide bond.
First lets set all omega angles to 180:
#First lets set alll mainchain omega values to 180 degrees in our loop. We don't want to include residue after the last anchor residue as that could potentially not exist.
for res_num in loop_residues[:-1]:
gk_object.set_dihedral(res_num, res_num + 1, "C", "N", 180.1)
###Or there is a convienience function within the class that does the same thing
gk_object.set_omega_angles()
Next, let's randomize phi and psi values for all amino acids in the loop, biased by each amino acid's Ramachandran map. The randomize_backbone_by_rama_prepro
perturber can be used for biased randomization of mainchain torsions of any residue that
(a) has a Ramachandran map in the Rosetta database, and
(b) has all of its mainchain torsions within the chain tmo be sampled by GeneralizedKIC. (That is, we cannot use it for, for example, a cysteine residue involved in a disulfide bond if we are closing through the disulfide bond.)
for res_num in loop_residues:
gk_object.randomize_backbone_by_rama_prepro(res_num)
Finally, we want to let GK know about the broken bond we set between our poses which exists between residue 33 and 34 (the anchor residue and the last residue in our loop definition). But just for fun, let's see what happens if you don't run it with the close bond tag.
##This will grab the GK instance and apply everything we have set to our pose
gk_object.get_instance().apply(gk_input_pose)
protocols.generalized_kinematic_closure.GeneralizedKIC: {0} [ WARNING ] No filters specified for GeneralizedKIC mover. protocols.generalized_kinematic_closure.GeneralizedKIC: {0} Storing solution 1 from attempt 1. protocols.generalized_kinematic_closure.GeneralizedKIC: {0} Storing solution 2 from attempt 1. protocols.generalized_kinematic_closure.GeneralizedKIC: {0} Storing solution 3 from attempt 1. protocols.generalized_kinematic_closure.GeneralizedKIC: {0} Storing solution 4 from attempt 1. protocols.generalized_kinematic_closure.GeneralizedKIC: {0} [ WARNING ] No filters specified for GeneralizedKIC mover. protocols.generalized_kinematic_closure.GeneralizedKIC: {0} Storing solution 1 from attempt 3. protocols.generalized_kinematic_closure.GeneralizedKIC: {0} Storing solution 2 from attempt 3. protocols.generalized_kinematic_closure.GeneralizedKIC: {0} Storing solution 3 from attempt 3. protocols.generalized_kinematic_closure.GeneralizedKIC: {0} Storing solution 4 from attempt 3. protocols.generalized_kinematic_closure.GeneralizedKIC: {0} Storing solution 5 from attempt 3. protocols.generalized_kinematic_closure.GeneralizedKIC: {0} Storing solution 6 from attempt 3. protocols.generalized_kinematic_closure.selector.GeneralizedKICselector: {0} Choosing GeneralizedKIC solution. protocols.generalized_kinematic_closure.GeneralizedKIC: {0} Closure successful.
##You can see, we perturbed the loop, but we did not tell GK to close the bond
view = viewer.init(gk_input_pose, window_size=(800, 600), modules=modules).show()
You appear to be running in JupyterLab (or JavaScript failed to load for some other reason). You need to install the 3dmol extension:
jupyter labextension install jupyterlab_3dmol
Now let's actually set the close bond
gk_object.close_normal_bond(33,34) #or gk_object.close_normal_bond(loop_residues[-2],loop_residues[-1])
Now that we have setup all perturbers and filters, GK knows exactly how to close this loop. We can grab the instance from the Python object, and apply it to our pose
gk_object.get_instance().apply(gk_input_pose)
protocols.generalized_kinematic_closure.GeneralizedKIC: {0} [ WARNING ] No filters specified for GeneralizedKIC mover. protocols.generalized_kinematic_closure.GeneralizedKIC: {0} Storing solution 1 from attempt 1. protocols.generalized_kinematic_closure.GeneralizedKIC: {0} Storing solution 2 from attempt 1. protocols.generalized_kinematic_closure.GeneralizedKIC: {0} Storing solution 3 from attempt 1. protocols.generalized_kinematic_closure.GeneralizedKIC: {0} Storing solution 4 from attempt 1. protocols.generalized_kinematic_closure.GeneralizedKIC: {0} [ WARNING ] No filters specified for GeneralizedKIC mover. protocols.generalized_kinematic_closure.GeneralizedKIC: {0} Storing solution 1 from attempt 2. protocols.generalized_kinematic_closure.GeneralizedKIC: {0} Storing solution 2 from attempt 2. protocols.generalized_kinematic_closure.GeneralizedKIC: {0} [ WARNING ] No filters specified for GeneralizedKIC mover. protocols.generalized_kinematic_closure.GeneralizedKIC: {0} Storing solution 1 from attempt 3. protocols.generalized_kinematic_closure.GeneralizedKIC: {0} Storing solution 2 from attempt 3. protocols.generalized_kinematic_closure.GeneralizedKIC: {0} Storing solution 3 from attempt 3. protocols.generalized_kinematic_closure.GeneralizedKIC: {0} Storing solution 4 from attempt 3. protocols.generalized_kinematic_closure.selector.GeneralizedKICselector: {0} Choosing GeneralizedKIC solution. protocols.generalized_kinematic_closure.GeneralizedKIC: {0} Closure successful.
view = viewer.init(gk_input_pose, window_size=(800, 600), modules=modules).show()
You appear to be running in JupyterLab (or JavaScript failed to load for some other reason). You need to install the 3dmol extension:
jupyter labextension install jupyterlab_3dmol
You can see that GK has succefully closed our loop
There are a number of possible problems with the solutions produced. First, although the residues within each segment are being sampled in a biased manner based on their respective Ramachandran maps, the pivot residues have values assigned to them by the solver, which may put them in awkward regions of Ramachandran space. We want to filter out solutions with poor pivot Ramachandran energies.
Second, we don't want loop solutions with clashing geometry, so we want some sort of bump check to be applied before accepting a solution.
And third, we may want to impose some prior knowledge insofar as we expect the first and last residues of the loop, which are coming off of helices, to be in the alpha-helical bin ("A") of Ramachandran space.
GeneralizedKIC filters are applied rapidly to all solutions produced by the KIC solver before the solutions are used to build computationally-expensive Pose geometry. They are therefore a good way to cheaply and efficiently discard bad solutions. Note that, unlike Rosetta's filters, the GeneralizedKIC filters operate on a set of loop degree-of-freedom values, not on a full Pose.
Let us first require that solutions have residues 28 ad 34 in the alpha-helical region of Ramachandran space. For this, we use a backbone_bin GeneralizedKIC filter: The "ABBA" bin parameters file, located in the Rosetta database, defines Ramachandran bins for the alpha-helical region ("A"), the beta-sheet region ("B"), and the mirror-image regions that can be accessed by D-amino acids ("Aprime" and "Bprime", respectively).
##The first residue in our loop definition will be confiend to alpha-helical rama space
gk_object.set_filter_backbone_bin(loop_residues[0],'A',bin='ABBA')
##The last residue in our loop definition will be confiend to alpha-helical rama space
gk_object.set_filter_backbone_bin(loop_residues[-1],'A',bin='ABBA')
protocols.generalized_kinematic_closure.filter.GeneralizedKICfilter: {0} Creating BinTransitionCalculator. protocols.generalized_kinematic_closure.filter.GeneralizedKICfilter: {0} Loading bin_params file ABBA. basic.io.database: {0} Database file opened: protocol_data/generalizedKIC/bin_params/ABBA.bin_params core.scoring.bin_transitions.BinTransitionCalculator: {0} Opened file protocol_data/generalizedKIC/bin_params/ABBA.bin_params for read. core.scoring.ramachandran: {0} shapovalov_lib::shap_rama_smooth_level of 4( aka highest_smooth ) got activated. basic.io.database: {0} Database file opened: scoring/score_functions/rama/shapovalov/kappa25/all.ramaProb basic.io.database: {0} Database file opened: scoring/score_functions/rama/flat/avg_L_rama.dat core.scoring.ramachandran: {0} Reading custom Ramachandran table from scoring/score_functions/rama/flat/avg_L_rama.dat. basic.io.database: {0} Database file opened: scoring/score_functions/rama/flat/sym_all_rama.dat core.scoring.ramachandran: {0} Reading custom Ramachandran table from scoring/score_functions/rama/flat/sym_all_rama.dat. basic.io.database: {0} Database file opened: scoring/score_functions/rama/flat/sym_G_rama.dat core.scoring.ramachandran: {0} Reading custom Ramachandran table from scoring/score_functions/rama/flat/sym_G_rama.dat. basic.io.database: {0} Database file opened: scoring/score_functions/rama/flat/sym_P_rama.dat core.scoring.ramachandran: {0} Reading custom Ramachandran table from scoring/score_functions/rama/flat/sym_P_rama.dat. basic.io.database: {0} Database file opened: scoring/score_functions/rama/flat/avg_L_rama_str.dat core.scoring.ramachandran: {0} Reading custom Ramachandran table from scoring/score_functions/rama/flat/avg_L_rama_str.dat. basic.io.database: {0} Database file opened: scoring/score_functions/rama/flat/sym_all_rama_str.dat core.scoring.ramachandran: {0} Reading custom Ramachandran table from scoring/score_functions/rama/flat/sym_all_rama_str.dat. basic.io.database: {0} Database file opened: scoring/score_functions/rama/flat/sym_G_rama_str.dat core.scoring.ramachandran: {0} Reading custom Ramachandran table from scoring/score_functions/rama/flat/sym_G_rama_str.dat. basic.io.database: {0} Database file opened: scoring/score_functions/rama/flat/sym_P_rama_str.dat core.scoring.ramachandran: {0} Reading custom Ramachandran table from scoring/score_functions/rama/flat/sym_P_rama_str.dat. core.scoring.bin_transitions.BinTransitionCalculator: {0} Finished read of protocol_data/generalizedKIC/bin_params/ABBA.bin_params. protocols.generalized_kinematic_closure.filter.GeneralizedKICfilter: {0} Creating BinTransitionCalculator. protocols.generalized_kinematic_closure.filter.GeneralizedKICfilter: {0} Loading bin_params file ABBA. basic.io.database: {0} Database file opened: protocol_data/generalizedKIC/bin_params/ABBA.bin_params core.scoring.bin_transitions.BinTransitionCalculator: {0} Opened file protocol_data/generalizedKIC/bin_params/ABBA.bin_params for read. core.scoring.bin_transitions.BinTransitionCalculator: {0} Finished read of protocol_data/generalizedKIC/bin_params/ABBA.bin_params.
Next we'll add a simple bump check filter (loop_bump_check) to discard solutions with clashing mainchain geometry. Note that this does not check sidechain geometry; it only operates on the heavyatoms of the loop to be closed
gk_object.set_filter_loop_bump_check()
And finally, we'll add a rama_prepro_check filter to discard solutions in which pivot atoms are in energetically-unfavourable regions of Ramachandran space, i.e their rama energy term is high:
for r in gk_object.pivot_residues:
gk_object.set_filter_rama_prepro(r,cutoff=0.5)
##Grab GK instance
gk_instance = gk_object.get_instance()
##apply it to the pose
gk_instance.apply(gk_input_pose)
protocols.generalized_kinematic_closure.GeneralizedKIC: {0} Storing solution 1 from attempt 293.
Notice that with filters applied, it takes many more attempts to get a succesful closure that meets our criteria of a "good" loop. We can up the sampling by setting min solutions from 10 to a higher number.
##The Input pose with no loop, the reference pose we are trying to recreate and the GK pose
poses = [input_pose_no_loop, input_pose, gk_input_pose]
view = viewer.init(poses) + viewer.setStyle()
view()