Keywords: HBNet, OperateOnResidueSubset, getPoseExtraScore, InterGroupInterfaceByVectorSelector, ChainSelector, PreventRepackingRLT, RestrictToRepackingRLT, OperateOnResidueSubset, ResiduePDBInfoHasLabelSelector, PackRotamersMover
Sometimes in Rosetta we want to run implicit multistage design. That is, we want to optimize one conformation while implicitly modeling another (either negatively or positively). There are many ways to accomplish this depending on your interests. In this section we will look at HBNet, a tool for explicitly designing hydrogen bond networks.
One negative-design approach is to implicitly model binding specificity. Designing a complicated network of hydrogen bonds at one interface will implicitly destabilize other interfaces. Hydrogen bonds are so sensitive to geometry that competing interfaces are unlikely to be able to "satisfy" the network well enough to remain competetive.
The previous example can also be viewed through the implicit positive-design lens as well. We often find that Rosetta designs very hydrophobic interfaces (especially with newer score functions). Running HBNet before the traditional design protocols can boost the polar residue concentration of your interface in exchange for a small cost packing quality. In other words, we can implicitly stabilize the unbound state by running HBNet, but it might mildly destabilize the bound state.
Our experience shows that it is useful to run both with and without HBNet, depending on your design case. It is possible that the default design protocol handles your implicits states well enough. When that fails, though, there is not much to do to fix it other than to run pre-design protocols like HBNet. An added benefit of HBNet is that it can provide "seeds" for packing, which can influence design diversity if nothing else.
# Notebook setup import sys if 'google.colab' in sys.modules: !pip install pyrosettacolabsetup import pyrosettacolabsetup pyrosettacolabsetup.mount_pyrosetta_install() print ("Notebook is set for PyRosetta use in Colab. Have fun!")
Make sure you are in the directory with the pdb files:
cd google_drive/My\ Drive/student-notebooks/
# From previous section: from pyrosetta import * from pyrosetta.teaching import * pyrosetta.init("-mute core -mute basic") print( "init complete" )
PyRosetta-4 2019 [Rosetta PyRosetta4.Release.python36.mac 2019.31+release.9a323bc72ca18d3abdc8b1a730b37e52197e4ceb 2019-07-29T16:16:04] retrieved from: http://www.pyrosetta.org (C) Copyright Rosetta Commons Member Institutions. Created in JHU by Sergey Lyskov and PyRosetta Team. init complete
We prepare for HBNet the same way that we prepare for packing. We setup the pose and score function as before...
pose = pose_from_pdb("inputs/hbnet_example.pdb") start_pose = Pose() start_pose.assign(pose) scorefxn = get_fa_scorefxn() print( "setup complete" )
Just like before, you can edit the resfile to your own personal specifications. Alternatively, you can use task operations to automate the process. Let's use task operations to fix all residues not at the interface.
from pyrosetta.rosetta.core.select.residue_selector import InterGroupInterfaceByVectorSelector, ChainSelector, NotResidueSelector chain1 = ChainSelector( "1" ) #selects the first chain chain2 = ChainSelector( "2" ) #selects the second chain interface_selector = InterGroupInterfaceByVectorSelector( chain1, chain2 );#selects residues at the interface not_interface_selector = NotResidueSelector( interface_selector ); #selects residues not at the interface from pyrosetta.rosetta.core.pack.task.operation import PreventRepackingRLT, RestrictToRepackingRLT, OperateOnResidueSubset #prevent non interface residues from repacking/designing fix_non_interface = OperateOnResidueSubset( PreventRepackingRLT(), not_interface_selector ) #perhaps we are performing one-sided design and do not want to make mutations on chain 2: no_mutation_chain2 = OperateOnResidueSubset( RestrictToRepackingRLT(), chain2 ) from pyrosetta.rosetta.core.pack.task import TaskFactory task_factory = TaskFactory() task_factory.push_back( fix_non_interface ) task_factory.push_back( no_mutation_chain2 ) task_design = task_factory.create_task_and_apply_taskoperations( pose ) print( "Num residues: ", pose.size() ) print( "Num packable residues: ", task_design.num_to_be_packed() ) # this includes the ones being designed num_designable = 0 for i in range( 1, pose.size() + 1 ): if( task_design.design_residue( i ) ): num_designable += 1; print( "Num designable residues: ", num_designable )
Num residues: 454 Num packable residues: 116 Num designable residues: 53
This is an interface case so we will use HBNetStapleInterface. We will use both the code-level interface, and the XML interface as an introduction to this functionality. The XML interface to PyRosetta will be covered more in later workshops.
from pyrosetta.rosetta.protocols.hbnet import HBNetStapleInterface #This is similar to the clone operation, # but instead of copying the original pose, we copy the dtat of start pose INTO the pose pose.assign(start_pose) # There are two ways to setup a mover: # (1) by creating a blank instance of that mover and individually setting any non-default features # (2) by passing an XML string, similar to rosetta_scripts # We will be using option 2 because it provides a more consistent interface to the movers # Plus, the store_network_scores_in_pose feature is only available using option 2 for versions of PyRosetta older than September-ish 2019 setup_using_string = True #Option 2 if setup_using_string: hbnet = pyrosetta.rosetta.protocols.rosetta_scripts.XmlObjects.create_from_string(""" <MOVERS> <HBNetStapleInterface name="hbnet" monte_carlo="true" store_network_scores_in_pose="true"/> </MOVERS> """).get_mover("hbnet") #monte_carlo="true" is highly recommended, especially for large systems like asymmetric interfaces #see PMID: 29652499 else: hbnet = HBNetStapleInterface() #This is highly recommended, especially for large systems like asymmetric interfaces #see PMID: 29652499 hbnet.set_monte_carlo_branch( True ) #set_monte_carlo_seed_must_be_buried does two things: #(1) speeds us up by decreasing the sample space #(2) ensures that our final hbond network will be at least partially buried #hbnet.set_monte_carlo_seed_must_be_buried( True ) #commenting this out just to give us more results #You can toggle the verbosity here if you are interested #hbnet.verbose( True ) #We can normallly leave this as the default #making it smaller now to let it run faster hbnet.set_total_num_mc_runs( 1000 ) #Could have been done with #<HBNetStapleInterface name="hbnet" monte_carlo="true" total_num_mc_runs="1000" store_network_scores_in_pose="true"/> hbnet.task_factory( task_factory ) #alternatively: #hbnet.set_task( task_design ) hbnet.set_score_function( scorefxn ) hbnet.apply(pose) print( "Change in score", scorefxn(pose) - scorefxn(start_pose) )
protocols.rosetta_scripts.RosettaScriptsParser: Generating XML Schema for rosetta_scripts... protocols.rosetta_scripts.RosettaScriptsParser: ...done protocols.rosetta_scripts.RosettaScriptsParser: Initializing schema validator... protocols.rosetta_scripts.RosettaScriptsParser: ...done protocols.rosetta_scripts.RosettaScriptsParser: Validating input script... protocols.rosetta_scripts.RosettaScriptsParser: ...done protocols.rosetta_scripts.RosettaScriptsParser: Parsed script: <ROSETTASCRIPTS> <MOVERS> <HBNetStapleInterface monte_carlo="true" name="hbnet" store_network_scores_in_pose="true"/> </MOVERS> <PROTOCOLS/> </ROSETTASCRIPTS> protocols.rosetta_scripts.RosettaScriptsParser: Defined mover named "hbnet" of type HBNetStapleInterface protocols.rosetta_scripts.ParsedProtocol: ParsedProtocol mover with the following movers and filters protocols.hbnet.HBNet: Creating packer task based on specified task operations... protocols.hbnet.HBNet: Creating packer task based on specified task operations... protocols.hbnet.HBNet: built 9732 rotamers at 84 positions. protocols.hbnet.HBNet: IG: 1534084 bytes protocols.hbnet.HBNet: ================================================================== protocols.hbnet.HBNet: ============ PERFORMING H-BOND NETWORK DESIGN ============ protocols.hbnet.HBNet: ================================================================== protocols.hbnet.HBNet: starting monte carlo branching protocol protocols.hbnet.HBNet: 10% done with branching protocols.hbnet.HBNet: 20% done with branching protocols.hbnet.HBNet: 30% done with branching protocols.hbnet.HBNet: 40% done with branching protocols.hbnet.HBNet: 50% done with branching protocols.hbnet.HBNet: 60% done with branching protocols.hbnet.HBNet: 70% done with branching protocols.hbnet.HBNet: 80% done with branching protocols.hbnet.HBNet: 90% done with branching protocols.hbnet.HBNet: 100% done with branching protocols.hbnet.HBNet: Monte carlo branching protocol found 196 networks. protocols.hbnet.HBNet: NUMBER OF H-BOND NETWORKS AFTER BRANCH: 134 protocols.hbnet.HBNet: NUMBER OF NETWORKS AFTER REMOVE REPLICATES = 131 protocols.hbnet.HBNet: NUMBER OF NETWORKS AFTER SELECTION = 10 protocols.hbnet.HBNet: NUMBER OF NETWORKS AFTER REMOVE_REP = 10 protocols.hbnet.HBNet: Designed these new networks that meet your criteria: protocols.hbnet.HBNet: #HBNet_rank residues size score num_hbonds percent_hbond_capacity num_unsat_Hpol interf_hbs protocols.hbnet.HBNet: #network_1 D_Y_155,D_R_180,D_E_185,D_R_187,E_R_207,backbone 5 -0.650077 8 0.714286 0 3 protocols.hbnet.HBNet: #network_2 D_Y_155,D_E_185,D_R_187,E_R_207,backbone 4 -1.07079 6 0.6875 0 3 protocols.hbnet.HBNet: #network_3 D_S_154,D_S_189,E_R_207 3 2.18313 4 0.666667 0 2 protocols.hbnet.HBNet: #network_4 D_Y_155,D_R_180,D_E_185,E_R_207 4 -0.93091 5 0.625 0 3 protocols.hbnet.HBNet: #network_5 D_Y_33,D_R_40,D_H_50,D_E_116,E_R_107 5 1.32341 6 0.611111 0 1 protocols.hbnet.HBNet: #network_6 D_Y_33,D_R_40,D_Q_44,D_H_50,D_E_116,E_R_107 6 0.893853 7 0.590909 0 1 protocols.hbnet.HBNet: #network_7 D_Y_33,D_R_40,D_Q_44,D_H_50,D_E_116,E_Q_44,E_R_107 7 0.71998 8 0.576923 0 2 protocols.hbnet.HBNet: #network_8 D_Y_33,D_R_40,D_E_116,E_R_107 4 1.33101 5 0.5625 0 1 protocols.hbnet.HBNet: #network_9 D_H_116,E_S_40,E_R_107 3 2.49845 3 0.555556 0 1 protocols.hbnet.HBNet: #network_10 D_Y_155,D_E_185,E_R_207 3 -0.259134 3 0.545455 0 3 protocols.evaluation.ChiWellRmsdEvaluatorCreator: Evaluation Creator active ... protocols.hbnet.HBNet: Change in score 1209.2788032525623
Wait, my score is terrible.
Well of course the score is terrible, the pose is dense with clashes. We had 116 packable residues and only assigned states to 5 of them. The other 111 residues are still in their input conformations and likely clash with the 5 we just assigned.
We need to run the packer (either using PackRotamersMover or FastDesign) but we don't want to overwrite the residues we just assigned with HBNet. The trick here is to select the residues with "HBNet" labels and fix them.
from pyrosetta.rosetta.core.select.residue_selector import ResiduePDBInfoHasLabelSelector #prevent hbnet residues from repacking/designing hbnet_selector = ResiduePDBInfoHasLabelSelector( "HBNet" ) fix_hbnet = OperateOnResidueSubset( PreventRepackingRLT(), hbnet_selector ) task_factory.push_back( fix_hbnet ) #recycling the same factory as before, just adding a new operation task_design2 = task_factory.create_task_and_apply_taskoperations( pose ) #sanity check num_hbnet_residues = 0 for x in hbnet_selector.apply( pose ): if x: num_hbnet_residues += 1 print( "Num HBNet Residues", num_hbnet_residues ) #this is unrelated to the narrative but I highly recommend using the linear memory interaction graph whenever performing design. It's a huge speedup #it does not seem to matter for the scope here, but it will when you start using extra chi sampling (-ex1, -ex2) task_design2.or_linmem_ig( True ) from pyrosetta.rosetta.protocols.minimization_packing import PackRotamersMover pack_mover = PackRotamersMover( scorefxn, task_design2 ) pack_mover.apply( pose ) print( "Change in score", scorefxn(pose) - scorefxn(start_pose) )
Num HBNet Residues 5 Change in score 158.01310006922307
The change in score is a better, but still positive. One great thing about HBNet is that it can return multiple poses. Each one is slightly worse than the previous by HBNet's standards but might design into something better. Let's try a few to see if they help.
#there were 10 (or so) networks total, but let's just try the next 5 #this might take a few minutes... if not os.getenv('DEBUG'): #Adding this line to decrease runtime on the testing server for x in range(0,5): extra_pose = hbnet.get_additional_output() if extra_pose is None: break task_design3 = task_factory.create_task_and_apply_taskoperations( extra_pose ) task_design3.or_linmem_ig( True ) pack_mover = PackRotamersMover( scorefxn, task_design3 ) pack_mover.apply( extra_pose ) print( "Change in score", scorefxn(extra_pose) - scorefxn(start_pose) )
protocols.hbnet.HBNet: core.pack.pack_rotamers: built 12700 rotamers at 112 positions. core.pack.interaction_graph.interaction_graph_factory: Instantiating LinearMemoryInteractionGraph Change in score 208.0715304180745 protocols.hbnet.HBNet: core.pack.pack_rotamers: built 12890 rotamers at 113 positions. core.pack.interaction_graph.interaction_graph_factory: Instantiating LinearMemoryInteractionGraph Change in score -60.13706595618231 protocols.hbnet.HBNet: core.pack.pack_rotamers: built 12595 rotamers at 111 positions. core.pack.interaction_graph.interaction_graph_factory: Instantiating LinearMemoryInteractionGraph Change in score 62.89829311472579 protocols.hbnet.HBNet: core.pack.pack_rotamers: built 11897 rotamers at 111 positions. core.pack.interaction_graph.interaction_graph_factory: Instantiating LinearMemoryInteractionGraph Change in score -45.536073668849326 protocols.hbnet.HBNet: core.pack.pack_rotamers: built 11684 rotamers at 110 positions. core.pack.interaction_graph.interaction_graph_factory: Instantiating LinearMemoryInteractionGraph Change in score -31.404979911851626
Great! We found a few results that designed to me more stable than the input pose (-60, -45, and -31 REU)!
The main score function is not the only way to evaluate these networks. HBNet also adds its own score terms. These are useful for sorting/filtering decoys before running expensive packing calculations.
from pyrosetta.rosetta.core.pose import hasPoseExtraScore, getPoseExtraScore if hasPoseExtraScore( pose, "HBNet_NumUnsatHpol" ): #All 3 of these metrics are explained in more detail in Maguire, Boyken, et al. (see second reference below) #NumUnsatHpol is HBNet's primary sorting metric, it counts the number of polar hydrogen atoms that are unsatisfied (buried and not forming hbonds). We know that there are no heavy (non-hydrogen) unsatisfied atoms because HBNet filters those out by default. Lower is better print( "HBNet_NumUnsatHpol", getPoseExtraScore( pose, "HBNet_NumUnsatHpol" ) ) #HBNet's secondary sorting metric. 1.0 if every polar atom in the network is forming the maximum number of hbonds. Higher is better print( "HBNet_Saturation", getPoseExtraScore( pose, "HBNet_Saturation" ) ) #HBNet's tertiary sorting metric. A little complicated but lower is better. print( "HBNet_Score", getPoseExtraScore( pose, "HBNet_Score" ) ) else: print( "Somebody go bug a developer to enable this feature for PyRosetta" )
HBNet_NumUnsatHpol 0.0 HBNet_Saturation 0.7142857313156128 HBNet_Score -0.6500769257545471
HBonds are very sensitive to sidechain sampling resolution. I highly recommend using -ex1 and -ex2. You can do this by adding:
ex1ex2 = ExtraRotamersGeneric() ex1ex2.ex1( True ) ex1ex2.ex2( True ) task_factory.push_back( ex1ex2 )
As we saw in the exercise, the first result out of HBNet does not always wind up being the best. Try designing with a few results from
hbnet.get_additional_output() to get more coverage of the design space. For the commandline users reading this, this functionality can also be accessed via
multistage_rosetta_scripts or the
rosetta_scripts. See the
rosetta_scripts_scripts repository for examples.
I highly recommend playing with the
set_monte_carlo_seed_must_be_buried mentioned above. Without it, HBNet tends to just design many surface networks that nobody really cares about.
The energy of HBNet+Design is often less favorable that the energy after an equivalent design run without HBNet. Why do people still use HBNet?