Keywords: score function, ScoreFunction(), get_score_function(), set_weight(), show(), etable_atom_pair_energies(), Atom objects, get_hbonds(), nhbonds(), residue_hbonds()
!pip install pyrosettacolabsetup
import pyrosettacolabsetup; pyrosettacolabsetup.install_pyrosetta()
import pyrosetta; pyrosetta.init()
import pyrosetta
pyrosetta.init()
Make sure you are in the directory with the pdb files:
cd google_drive/MyDrive/student-notebooks/
In this module, we will explore the PyRosetta score function interface. You will learn to inspect energies of a biomolecule at the whole protein, per-residue, and per-atom level. Finally, you will gain practice applying the energies to answering biological questions involving proteins. For these exercises, we will use the protein Ras (PDB 6q21). Either make sure you have the PDB file "6Q21_A.pdb" in your current directory, or if you have an Internet connection, load the pdb into a pose called ras
with the pyrosetta.pose_from_pdb method.
### BEGIN SOLUTION
ras = pyrosetta.pose_from_pdb("inputs/6Q21_A.pdb")
### END SOLUTION
core.init: Checking for fconfig files in pwd and ./rosetta/flags core.init: Rosetta version: PyRosetta4.Release.python36.mac r208 2019.04+release.fd666910a5e fd666910a5edac957383b32b3b4c9d10020f34c1 http://www.pyrosetta.org 2019-01-22T15:55:37 core.init: command: PyRosetta -ex1 -ex2aro -database /Users/kathyle/Computational Protein Prediction and Design/PyRosetta4.Release.python36.mac.release-208/pyrosetta/database core.init: 'RNG device' seed mode, using '/dev/urandom', seed=-331629677 seed_offset=0 real_seed=-331629677 core.init.random: RandomGenerator:init: Normal mode, seed=-331629677 RG_type=mt19937 core.chemical.GlobalResidueTypeSet: Finished initializing fa_standard residue type set. Created 696 residue types core.chemical.GlobalResidueTypeSet: Total time to initialize 1.22777 seconds. core.import_pose.import_pose: File '6Q21_A.pdb' automatically determined to be of type PDB
To score a protein, you will begin by defining a score function using the get_score_function(is_fullatom: bool)
method in the pyrosetta.teaching
namespace. Specifying True
will return the default ref2015
all-atom energy function, while False
will specify the default centroid score function.
Create a PyRosetta score function using:
sfxn = get_score_function(True)
from pyrosetta.teaching import *
### BEGIN SOLUTION
sfxn = get_score_function(True)
### END SOLUTION
core.scoring.ScoreFunctionFactory: SCOREFUNCTION: ref2015
core.scoring.etable: Starting energy table calculation
core.scoring.etable: smooth_etable: changing atr/rep split to bottom of energy well
core.scoring.etable: smooth_etable: spline smoothing lj etables (maxdis = 6)
core.scoring.etable: smooth_etable: spline smoothing solvation etables (max_dis = 6)
/Users/kathyle/Computational Protein Prediction and Design/PyRosetta4.Release.python36.mac.release-208/pyrosetta/teaching.py:13: UserWarning: Import of 'rosetta' as a top-level module is deprecated and may be removed in 2018, import via 'pyrosetta.rosetta'. from rosetta.core.scoring import *
core.scoring.etable: Finished calculating energy tables. basic.io.database: Database file opened: scoring/score_functions/hbonds/ref2015_params/HBPoly1D.csv basic.io.database: Database file opened: scoring/score_functions/hbonds/ref2015_params/HBFadeIntervals.csv basic.io.database: Database file opened: scoring/score_functions/hbonds/ref2015_params/HBEval.csv basic.io.database: Database file opened: scoring/score_functions/hbonds/ref2015_params/DonStrength.csv basic.io.database: Database file opened: scoring/score_functions/hbonds/ref2015_params/AccStrength.csv basic.io.database: Database file opened: scoring/score_functions/rama/fd/all.ramaProb basic.io.database: Database file opened: scoring/score_functions/rama/fd/prepro.ramaProb basic.io.database: Database file opened: scoring/score_functions/omega/omega_ppdep.all.txt basic.io.database: Database file opened: scoring/score_functions/omega/omega_ppdep.gly.txt basic.io.database: Database file opened: scoring/score_functions/omega/omega_ppdep.pro.txt basic.io.database: Database file opened: scoring/score_functions/omega/omega_ppdep.valile.txt basic.io.database: Database file opened: scoring/score_functions/P_AA_pp/P_AA basic.io.database: Database file opened: scoring/score_functions/P_AA_pp/P_AA_n core.scoring.P_AA: shapovalov_lib::shap_p_aa_pp_smooth_level of 1( aka low_smooth ) got activated. basic.io.database: Database file opened: scoring/score_functions/P_AA_pp/shapovalov/10deg/kappa131/a20.prop
You can see the terms, weights, and energy method options by printing the score function:
print(sfxn)
### BEGIN SOLUTION
print(sfxn)
### END SOLUTION
ScoreFunction::show(): weights: (fa_atr 1) (fa_rep 0.55) (fa_sol 1) (fa_intra_rep 0.005) (fa_intra_sol_xover4 1) (lk_ball_wtd 1) (fa_elec 1) (pro_close 1.25) (hbond_sr_bb 1) (hbond_lr_bb 1) (hbond_bb_sc 1) (hbond_sc 1) (dslf_fa13 1.25) (omega 0.4) (fa_dun 0.7) (p_aa_pp 0.6) (yhh_planarity 0.625) (ref 1) (rama_prepro 0.45) energy_method_options: EnergyMethodOptions::show: aa_composition_setup_files: EnergyMethodOptions::show: mhc_epitope_setup_files: EnergyMethodOptions::show: netcharge_setup_files: EnergyMethodOptions::show: aspartimide_penalty_value: 25 EnergyMethodOptions::show: etable_type: FA_STANDARD_DEFAULT analytic_etable_evaluation: 1 EnergyMethodOptions::show: method_weights: ref 1.32468 3.25479 -2.14574 -2.72453 1.21829 0.79816 -0.30065 2.30374 -0.71458 1.66147 1.65735 -1.34026 -1.64321 -1.45095 -0.09474 -0.28969 1.15175 2.64269 2.26099 0.58223 EnergyMethodOptions::show: method_weights: free_res EnergyMethodOptions::show: unfolded_energies_type: UNFOLDED_SCORE12 EnergyMethodOptions::show: split_unfolded_label_type: SPLIT_UNFOLDED_MM EnergyMethodOptions::show: split_unfolded_value_type: SPLIT_UNFOLDED_BOLTZ EnergyMethodOptions::show: atom_vdw_atom_type_set_name: centroid EnergyMethodOptions::show: exclude_protein_protein_fa_elec: false EnergyMethodOptions::show: exclude_RNA_RNA_fa_elec: false EnergyMethodOptions::show: exclude_RNA_protein_fa_elec: false EnergyMethodOptions::show: exclude_monomer_fa_elec: false EnergyMethodOptions::show: elec_max_dis: 5.5 EnergyMethodOptions::show: elec_min_dis: 1.6 EnergyMethodOptions::show: elec_die: 10 EnergyMethodOptions::show: elec_no_dis_dep_die: false EnergyMethodOptions::show: elec_sigmoidal_die: true EnergyMethodOptions::show: elec_sigmoidal_D: 80 EnergyMethodOptions::show: elec_sigmoidal_D0: 6 EnergyMethodOptions::show: elec_sigmoidal_S: 0.4 EnergyMethodOptions::show: smooth_fa_elec: true EnergyMethodOptions::show: grpelec_fade_type: false EnergyMethodOptions::show: grpelec_fade_param1: 1 EnergyMethodOptions::show: grpelec_fade_param2: 1 EnergyMethodOptions::show: grpelec_fade_hbond: 0 EnergyMethodOptions::show: grp_cpfxn: 1 EnergyMethodOptions::show: elec_group_file: /scoring/score_functions/elec_group_def.dat EnergyMethodOptions::show: grpelec_context_dependent: 0 EnergyMethodOptions::show: use_polarization: true EnergyMethodOptions::show: use_gen_kirkwood: true EnergyMethodOptions::show: protein_dielectric: 1 EnergyMethodOptions::show: water_dielectric: 78.3 EnergyMethodOptions::show: exclude_DNA_DNA: true EnergyMethodOptions::show: exclude_intra_res_protein: false EnergyMethodOptions::show: count_pair_hybrid: false EnergyMethodOptions::show: put_intra_into_total: false EnergyMethodOptions::show: geom_sol_interres_path_distance_cutoff: false EnergyMethodOptions::show: geom_sol_intrares_path_distance_cutoff: true EnergyMethodOptions::show: eval_intrares_elec_ST_only: false EnergyMethodOptions::show: envsmooth_zero_negatives: false EnergyMethodOptions::show: cst_max_seq_sep: 18446744073709551615 EnergyMethodOptions::show: pb_bound_tag: bound EnergyMethodOptions::show: pb_unbound_tag: unbound EnergyMethodOptions::show: ordered_wat_penalty: 1.221 EnergyMethodOptions::show: ordered_pt_wat_penalty: 2.709 EnergyMethodOptions::show: voids_penalty_energy_containing_cones_cutoff_:6 EnergyMethodOptions::show: voids_penalty_energy_cone_distance_cutoff_: 8 EnergyMethodOptions::show: voids_penalty_energy_cone_dotproduct_cutoff_: 0.1 EnergyMethodOptions::show: voids_penalty_energy_voxel_grid_padding_: 1 EnergyMethodOptions::show: voids_penalty_energy_voxel_size_: 0.5 EnergyMethodOptions::show: voids_penalty_energy_disabled_except_during_packing_: TRUE EnergyMethodOptions::show: hbnet_bonus_ramping_function_: "quadratic" EnergyMethodOptions::show: hbnet_max_network_size_: 0 EnergyMethodOptions::show: approximate_buried_unsat_penalty_hbond_energy_threshold_: -0.25 EnergyMethodOptions::show: approximate_buried_unsat_penalty_burial_atomic_depth_: 4.5 EnergyMethodOptions::show: approximate_buried_unsat_penalty_burial_probe_radius_: 2.3 EnergyMethodOptions::show: approximate_buried_unsat_penalty_burial_resolution_: 0.5 EnergyMethodOptions::show: approximate_buried_unsat_penalty_oversat_penalty_: 1 EnergyMethodOptions::show: approximate_buried_unsat_penalty_assume_const_backbone_:1 EnergyMethodOptions::show: bond_angle_central_atoms_to_score: EnergyMethodOptions::show: bond_angle_residue_type_param_set: none HBondOptions::show: exclude_DNA_DNA: true HBondOptions::show: exclude_intra_res_protein_: false HBondOptions::show: exclude_intra_res_RNA_: false HBondOptions::show: put_intra_into_total_: false HBondOptions::show: exclude_self_hbonds: true HBondOptions::show: use_hb_env_dep: false HBondOptions::show: use_hb_env_dep_DNA: true HBondOptions::show: smooth_hb_env_dep: true HBondOptions::show: bb_donor_acceptor_check: true HBondOptions::show: decompose_bb_hb_into_pair_energies: false HBondOptions::show: params_database_tag_: ref2015_params HBondOptions::show: use_sp2_chi_penalty_: true HBondOptions::show: sp2_BAH180_rise_: 0.75 HBondOptions::show: sp2_outer_width_: 0.357 HBondOptions::show: measure_sp3acc_BAH_from_hvy_: true HBondOptions::show: fade_energy_: 1 HBondOptions::show: exclude_ether_oxygens_: 0 HBondOptions::show: Mbhbond: false HbondOptions::show: mphbond: false HBondOptions::show: hbond_energy_shift: 0 HBondOptions::show: water_hybrid_sf: false RNA_EnergyMethodOptions::show: syn_G_potential_bonus: 0 RNA_EnergyMethodOptions::show: torsion_potential: ps_04282011 RNA_EnergyMethodOptions::show: suiteness_bonus: Richardson RNA_EnergyMethodOptions::show: rna_base_pair_xy_filename: scoring/rna/rna_base_pair_xy.dat FreeDOF_Options::show: free_suite_bonus: -1 FreeDOF_Options::show: free_2HOprime_bonus: -0.5 FreeDOF_Options::show: free_sugar_bonus: -1 FreeDOF_Options::show: pack_phosphate_penalty: 0.25 FreeDOF_Options::show: free_side_chain_bonus: -0.5
Hint: look at the top line that starts with 'weights'
## Your Response Here
### BEGIN SOLUTION
# (fa_atr 1) (fa_rep 0.55) (fa_sol 1) (fa_intra_rep 0.005)
# (fa_intra_sol_xover4 1) (lk_ball_wtd 1) (fa_elec 1)
# (pro_close 1.25) (hbond_sr_bb 1) (hbond_lr_bb 1)
# (hbond_bb_sc 1) (hbond_sc 1) (dslf_fa13 1.25) (omega 0.4)
# (fa_dun 0.7) (p_aa_pp 0.6) (yhh_planarity 0.625) (ref 1)
# (rama_prepro 0.45)
### END SOLUTION
You can also create a custom energy function that includes select terms. Typically, creating a whole new score function is unneccesary because the current one works well in most cases. However, tweaking the current enrgy function by reassigning weights and adding certain energy terms can be useful.
Here, we will make an example energy function with only the van der Waals attractive and repulsive terms, both with weights of 1. We need to use the set_weight()
. Make a new ScoreFunction
and set the weights accordingly. This is how we set the full-atom attractive (fa_atr
) and the full-atom repulsive (fa_rep
) terms.
sfxn2 = ScoreFunction()
sfxn2.set_weight(fa_atr, 1.0)
sfxn2.set_weight(fa_rep, 1.0)
### BEGIN SOLUTION
sfxn2 = ScoreFunction()
sfxn2.set_weight(fa_atr, 1.0)
sfxn2.set_weight(fa_rep, 1.0)
### END SOLUTION
Lets compare the score of ras
using the full-atom ScoreFunction
versus the ScoreFunction
we made above using only the attractive and repulsive terms.
First, print the total energy of ras
using print(sfxn(ras))
Then, print the attractive and repulsive energy only of ras
using print(sfxn2(ras))
# print total energy of ras
### BEGIN SOLUTION
print(sfxn(ras))
### END SOLUTION
# print the attractive and repulsive energy of ras
### BEGIN SOLUTION
print(sfxn2(ras))
### END SOLUTION
basic.io.database: Database file opened: scoring/score_functions/elec_cp_reps.dat core.scoring.elec.util: Read 40 countpair representative atoms core.pack.dunbrack.RotamerLibrary: shapovalov_lib_fixes_enable option is true. core.pack.dunbrack.RotamerLibrary: shapovalov_lib::shap_dun10_smooth_level of 1( aka lowest_smooth ) got activated. core.pack.dunbrack.RotamerLibrary: Binary rotamer library selected: /Users/kathyle/Computational Protein Prediction and Design/PyRosetta4.Release.python36.mac.release-208/pyrosetta/database/rotamer/shapovalov/StpDwn_0-0-0/Dunbrack10.lib.bin core.pack.dunbrack.RotamerLibrary: Using Dunbrack library binary file '/Users/kathyle/Computational Protein Prediction and Design/PyRosetta4.Release.python36.mac.release-208/pyrosetta/database/rotamer/shapovalov/StpDwn_0-0-0/Dunbrack10.lib.bin'. core.pack.dunbrack.RotamerLibrary: Dunbrack 2010 library took 0.450999 seconds to load from binary 1215.729069796814 154.59159174026854
Using the full-atom ScoreFunction
sfxn
, break the energy of ras
down into its individual pieces with the sfxn.show(ras)
method. Which are the three most dominant contributions, and what are their values? Is this what you would have expected? Why? Note which terms are positive and negative
# use the sfxn.show() method
### BEGIN SOLUTION
sfxn.show(ras)
### END SOLUTION
core.scoring: ------------------------------------------------------------ Scores Weight Raw Score Wghtd.Score ------------------------------------------------------------ fa_atr 1.000 -1039.246 -1039.246 fa_rep 0.550 1193.837 656.611 fa_sol 1.000 682.582 682.582 fa_intra_rep 0.005 700.419 3.502 fa_intra_sol_xover4 1.000 46.564 46.564 lk_ball_wtd 1.000 -14.597 -14.597 fa_elec 1.000 -195.387 -195.387 pro_close 1.250 97.210 121.513 hbond_sr_bb 1.000 -41.656 -41.656 hbond_lr_bb 1.000 -28.352 -28.352 hbond_bb_sc 1.000 -13.111 -13.111 hbond_sc 1.000 -7.771 -7.771 dslf_fa13 1.250 0.000 0.000 omega 0.400 41.525 16.610 fa_dun 0.700 1296.642 907.650 p_aa_pp 0.600 -25.496 -15.298 yhh_planarity 0.625 0.000 0.000 ref 1.000 47.114 47.114 rama_prepro 0.450 197.781 89.002 --------------------------------------------------- Total weighted score: 1215.729
# Your response here: what are the three most dominant contributions?
Unweighted, individual component energies of each residue in a structure are stored in the Pose
object and can be accessed by the energies()
method. For example, to break down the energy into each residue's contribution, we use:
print(ras.energies().show(<n>))
Where <n>
is the residue number.
What are the total van der Waals, solvation, and hydrogen-bonding contributions of residue 24?
Note: The backbone hydrogen-bonding terms for each residue are not available from the Energies
object. You can get them by using EnergyMethodOptions. See http://www.pyrosetta.org/documentation#TOC-Hydrogen-Bonds-and-Hydrogen-Bond-Scoring.
### BEGIN SOLUTION
print(ras.energies().show(24))
### END SOLUTION
core.scoring.Energies: E fa_atr fa_rep fa_sol fa_intra_repfa_intra_sol_x lk_ball_wtd fa_elec pro_close hbond_sr_bb hbond_lr_bb hbond_bb_sc hbond_sc dslf_fa13 omega fa_dun p_aa_pp yhh_planarity ref rama_prepro core.scoring.Energies: E(i) 24 -7.40 19.03 2.94 8.76 0.09 -0.11 -0.56 0.00 0.00 0.00 0.00 0.00 0.00 0.12 2.68 0.06 0.00 2.30 3.58 None
# your response here
The van der Waals, solvation, and electrostatic terms are atom-atom pairwise energies calculated from a pre-tabulated lookup table, dependent upon the distance between the two atoms and their types. You can access this lookup table, called the etable
directly to check these energy calculations on an atom-by-atom basis. Use the etable_atom_pair_energies
function which returns a triplet of energies for attractive, repulsive and solvation scores.
(Note that the etable_atom_pair_energies()
function requires Atom
objects, not the AtomID
objects we saw in Workshop #2. For more info, look at the documentation.)
Practice: What are the attractive, repulsive, solvation, and electrostatic components between the nitrogen of residue 24 and the oxygen of residue 20?
### BEGIN SOLUTION
res24 = ras.residue(24)
res20 = ras.residue(20)
res24_atomN = res24.atom_index("N")
res20_atomO = res20.atom_index("O")
pyrosetta.etable_atom_pair_energies(res24, res24_atomN, res20, res20_atomO, sfxn)
### END SOLUTION
(-0.1505855046001568, 0.0, 0.5903452111877215, 2.173111777247698)