This notebook contains material from PyRosetta; content is available on GitHub.

Working With Density

Keywords: crystal, density, refinement, LoadDensityMapMover, map, EM, X-ray

Overview

Density is a useful tool in Rosetta: it can be used to refine a PDB structure, test Rosetta structure prediction methods, and build de novo models guided by density maps of different resolutions. It can also serve as an experimentally determined guide or constraint for custom methods.

Rosetta understands X-ray or electron density through scoring. In this tutorial, we will first walk through how to create a density file that Rosetta understands, and then load it, score it, and refine a structure using it. We will also cover some common tools used while working with density.

More complicated protocols exist through the use of RosettaScripts in PyRosetta, and even more protocols through applications built using main Rosetta. We will not cover these in this introductory tutorial, but please refer to the original references listed below (such as De novo protein structure determination...) for how to run these applications.

Note that symmetry and density can be used together!

Documentation

More information on density scoring and relevant applications can be found here:

References

In [13]:
# Notebook setup
import sys
if 'google.colab' in sys.modules:
    !pip install pyrosettacolabsetup
    import pyrosettacolabsetup
    pyrosettacolabsetup.mount_pyrosetta_install()
    print("Notebook is set for PyRosetta use in Colab.  Have fun!")

Creating a Density Map

We won't actually do this in the tutorial as it requires a third-party application, but the (.ccp4) density map you will be using was created this way.

1) Download and install phenix.maps: https://www.phenix-online.org/documentation/reference/maps.html

  • Make sure phenix.maps is in your path.

2) Download the CIF file from the Protein Data Bank for the structure you are interested in.

3) Use this command to create a map: phenix.maps pdb_path cif_path

  • Note that, depending on the CIF file and structure, you may occasionally need to edit the defaults to get it to work properly, especially for glycan structures. Please refer to the phenix documentation for this.
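
The command in step 3 can also be assembled from Python. The following is a minimal sketch, not part of the original tutorial: the helper names and the input file names are hypothetical, and it only assumes that the executable is called phenix.maps and takes the PDB and CIF paths as positional arguments, as in the command above.

```python
import shutil
import subprocess


def build_phenix_maps_command(pdb_path, cif_path, executable="phenix.maps"):
    """Return the argument list for: phenix.maps pdb_path cif_path"""
    return [executable, str(pdb_path), str(cif_path)]


def run_phenix_maps(pdb_path, cif_path, executable="phenix.maps"):
    """Run phenix.maps if it is on PATH; return True if it was launched."""
    if shutil.which(executable) is None:
        print(f"{executable} was not found on PATH; skipping map generation.")
        return False
    command = build_phenix_maps_command(pdb_path, cif_path, executable)
    subprocess.run(command, check=True)
    return True


# Hypothetical input names, used only to show the resulting command line.
print(" ".join(build_phenix_maps_command("1jnd.pdb", "1jnd-sf.cif")))
```

Checking PATH first mirrors the "make sure phenix.maps is in your path" note: the helper reports a missing executable instead of failing with a less obvious error.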

Setup for Density Scoring

Here, we will make a scorefunction, set the elec_dens_fast weight to a suitable value, and use the LoadDensityMapMover to actually load the density map. Note that (currently) the density map is global data within Rosetta, so only a single structure can be modeled or refined at a time.
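
Later in this tutorial the map is loaded through the RosettaScripts XML interface, using a LoadDensityMap tag with name and mapfile attributes. As a small convenience, that tag string can be built programmatically so that unusual characters in the path are escaped. This is only a sketch: the helper name load_density_map_tag is hypothetical, and the resulting string would still need to be passed to XmlObjects.static_get_mover as shown further down.

```python
import xml.etree.ElementTree as ET


def load_density_map_tag(mapfile, name="loaddens"):
    """Build a <LoadDensityMap .../> tag string for the given map file."""
    elem = ET.Element("LoadDensityMap", {"name": name, "mapfile": mapfile})
    return ET.tostring(elem, encoding="unicode")


tag = load_density_map_tag("inputs/1jnd_2mFo-DFc_map.ccp4")
print(tag)
```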

From the docs:

Several scoring functions have been added to Rosetta which describe how well a structure agrees to experimental density data. Density map data is read in CCP4/MRC format (the density has to minimally cover the asymmetric unit). The various scoring functions trade off speed versus accuracy, and their use should be primarily determined by the resolution of the density map data:

  • elec_dens_fast is recommended for most cases.

Additionally, a slower but more precise scoring function is available. This is only recommended if elec_dens_fast performs poorly (for example, if map quality varies significantly throughout the map):

  • elec_dens_window - Uses the sum of correlations of a sliding window of residues versus the experimental data; structure density only uses all heavy atoms.

The weights to use vary depending on resolution of data but the following give reasonable ranges:

  • elec_dens_fast: ~25 is generally good, higher for high-resolution (<3Å) and lower for low-resolution (>6Å)
  • elec_dens_window: ~0.2 is generally good, higher for high-resolution (<3Å) and lower for low-resolution (>6Å)
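
The weight guidance above can be captured in a small lookup helper. This is a sketch under stated assumptions: the function name weight_guidance is hypothetical, and only the default weights and the direction of adjustment come from the documentation excerpt. How far to move the weight is deliberately left open because the excerpt gives no numbers for that.

```python
# Default starting weights quoted in the documentation excerpt above.
DEFAULT_WEIGHTS = {"elec_dens_fast": 25.0, "elec_dens_window": 0.2}


def weight_guidance(term, resolution_angstrom):
    """Return (default_weight, advice) for a density score term.

    Only the direction of adjustment comes from the docs; the size of the
    adjustment is left to the user.
    """
    default = DEFAULT_WEIGHTS[term]
    if resolution_angstrom < 3.0:
        advice = "consider a higher weight (high-resolution data)"
    elif resolution_angstrom > 6.0:
        advice = "consider a lower weight (low-resolution data)"
    else:
        advice = "the default is a reasonable starting point"
    return default, advice


for res in (2.0, 4.5, 8.0):
    print(res, weight_guidance("elec_dens_fast", res))
```
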
In [14]:
from pyrosetta import *
from pyrosetta.rosetta import *
from pyrosetta.teaching import *
import os

init('-ignore_unrecognized_res -load_PDB_components false -ignore_zero_occupancy false @inputs/glycan_flags')
PyRosetta-4 2019 [Rosetta PyRosetta4.Release.python36.mac 2019.39+release.93456a567a8125cafdf7f8cb44400bc20b570d81 2019-09-26T14:24:44] retrieved from: http://www.pyrosetta.org
(C) Copyright Rosetta Commons Member Institutions. Created in JHU by Sergey Lyskov and PyRosetta Team.
core.init: Checking for fconfig files in pwd and ./rosetta/flags
core.init: Reading fconfig.../Users/jadolfbr/.rosetta/flags/common
core.init: 
core.init: 
core.init: Rosetta version: PyRosetta4.Release.python36.mac r233 2019.39+release.93456a567a8 93456a567a8125cafdf7f8cb44400bc20b570d81 http://www.pyrosetta.org 2019-09-26T14:24:44
core.init: command: PyRosetta -ignore_unrecognized_res -load_PDB_components false -ignore_zero_occupancy false @inputs/glycan_flags -database /Users/jadolfbr/Library/Python/3.6/lib/python/site-packages/pyrosetta-2019.39+release.93456a567a8-py3.6-macosx-10.6-intel.egg/pyrosetta/database
basic.random.init_random_generator: 'RNG device' seed mode, using '/dev/urandom', seed=-370139418 seed_offset=0 real_seed=-370139418
basic.random.init_random_generator: RandomGenerator:init: Normal mode, seed=-370139418 RG_type=mt19937
In [15]:
from pyrosetta.rosetta.protocols.cryst import *
from pyrosetta.rosetta.protocols.rosetta_scripts import *

p = pose_from_pdb('inputs/1jnd.pdb')

original = p.clone()

#The LoadDensityMapMover unfortunately does not have getters and setters yet.  
# This has been updated in the Rosetta C++ code, but for now, we have to use the XML interface. 

#setup_dens = LoadDensityMapMover("inputs/1jnd_2mFo-DFc_map.ccp4")
setup_dens = XmlObjects.static_get_mover('<LoadDensityMap name="loaddens" mapfile="inputs/1jnd_2mFo-DFc_map.ccp4"/>')
setup_dens.apply(p)
core.import_pose.import_pose: File 'inputs/1jnd.pdb' automatically determined to be of type PDB
core.io.pdb.pdb_reader: Parsing 82 .pdb records with unknown format to search for Rosetta-specific comments.
core.io.util: Automatic glycan connection is activated.
core.io.util: Start reordering residues.
core.io.util: Corrected glycan residue order (internal numbering): [401, 402, 403, 404]
core.io.util: 
core.io.pose_from_sfr.PoseFromSFRBuilder: Setting chain termination for 404
core.io.pose_from_sfr.PoseFromSFRBuilder: [ WARNING ] Glc401 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.
core.io.pose_from_sfr.PoseFromSFRBuilder: [ WARNING ] Glc402 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.
core.io.pose_from_sfr.PoseFromSFRBuilder: [ WARNING ] Man403 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.
core.io.pose_from_sfr.PoseFromSFRBuilder: [ WARNING ] Man404 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.
core.chemical.AtomICoor: [ WARNING ] IcoorAtomID::atom_id(): Cannot get atom_id for POLYMER_LOWER of residue ->4)-beta-D-Glcp:2-AcNH 401.  Returning BOGUS ID instead.
core.conformation.Residue: [ WARNING ] missing an atom: 401  H1  that depends on a nonexistent polymer connection!
core.conformation.Residue: [ WARNING ]  --> generating it using idealized coordinates.
core.chemical.AtomICoor: [ WARNING ] IcoorAtomID::atom_id(): Cannot get atom_id for POLYMER_LOWER of residue ->4)-beta-D-Glcp:2-AcNH 401.  Returning BOGUS ID instead.
core.conformation.Conformation: Found disulfide between residues 5 32
core.conformation.Conformation: current variant for 5 CYS
core.conformation.Conformation: current variant for 32 CYS
core.conformation.Conformation: current variant for 5 CYD
core.conformation.Conformation: current variant for 32 CYD
core.conformation.Conformation: Found disulfide between residues 302 385
core.conformation.Conformation: current variant for 302 CYS
core.conformation.Conformation: current variant for 385 CYS
core.conformation.Conformation: current variant for 302 CYD
core.conformation.Conformation: current variant for 385 CYD
core.conformation.carbohydrates.GlycanTreeSet: Setting up Glycan Trees
core.conformation.carbohydrates.GlycanTreeSet: Found 1 glycan trees.
protocols.rosetta_scripts.RosettaScriptsParser: Generating XML Schema for rosetta_scripts...
protocols.rosetta_scripts.RosettaScriptsParser: ...done
protocols.rosetta_scripts.RosettaScriptsParser: Initializing schema validator...
protocols.rosetta_scripts.RosettaScriptsParser: ...done
protocols.rosetta_scripts.RosettaScriptsParser: Validating input script...
protocols.rosetta_scripts.RosettaScriptsParser: ...done
protocols.rosetta_scripts.RosettaScriptsParser: Parsed script:
<ROSETTASCRIPTS>
	<MOVERS>
		<LoadDensityMap mapfile="inputs/1jnd_2mFo-DFc_map.ccp4" name="loaddens"/>
	</MOVERS>
	<PROTOCOLS/>
</ROSETTASCRIPTS>
core.scoring.ScoreFunctionFactory: SCOREFUNCTION: ref2015
core.scoring.ScoreFunctionFactory: The -include_sugars flag was used with no sugar_bb weight set in the weights file.  Setting sugar_bb weight to 1.0 by default.
protocols.rosetta_scripts.RosettaScriptsParser: Defined mover named "loaddens" of type LoadDensityMap
protocols.rosetta_scripts.ParsedProtocol: ParsedProtocol mover with the following movers and filters
core.scoring.electron_density.ElectronDensity: Loading Density Map
core.scoring.electron_density.ElectronDensity: Loading density mapinputs/1jnd_2mFo-DFc_map.ccp4
core.scoring.electron_density.ElectronDensity:  Setting resolution to AUTO
core.scoring.electron_density.ElectronDensity:           atom mask to 3.2A
core.scoring.electron_density.ElectronDensity:             CA mask to 6A
core.scoring.electron_density.ElectronDensity:  Read density map'inputs/1jnd_2mFo-DFc_map.ccp4'
core.scoring.electron_density.ElectronDensity:      extent: 218 x 229 x 265
core.scoring.electron_density.ElectronDensity:      origin: -66 x 82 x 209
core.scoring.electron_density.ElectronDensity:   altorigin: 0 x 0 x 0
core.scoring.electron_density.ElectronDensity:        grid: 360 x 360 x 288
core.scoring.electron_density.ElectronDensity:     celldim: 106.345 x 106.345 x 89.987
core.scoring.electron_density.ElectronDensity:  cellangles: 90 x 90 x 120
core.scoring.electron_density.ElectronDensity:  voxel vol.: 0.0236128
core.scoring.electron_density.ElectronDensity: Effective resolution = 0.937365
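
As a sanity check on the map header printed in the log above, the voxel volume can be recomputed from the cell dimensions, cell angles, and grid. The sketch below uses the standard formula for the volume of a general unit cell and divides by the number of grid points. It is an independent calculation, not Rosetta's own code, and the function names are hypothetical. With the values from the log it should come out close to the reported voxel vol.

```python
import math


def unit_cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Volume of a general (triclinic) unit cell, in cubic angstroms."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha_deg, beta_deg, gamma_deg))
    return a * b * c * math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)


def voxel_volume(celldim, cellangles, grid):
    """Unit cell volume divided by the number of grid points in the cell."""
    nx, ny, nz = grid
    return unit_cell_volume(*celldim, *cellangles) / (nx * ny * nz)


# Values copied from the celldim, cellangles, and grid lines of the log.
celldim = (106.345, 106.345, 89.987)
cellangles = (90.0, 90.0, 120.0)
grid = (360, 360, 288)

vol = voxel_volume(celldim, cellangles, grid)
print(f"voxel volume = {vol:.7f} cubic angstroms")
```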

Now, we need to set the pose up for density scoring. The SetupForDensityScoring mover gives the pose a specific fold tree so that density scoring works properly. We will then create a scorefunction that includes our density score term, and load a pre-refined pose that was refined into the density using the pareto-optimal protocol with density.

In [16]:
setup_dens_pose = rosetta.protocols.electron_density.SetupForDensityScoringMover()

ref = pose_from_pdb('inputs/1jnd_refined.pdb.gz')

setup_dens_pose.apply(p)
setup_dens_pose.apply(ref)

score = get_score_function()
score_dens = get_score_function()
score_dens.set_weight(rosetta.core.scoring.elec_dens_fast, 25)

print("crystal", score_dens(p))
print("refined", score_dens(ref))
core.import_pose.import_pose: File 'inputs/1jnd_refined.pdb.gz' automatically determined to be of type PDB
core.io.pdb.pdb_reader: Parsing 411 .pdb records with unknown format to search for Rosetta-specific comments.
core.io.util: Automatic glycan connection is activated.
core.io.util: Start reordering residues.
core.io.util: Corrected glycan residue order (internal numbering): [401, 402, 403, 404]
core.io.util: 
core.io.pose_from_sfr.PoseFromSFRBuilder: [ WARNING ] Glc401 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.
core.io.pose_from_sfr.PoseFromSFRBuilder: [ WARNING ] Glc402 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.
core.io.pose_from_sfr.PoseFromSFRBuilder: [ WARNING ] Man403 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.
core.io.pose_from_sfr.PoseFromSFRBuilder: [ WARNING ] Man404 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.
core.chemical.AtomICoor: [ WARNING ] IcoorAtomID::atom_id(): Cannot get atom_id for POLYMER_LOWER of residue ->4)-beta-D-Glcp:2-AcNH 401.  Returning BOGUS ID instead.
core.conformation.Conformation: Found disulfide between residues 5 32
core.conformation.Conformation: current variant for 5 CYS
core.conformation.Conformation: current variant for 32 CYS
core.conformation.Conformation: current variant for 5 CYD
core.conformation.Conformation: current variant for 32 CYD
core.conformation.Conformation: Found disulfide between residues 302 385
core.conformation.Conformation: current variant for 302 CYS
core.conformation.Conformation: current variant for 385 CYS
core.conformation.Conformation: current variant for 302 CYD
core.conformation.Conformation: current variant for 385 CYD
core.conformation.carbohydrates.GlycanTreeSet: Setting up Glycan Trees
core.conformation.carbohydrates.GlycanTreeSet: Found 1 glycan trees.
core.scoring.ScoreFunctionFactory: SCOREFUNCTION: ref2015
core.scoring.ScoreFunctionFactory: The -include_sugars flag was used with no sugar_bb weight set in the weights file.  Setting sugar_bb weight to 1.0 by default.
core.scoring.ScoreFunctionFactory: SCOREFUNCTION: ref2015
core.scoring.ScoreFunctionFactory: The -include_sugars flag was used with no sugar_bb weight set in the weights file.  Setting sugar_bb weight to 1.0 by default.
core.scoring.electron_density.ElectronDensity: Setting [kmin_,kmax_] to [0.0637226,3.25267]
core.scoring.electron_density.ElectronDensity: Bin 1:  B(C/N/O/S)=0 / 0 / 0 / 8.60156  sum=(0,0)
crystal -6968.769963053168
refined -7770.2095688851
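
The cell above creates both score (no density term) and score_dens (the same function plus elec_dens_fast), but only prints the density-weighted totals. Since the two functions should differ only in that one weight, the difference of their totals gives the weighted density contribution. The helper below is a hypothetical sketch of that idea. It uses stand-in callables with made-up numbers so that it runs without PyRosetta; in the notebook you would pass p or ref together with score_dens and score.

```python
def density_contribution(pose, scorefxn_with_density, scorefxn_without_density):
    """Weighted density energy = total with density term - total without it."""
    return scorefxn_with_density(pose) - scorefxn_without_density(pose)


# Stand-in objects so the sketch runs on its own; the numbers are made up.
fake_pose = object()


def fake_score(pose):
    return -100.0  # pretend total without the density term


def fake_score_dens(pose):
    return -350.0  # pretend total with the density term


print(density_contribution(fake_pose, fake_score_dens, fake_score))
```

Separating the density contribution this way makes it easier to tell whether a lower total comes from a better fit to the map or from improvements in the other energy terms.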

Now let's minimize our pose.

In [17]:
minmover = MinMover()
mm = MoveMap()
mm.set_bb(True)
mm.set_chi(True)
minmover.set_movemap(mm)

if not os.getenv("DEBUG"):
    for i in range(1, 5):
        minmover.apply(p)
    
print(score_dens.score(p))
core.scoring.ScoreFunctionFactory: SCOREFUNCTION: ref2015
core.scoring.ScoreFunctionFactory: The -include_sugars flag was used with no sugar_bb weight set in the weights file.  Setting sugar_bb weight to 1.0 by default.
-6954.982255900806

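Why has the score gotten worse here? We minimized in dihedral (torsion) space instead of Cartesian space, so certain energies improve, but crystal refinement works best in Cartesian space. Note also that this MinMover was created without an explicit score function, so it appears to have minimized with its default (the log above shows ref2015 being loaded) rather than with score_dens; the density term did not guide the minimization. Let's now minimize in Cartesian space with a density-weighted score function and see what happens.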

In [18]:
p = original.clone()
setup_dens_pose.apply(p)

score_dens_cart = create_score_function("ref2015_cart")
score_dens_cart.set_weight(rosetta.core.scoring.elec_dens_fast, 25)

# Enable bond-angle (THETA) degrees of freedom in the MoveMap; bond lengths would use rosetta.core.id.D.
# This is easier and more straightforward to do if using a MoveMapFactory.
mm.set(rosetta.core.id.THETA, True)

minmover.cartesian(True)
minmover.score_function(score_dens_cart)
minmover.set_movemap(mm)

if not os.getenv("DEBUG"):
    for i in range(1, 5):
        minmover.apply(p)

print(score_dens.score(p))
core.scoring.CartesianBondedEnergy: Creating new peptide-bonded energy container (405)
core.pose.util: [ WARNING ] Unable to find atom_tree atom for this Rosetta branch connection angle: residue 5 BRANCH 1
core.pose.util: [ WARNING ] Unable to find atom_tree atom for this Rosetta branch connection angle: residue 32 BRANCH 1
core.pose.util: [ WARNING ] Unable to find atom_tree atom for this Rosetta branch connection angle: residue 302 BRANCH 1
core.pose.util: [ WARNING ] Unable to find atom_tree atom for this Rosetta branch connection angle: residue 385 BRANCH 1
-7166.488303021679
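
To put the printed totals side by side, the short sketch below tabulates the four score_dens values reported in this notebook and their differences from the crystal structure. The numbers are copied from the cell outputs above; the dictionary labels and the helper name deltas_vs are just illustrative. All four values were scored with score_dens, so they are directly comparable.

```python
# Totals printed by the cells above, all scored with score_dens.
scores = {
    "crystal": -6968.769963053168,
    "torsion-minimized": -6954.982255900806,
    "cartesian-minimized": -7166.488303021679,
    "pre-refined": -7770.2095688851,
}


def deltas_vs(reference, table):
    """Return each score minus the reference score (negative = better)."""
    base = table[reference]
    return {name: value - base for name, value in table.items()}


# Print from best (most negative change) to worst.
for name, delta in sorted(deltas_vs("crystal", scores).items(), key=lambda kv: kv[1]):
    print(f"{name:>20s}  {delta:+10.2f}")
```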

Is this closer to the pre-refined model (which was pre-refined in Relax)? Do the minimized and pre-refined models fit the density better than the crystal structure? Do they sit more fully within the density? How does the glycan density compare to the full protein density?

Conclusion

That should get you started using density with typical Rosetta modeling tasks! See the references for more complex protocols.


Chapter contributors:

  • Jared Adolf-Bryfogle (Scripps; Institute for Protein Innovation)