Keywords: crystal, density, refinement, LoadDensityMapMover, map, EM, X-ray
Experimental density is a useful source of information in Rosetta: it can be used to refine a PDB structure, test Rosetta structure prediction methods, and build de novo models guided by density maps of varying resolution. It can also serve as an experimentally determined guide or constraint for custom methods.
Rosetta understands X-ray or electron density through scoring. In this tutorial, we will first walk through how to create a density file that Rosetta understands, and then load it, score it, and refine a structure using it. We will also cover some common tools used while working with density.
More complicated protocols are available through RosettaScripts in PyRosetta, and still more through applications built in main Rosetta. We will not cover these in this introductory tutorial; please refer to the references listed below (such as "De novo protein structure determination...") for how to run these applications.
Note that symmetry and density can be used together!
More information on density scoring and relevant applications can be found here:
Atomic-accuracy models from 4.5-Å cryo-electron microscopy data with density-guided iterative local refinement.
De novo protein structure determination from near-atomic-resolution cryo-EM maps.
Tools for Model Building and Optimization into Near-Atomic Resolution Electron Cryo-Microscopy Density Maps.
Automated structure refinement of macromolecular assemblies from cryo-EM maps using Rosetta.
Rosetta Structure Prediction as a Tool for Solving Difficult Molecular Replacement Problems.
RosettaES: a sampling strategy enabling automated interpretation of difficult cryo-EM maps.
Automatically Fixing Errors in Glycoprotein Structures with Rosetta.
!pip install pyrosettacolabsetup
import pyrosettacolabsetup; pyrosettacolabsetup.install_pyrosetta()
import pyrosetta; pyrosetta.init()
We won't actually create a density map in this tutorial, as doing so requires a third-party application (Phenix), but the (.ccp4) density map you will be using was created this way.
Download the CIF file from the Protein Data Bank for the structure you are interested in.
Use this command to create a map: phenix.maps pdb_path cif_path
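As a small sketch of the step above, the command can be assembled in Python before handing it to the shell. The helper name and the two file names are hypothetical; only the `phenix.maps pdb_path cif_path` form comes from this tutorial, and actually running it assumes Phenix is installed and on your PATH.

```python
import shlex


def build_phenix_maps_command(pdb_path, cif_path):
    """Return the argument list for `phenix.maps pdb_path cif_path`."""
    return ["phenix.maps", str(pdb_path), str(cif_path)]


# Hypothetical file names, used only for illustration.
cmd = build_phenix_maps_command("1jnd.pdb", "1jnd-sf.cif")
print(shlex.join(cmd))

# To actually run it (requires a Phenix installation):
# import subprocess
# subprocess.run(cmd, check=True)
```

Keeping the command as a list avoids quoting problems if a path contains spaces.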
Here, we will make a scorefunction, set the elec_dens_fast weight to an appropriate value, and use the LoadDensityMapMover to actually load the density map. Note that (currently) the density map is GLOBAL data within Rosetta, so only a single structure can be modeled/refined at a time.
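The code later in this tutorial loads the map through an XML tag rather than through the mover's constructor. A minimal sketch of building that tag for an arbitrary map file is shown here; the helper name is hypothetical, and the tag and attribute names are copied from the tutorial's own `LoadDensityMap` string.

```python
import xml.etree.ElementTree as ET


def load_density_map_tag(mapfile, name="loaddens"):
    """Build a <LoadDensityMap .../> tag string for the given map file."""
    elem = ET.Element("LoadDensityMap", {"name": name, "mapfile": mapfile})
    return ET.tostring(elem, encoding="unicode")


tag = load_density_map_tag("inputs/1jnd_2mFo-DFc_map.ccp4")
print(tag)

# Round-trip check: the attributes survive parsing.
parsed = ET.fromstring(tag)
```

The resulting string can be passed to `XmlObjects.static_get_mover`, as the tutorial code does. Building it with an XML library rather than string formatting keeps unusual characters in file names properly escaped.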
Several scoring functions have been added to Rosetta which describe how well a structure agrees with experimental density data. Density map data is read in CCP4/MRC format (the density has to minimally cover the asymmetric unit). The various scoring functions trade off speed versus accuracy, and their use should be primarily determined by the resolution of the density map data:
Additionally, a slower but more precise scoring function is available. This is only recommended if elec_dens_fast performs poorly (for example, if map quality varies significantly throughout the map):
The weights to use vary depending on resolution of data but the following give reasonable ranges:
from pyrosetta import *
from pyrosetta.rosetta import *
from pyrosetta.teaching import *
import os
init('-ignore_unrecognized_res -load_PDB_components false -ignore_zero_occupancy false @inputs/glycan_flags')
PyRosetta-4 2019 [Rosetta PyRosetta4.Release.python36.mac 2019.39+release.93456a567a8125cafdf7f8cb44400bc20b570d81 2019-09-26T14:24:44] retrieved from: http://www.pyrosetta.org (C) Copyright Rosetta Commons Member Institutions. Created in JHU by Sergey Lyskov and PyRosetta Team. core.init: Checking for fconfig files in pwd and ./rosetta/flags core.init: Reading fconfig.../Users/jadolfbr/.rosetta/flags/common core.init: core.init: core.init: Rosetta version: PyRosetta4.Release.python36.mac r233 2019.39+release.93456a567a8 93456a567a8125cafdf7f8cb44400bc20b570d81 http://www.pyrosetta.org 2019-09-26T14:24:44 core.init: command: PyRosetta -ignore_unrecognized_res -load_PDB_components false -ignore_zero_occupancy false @inputs/glycan_flags -database /Users/jadolfbr/Library/Python/3.6/lib/python/site-packages/pyrosetta-2019.39+release.93456a567a8-py3.6-macosx-10.6-intel.egg/pyrosetta/database basic.random.init_random_generator: 'RNG device' seed mode, using '/dev/urandom', seed=-370139418 seed_offset=0 real_seed=-370139418 basic.random.init_random_generator: RandomGenerator:init: Normal mode, seed=-370139418 RG_type=mt19937
from rosetta.protocols.cryst import *
from rosetta.protocols.rosetta_scripts import *
p = pose_from_pdb('inputs/1jnd.pdb')
original = p.clone()
#The LoadDensityMapMover unfortunately does not have getters and setters yet.
# This has been updated in the Rosetta C++ code, but for now, we have to use the XML interface.
#setup_dens = LoadDensityMapMover("inputs/1jnd_2mFo-DFc_map.ccp4")
setup_dens = XmlObjects.static_get_mover('<LoadDensityMap name="loaddens" mapfile="inputs/1jnd_2mFo-DFc_map.ccp4"/>')
setup_dens.apply(p)
core.import_pose.import_pose: File 'inputs/1jnd.pdb' automatically determined to be of type PDB core.io.pdb.pdb_reader: Parsing 82 .pdb records with unknown format to search for Rosetta-specific comments. core.io.util: Automatic glycan connection is activated. core.io.util: Start reordering residues. core.io.util: Corrected glycan residue order (internal numbering): [401, 402, 403, 404] core.io.util: core.io.pose_from_sfr.PoseFromSFRBuilder: Setting chain termination for 404 core.io.pose_from_sfr.PoseFromSFRBuilder: [ WARNING ] Glc401 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned. core.io.pose_from_sfr.PoseFromSFRBuilder: [ WARNING ] Glc402 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned. core.io.pose_from_sfr.PoseFromSFRBuilder: [ WARNING ] Man403 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned. core.io.pose_from_sfr.PoseFromSFRBuilder: [ WARNING ] Man404 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned. core.chemical.AtomICoor: [ WARNING ] IcoorAtomID::atom_id(): Cannot get atom_id for POLYMER_LOWER of residue ->4)-beta-D-Glcp:2-AcNH 401. Returning BOGUS ID instead. core.conformation.Residue: [ WARNING ] missing an atom: 401 H1 that depends on a nonexistent polymer connection! core.conformation.Residue: [ WARNING ] --> generating it using idealized coordinates. core.chemical.AtomICoor: [ WARNING ] IcoorAtomID::atom_id(): Cannot get atom_id for POLYMER_LOWER of residue ->4)-beta-D-Glcp:2-AcNH 401. Returning BOGUS ID instead. 
core.conformation.Conformation: Found disulfide between residues 5 32 core.conformation.Conformation: current variant for 5 CYS core.conformation.Conformation: current variant for 32 CYS core.conformation.Conformation: current variant for 5 CYD core.conformation.Conformation: current variant for 32 CYD core.conformation.Conformation: Found disulfide between residues 302 385 core.conformation.Conformation: current variant for 302 CYS core.conformation.Conformation: current variant for 385 CYS core.conformation.Conformation: current variant for 302 CYD core.conformation.Conformation: current variant for 385 CYD core.conformation.carbohydrates.GlycanTreeSet: Setting up Glycan Trees core.conformation.carbohydrates.GlycanTreeSet: Found 1 glycan trees. protocols.rosetta_scripts.RosettaScriptsParser: Generating XML Schema for rosetta_scripts... protocols.rosetta_scripts.RosettaScriptsParser: ...done protocols.rosetta_scripts.RosettaScriptsParser: Initializing schema validator... protocols.rosetta_scripts.RosettaScriptsParser: ...done protocols.rosetta_scripts.RosettaScriptsParser: Validating input script... protocols.rosetta_scripts.RosettaScriptsParser: ...done protocols.rosetta_scripts.RosettaScriptsParser: Parsed script: <ROSETTASCRIPTS> <MOVERS> <LoadDensityMap mapfile="inputs/1jnd_2mFo-DFc_map.ccp4" name="loaddens"/> </MOVERS> <PROTOCOLS/> </ROSETTASCRIPTS> core.scoring.ScoreFunctionFactory: SCOREFUNCTION: ref2015 core.scoring.ScoreFunctionFactory: The -include_sugars flag was used with no sugar_bb weight set in the weights file. Setting sugar_bb weight to 1.0 by default. 
protocols.rosetta_scripts.RosettaScriptsParser: Defined mover named "loaddens" of type LoadDensityMap protocols.rosetta_scripts.ParsedProtocol: ParsedProtocol mover with the following movers and filters core.scoring.electron_density.ElectronDensity: Loading Density Map core.scoring.electron_density.ElectronDensity: Loading density mapinputs/1jnd_2mFo-DFc_map.ccp4 core.scoring.electron_density.ElectronDensity: Setting resolution to AUTO core.scoring.electron_density.ElectronDensity: atom mask to 3.2A core.scoring.electron_density.ElectronDensity: CA mask to 6A core.scoring.electron_density.ElectronDensity: Read density map'inputs/1jnd_2mFo-DFc_map.ccp4' core.scoring.electron_density.ElectronDensity: extent: 218 x 229 x 265 core.scoring.electron_density.ElectronDensity: origin: -66 x 82 x 209 core.scoring.electron_density.ElectronDensity: altorigin: 0 x 0 x 0 core.scoring.electron_density.ElectronDensity: grid: 360 x 360 x 288 core.scoring.electron_density.ElectronDensity: celldim: 106.345 x 106.345 x 89.987 core.scoring.electron_density.ElectronDensity: cellangles: 90 x 90 x 120 core.scoring.electron_density.ElectronDensity: voxel vol.: 0.0236128 core.scoring.electron_density.ElectronDensity: Effective resolution = 0.937365
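The extent, origin, grid, and cell values in the log above are read from the map header. As a rough sketch of where they live, the snippet below decodes the first 16 words of a CCP4/MRC header with the standard library. It is not Rosetta's reader: it assumes little-endian byte order, ignores the axis-order fields (MAPC/MAPR/MAPS), and is exercised on a synthetic header packed from the logged values (with an assumed mode of 2) rather than on the real file.

```python
import struct

HEADER_BYTES = 1024  # CCP4/MRC maps begin with a 256-word (1024-byte) header


def parse_map_header(header, byte_order="<"):
    """Decode the first 16 header words; little-endian is assumed by default."""
    if len(header) < 64:
        raise ValueError("need at least the first 64 header bytes")
    ints = struct.unpack(byte_order + "10i", header[:40])
    floats = struct.unpack(byte_order + "6f", header[40:64])
    return {
        "extent": ints[0:3],       # NC, NR, NS: stored columns, rows, sections
        "mode": ints[3],           # data type code (2 means 32-bit floats)
        "start": ints[4:7],        # NCSTART, NRSTART, NSSTART
        "grid": ints[7:10],        # NX, NY, NZ: sampling along the unit cell
        "celldim": floats[0:3],    # a, b, c in Angstroms
        "cellangles": floats[3:6], # alpha, beta, gamma in degrees
    }


# Synthetic header built from the values in the log (not read from the real map).
fake = struct.pack(
    "<10i6f",
    218, 229, 265, 2, -66, 82, 209, 360, 360, 288,
    106.345, 106.345, 89.987, 90.0, 90.0, 120.0,
).ljust(HEADER_BYTES, b"\0")

info = parse_map_header(fake)
print(info["extent"], info["start"], info["grid"], info["cellangles"])
```

For a real file, the same function could be applied to the first 1024 bytes read in binary mode, keeping in mind that the stored column/row/section order need not match x/y/z.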
Now, we need to set the pose up for density scoring. The SetupForDensityScoring mover sets a specific fold tree on the pose so that it can be scored properly against the map. We will then create a scorefunction with our density score term, and load a pre-refined pose that was refined into the density using the pareto-optimal protocol with density.
setup_dens_pose = rosetta.protocols.electron_density.SetupForDensityScoringMover()
ref = pose_from_pdb('inputs/1jnd_refined.pdb.gz')
setup_dens_pose.apply(p)
setup_dens_pose.apply(ref)
score = get_score_function()
score_dens = get_score_function()
score_dens.set_weight(rosetta.core.scoring.elec_dens_fast, 25)
print("crystal", score_dens(p))
print("refined", score_dens(ref))
core.import_pose.import_pose: File 'inputs/1jnd_refined.pdb.gz' automatically determined to be of type PDB core.io.pdb.pdb_reader: Parsing 411 .pdb records with unknown format to search for Rosetta-specific comments. core.io.util: Automatic glycan connection is activated. core.io.util: Start reordering residues. core.io.util: Corrected glycan residue order (internal numbering): [401, 402, 403, 404] core.io.util: core.io.pose_from_sfr.PoseFromSFRBuilder: [ WARNING ] Glc401 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned. core.io.pose_from_sfr.PoseFromSFRBuilder: [ WARNING ] Glc402 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned. core.io.pose_from_sfr.PoseFromSFRBuilder: [ WARNING ] Man403 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned. core.io.pose_from_sfr.PoseFromSFRBuilder: [ WARNING ] Man404 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned. core.chemical.AtomICoor: [ WARNING ] IcoorAtomID::atom_id(): Cannot get atom_id for POLYMER_LOWER of residue ->4)-beta-D-Glcp:2-AcNH 401. Returning BOGUS ID instead. 
core.conformation.Conformation: Found disulfide between residues 5 32 core.conformation.Conformation: current variant for 5 CYS core.conformation.Conformation: current variant for 32 CYS core.conformation.Conformation: current variant for 5 CYD core.conformation.Conformation: current variant for 32 CYD core.conformation.Conformation: Found disulfide between residues 302 385 core.conformation.Conformation: current variant for 302 CYS core.conformation.Conformation: current variant for 385 CYS core.conformation.Conformation: current variant for 302 CYD core.conformation.Conformation: current variant for 385 CYD core.conformation.carbohydrates.GlycanTreeSet: Setting up Glycan Trees core.conformation.carbohydrates.GlycanTreeSet: Found 1 glycan trees. core.scoring.ScoreFunctionFactory: SCOREFUNCTION: ref2015 core.scoring.ScoreFunctionFactory: The -include_sugars flag was used with no sugar_bb weight set in the weights file. Setting sugar_bb weight to 1.0 by default. core.scoring.ScoreFunctionFactory: SCOREFUNCTION: ref2015 core.scoring.ScoreFunctionFactory: The -include_sugars flag was used with no sugar_bb weight set in the weights file. Setting sugar_bb weight to 1.0 by default. core.scoring.electron_density.ElectronDensity: Setting [kmin_,kmax_] to [0.0637226,3.25267] core.scoring.electron_density.ElectronDensity: Bin 1: B(C/N/O/S)=0 / 0 / 0 / 8.60156 sum=(0,0) crystal -6968.769963053168 refined -7770.2095688851
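To build intuition for what "agreement with density" can mean, here is a toy sketch that computes a real-space correlation between a model-derived density and an observed density on the same grid. It is not how elec_dens_fast is implemented; the 1D grid, the Gaussian atom shape, the atom positions, and all function names are illustrative assumptions.

```python
import math


def gaussian_density(atom_positions, grid, sigma=1.0):
    """Sum one Gaussian per atom at each grid point (a toy 1D 'map')."""
    return [
        sum(math.exp(-((x - a) ** 2) / (2.0 * sigma ** 2)) for a in atom_positions)
        for x in grid
    ]


def correlation(rho_a, rho_b):
    """Pearson correlation between two densities sampled on the same grid."""
    n = len(rho_a)
    mean_a = sum(rho_a) / n
    mean_b = sum(rho_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(rho_a, rho_b))
    var_a = sum((a - mean_a) ** 2 for a in rho_a)
    var_b = sum((b - mean_b) ** 2 for b in rho_b)
    return cov / math.sqrt(var_a * var_b)


grid = [0.25 * i for i in range(81)]                 # 0 to 20 in 0.25 steps
observed = gaussian_density([5.0, 9.0, 13.0], grid)  # stands in for the experimental map

good_model = [5.1, 9.0, 12.9]   # atoms close to the density peaks
poor_model = [3.0, 10.5, 16.0]  # atoms displaced from the peaks

cc_good = correlation(observed, gaussian_density(good_model, grid))
cc_poor = correlation(observed, gaussian_density(poor_model, grid))
print(f"good fit CC = {cc_good:.3f}, poor fit CC = {cc_poor:.3f}")
```

A model whose atoms sit on the density peaks gives a correlation near 1, while displaced atoms lower it. In the tutorial's scores, better agreement shows up as a more negative total once the density term is weighted in.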
Now let's minimize our pose.
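# Note: no score function is passed here, so MinMover appears to use its own default rather than score_dens.
minmover = MinMover()
mm = MoveMap()
mm.set_bb(True)   # allow backbone torsions to move
mm.set_chi(True)  # allow side-chain torsions to move
minmover.set_movemap(mm)
# Skip the slow minimization when the DEBUG environment variable is set.
if not os.getenv("DEBUG"):
    for i in range(1, 5):  # four rounds of minimization
        minmover.apply(p)
    print(score_dens.score(p))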
core.scoring.ScoreFunctionFactory: SCOREFUNCTION: ref2015
core.scoring.ScoreFunctionFactory: The -include_sugars flag was used with no sugar_bb weight set in the weights file. Setting sugar_bb weight to 1.0 by default.
-6954.982255900806
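Why has the score gotten worse here? We minimized in dihedral (torsion) space, which improves some energy terms, but crystal refinement works best in Cartesian space. Note also that this MinMover was never given score_dens, so it appears to have minimized with its default score function (the log shows ref2015 being loaded), which has no density term. Let's now minimize in Cartesian space with a density-weighted score function and see what happens.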
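As a toy illustration of why the coordinate system matters, the sketch below fits a three-atom chain in 2D to target positions in two ways: by varying only bond directions with fixed bond lengths (an internal-coordinate analogue), and by moving every x and y freely (a Cartesian analogue). The targets, step size, and function names are made up for illustration. Unlike Rosetta's Cartesian minimization, which restrains geometry through the Cartesian bonded energy, this toy applies no geometry restraints at all.

```python
import math

TARGET = [(1.1, 0.0), (1.1, 0.9)]  # where the "data" want atoms 1 and 2 (atom 0 is at the origin)
BOND = 1.0                         # ideal bond length, held fixed in internal coordinates


def misfit(coords):
    """Sum of squared distances between atoms and their targets."""
    return sum((x - tx) ** 2 + (y - ty) ** 2 for (x, y), (tx, ty) in zip(coords, TARGET))


def build_from_angles(angles):
    """Place atoms 1 and 2 from two bond directions with fixed bond lengths."""
    a1, a2 = angles
    p1 = (BOND * math.cos(a1), BOND * math.sin(a1))
    p2 = (p1[0] + BOND * math.cos(a2), p1[1] + BOND * math.sin(a2))
    return [p1, p2]


def minimize(objective, start, step=0.05, iterations=2000, h=1e-6):
    """Plain gradient descent using central finite differences."""
    x = list(start)
    for _ in range(iterations):
        grad = []
        for i in range(len(x)):
            up = x[:i] + [x[i] + h] + x[i + 1:]
            down = x[:i] + [x[i] - h] + x[i + 1:]
            grad.append((objective(up) - objective(down)) / (2.0 * h))
        x = [xi - step * gi for xi, gi in zip(x, grad)]
    return x


# Internal coordinates: only the two bond directions may change.
best_angles = minimize(lambda a: misfit(build_from_angles(a)), [0.3, 1.0])
internal_misfit = misfit(build_from_angles(best_angles))

# Cartesian coordinates: every x and y may change independently.
flat = minimize(lambda v: misfit([(v[0], v[1]), (v[2], v[3])]), [1.0, 0.0, 1.0, 1.0])
cartesian_misfit = misfit([(flat[0], flat[1]), (flat[2], flat[3])])

print(f"internal: {internal_misfit:.4f}  Cartesian: {cartesian_misfit:.2e}")
```

Because the first target lies 1.1 units from the origin while the bond length is fixed at 1.0, the internal-coordinate fit can never reach zero misfit, whereas the Cartesian fit can. The same freedom is what makes restraints on bond geometry important in real Cartesian refinement.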
p = original.clone()
setup_dens_pose.apply(p)
score_dens_cart = create_score_function("ref2015_cart")
score_dens_cart.set_weight(rosetta.core.scoring.elec_dens_fast, 25)
# Allow bond angles (THETA) to move; bond lengths (rosetta.core.id.D) are not enabled here.
# This is easier and more straightforward to do if using a MoveMapFactory.
mm.set(rosetta.core.id.THETA, True)
minmover.cartesian(True)
minmover.score_function(score_dens_cart)
minmover.set_movemap(mm)
if not os.getenv("DEBUG"):
    for i in range(1, 5):
        minmover.apply(p)
    # Score with score_dens so the value is comparable to the earlier numbers.
    print(score_dens.score(p))
core.scoring.CartesianBondedEnergy: Creating new peptide-bonded energy container (405)
core.pose.util: [ WARNING ] Unable to find atom_tree atom for this Rosetta branch connection angle: residue 5 BRANCH 1
core.pose.util: [ WARNING ] Unable to find atom_tree atom for this Rosetta branch connection angle: residue 32 BRANCH 1
core.pose.util: [ WARNING ] Unable to find atom_tree atom for this Rosetta branch connection angle: residue 302 BRANCH 1
core.pose.util: [ WARNING ] Unable to find atom_tree atom for this Rosetta branch connection angle: residue 385 BRANCH 1
-7166.488303021679
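The four totals printed in this tutorial were all computed with score_dens, so they can be compared directly. A short sketch that tabulates each one relative to the crystal structure (the labels are descriptive names chosen here, and the numbers are copied from the outputs above):

```python
scores = {
    "crystal (input)": -6968.769963053168,
    "dihedral minimization": -6954.982255900806,
    "Cartesian minimization": -7166.488303021679,
    "pre-refined reference": -7770.2095688851,
}

baseline = scores["crystal (input)"]
deltas = {label: value - baseline for label, value in scores.items()}

for label, delta in deltas.items():
    print(f"{label:<24s} {scores[label]:>10.2f} {delta:>+9.2f}")
```

Negative differences mean a lower (better) score than the input structure. Dihedral minimization moved the score up by about 13.8 units, Cartesian minimization lowered it by about 197.7, and the pre-refined model is lower by about 801.4.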
Is this score closer to that of the pre-refined model (which was refined with Relax)? Do the minimized and pre-refined models fit the density better than the crystal structure? Do they sit more fully within the density? How does the glycan density compare to the density for the protein as a whole?
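One simple way to put a number on "closer to the pre-refined model" is a coordinate RMSD over matching atoms. The sketch below uses made-up coordinates and a hypothetical helper. It applies no superposition, on the assumption that both models already sit in the same map frame.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and the same length")
    total = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


# Hypothetical coordinates: the "model" is the reference shifted by (0, 0.3, 0.4).
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
model = [(x, y + 0.3, z + 0.4) for x, y, z in reference]

value = rmsd(reference, model)
print(f"RMSD = {value:.2f}")
```

With real poses, the coordinate lists would come from matching atoms in the minimized and pre-refined structures, for example the C-alpha atoms.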
That should get you started using density with typical Rosetta modeling tasks! See the references for more complex protocols.
Chapter contributors: