To start modeling membrane proteins, we must place the protein in the lipid bilayer. This begs an important question: how should the protein be oriented? The orientation of a protein in the bilayer is driven by a number of biophysical factors, such as burying nonpolar side chains in the hydrophobic membrane. For RosettaMP, there are three ways to choose the initial orientation. The choice is up to you, and often depends on how much information you have about your protein beforehand.
# Notebook setup import sys if 'google.colab' in sys.modules: !pip install pyrosettacolabsetup import pyrosettacolabsetup pyrosettacolabsetup.mount_pyrosetta_install() print ("Notebook is set for PyRosetta use in Colab. Have fun!")
Collecting pyrosettacolabsetup Downloading https://files.pythonhosted.org/packages/31/11/10a9140931c88352b51a91d2e6be8b929b6560b3683e5517d88c271ceb37/pyrosettacolabsetup-0.1.tar.gz Building wheels for collected packages: pyrosettacolabsetup Building wheel for pyrosettacolabsetup (setup.py) ... done Created wheel for pyrosettacolabsetup: filename=pyrosettacolabsetup-0.1-cp36-none-any.whl size=1694 sha256=a57372333cb399736ec8ad414460dc3e88fbb9e6e8106d86585dc13d4cdb67a0 Stored in directory: /root/.cache/pip/wheels/3a/2d/68/2a5b479b424b3df2b96d725177e1f42c9b85c446965d566c6c Successfully built pyrosettacolabsetup Installing collected packages: pyrosettacolabsetup Successfully installed pyrosettacolabsetup-0.1 Go to this URL in a browser: https://accounts.google.com/o/oauth2/auth?client_id=947318989803-6bn6qk8qdgf4n4g3pfee6491hc0brc4i.apps.googleusercontent.com&redirect_uri=urn%3aietf%3awg%3aoauth%3a2.0%3aoob&response_type=code&scope=email%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdocs.test%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive.photos.readonly%20https%3a%2f%2fwww.googleapis.com%2fauth%2fpeopleapi.readonly Enter your authorization code: ·········· Mounted at /content/google_drive Notebook is set for PyRosetta use in Colab. Have fun!
from pyrosetta import * pyrosetta.init()
PyRosetta-4 2019 [Rosetta PyRosetta4.MinSizeRel.python36.linux 2019.50+release.91b7a940f06ab065a81d9ce3046b08eef0de0b31 2019-12-12T23:03:24] retrieved from: http://www.pyrosetta.org (C) Copyright Rosetta Commons Member Institutions. Created in JHU by Sergey Lyskov and PyRosetta Team. core.init: Checking for fconfig files in pwd and ./rosetta/flags core.init: Rosetta version: PyRosetta4.MinSizeRel.python36.linux r240 2019.50+release.91b7a94 91b7a940f06ab065a81d9ce3046b08eef0de0b31 http://www.pyrosetta.org 2019-12-12T23:03:24 core.init: command: PyRosetta -ex1 -ex2aro -database /content/prefix/pyrosetta-2019.50+release.91b7a94-py3.6-linux-x86_64.egg/pyrosetta/database basic.random.init_random_generator: 'RNG device' seed mode, using '/dev/urandom', seed=1541953743 seed_offset=0 real_seed=1541953743 basic.random.init_random_generator: RandomGenerator:init: Normal mode, seed=1541953743 RG_type=mt19937
Make sure you are in the right directory for accessing the
cd google_drive/My\ Drive/student-notebooks/
#cd google_drive/My\ Drive/student-notebooks/
from pyrosetta.toolbox import cleanATOM cleanATOM("inputs/1afo.pdb") pose = pose_from_pdb("inputs/1afo.clean.pdb")
Then, initialize RosettaMP using AddMembraneMover. In this option, the orientation is known and you can estimate the transmembrane spans from the orientation. Therefore, we tell RosettaMP to estimate the spanning topology from structure:
from pyrosetta.rosetta.protocols.membrane import * addmem = AddMembraneMover("from_structure") addmem.apply(pose)
In this option, you will need to figure out what the transmembrane spans are. For this, you can used a sequence-based server such as OCTOPUS (http://octopus.cbr.su.se ). You will need to find the sequence of 1AFO on the PDB, copy/paste the sequence of one of the chains into OCTOPUS, and then save the output as a text file. Then, you will need to convert the output from OCTOPUS to the Rosetta format using the
Next, initialize RosettaMP with AddMembraneMover. Here, instead of specifying “from_structure”, you will specify the path to your spanning topology file:
from pyrosetta.rosetta.protocols.membrane import * if not os.getenv("DEBUG"): addmem = AddMembraneMover("inputs/1afo.span") addmem.apply(pose)
Let’s check some information about our current pose:
pose.conformation() shows information about all residues in the pose, fold_tree() shows information about the Edges of the FoldTree, and membrane_info() shows information about the membrane residue.
if not os.getenv("DEBUG"): ### BEGIN SOLUTION print(pose.conformation()) print(pose.conformation().membrane_info()) ### END SOLUTION
Questions: How many residues compose 1AFO? Which residue is the Membrane residue? How many transmembrane spans does membrane_info() say there are?
Understanding the fold tree is necessary to use movers that move parts of the protein with respect to other parts of the protein. For example, TiltMover requires a jump number and tilts the section after the jump number by a specified amount. SpinAroundPartnerMover spins one partner with respect to another, which also requires a jump number. We will explain the terminology shortly! Enter this code in the Python command line:
if not os.getenv("DEBUG"): ### BEGIN SOLUTION print(pose.conformation().fold_tree()) ### END SOLUTION
1AFO is a relatively simple protein with 2 chains, however PyMOL shows 3 chains. Next to the “1AFO_AB.pdb” line in PyMOL, click “label” and then “chains”. Select Chain C, then select “label” and then “residue name”. What is the only residue in Chain C, and therefore what does the third chain represent? Does it make sense that Chain C is the membrane representation and not physically part of the protein?
This information is shown in the fold tree data above, where we see one jump edge between residues 1 and 41, and a second jump edge for the membrane representation connecting MEM “residue” 81 to residue 1. Jump edges have a positive final number which increments for each jump. The edges with a negative final number indicate a peptide edge. Jump edges represent parts of the protein that are not physically connected to each other, and peptide edges represent parts that are physically connected.
Edge 1 40 -1 means that the edge connects residue 1 to residue 40, and it’s a physical connection. Therefore what does this Edge represent? It represents Chain A.
Edge 1 41 1 means that there is a physical separation between residues 1 and 41. Therefore what does this Edge represent? It represents the separation between Chain A and Chain B.
For a more in-depth review of fold trees, look at Rosetta documentation (https://www.rosettacommons.org/demos/latest/tutorials/fold_tree/fold_tree).
The key takeaway is that if we wanted to tilt one part of the protein with respect to another part of the protein, it doesn’t make sense to give TiltMover jump number 2, which is the membrane jump. It does make sense to give TiltMover jump number 1, because then we’re asking TiltMover to tilt Chain B with respect to Chain A.