This notebook contains material from PyRosetta; content is available on GitHub.

# Setting up a membrane protein in the bilayer

## Getting Started: Setting up the protein in the lipid bilayer

To start modeling membrane proteins, we must place the protein in the lipid bilayer. This raises an important question: how should the protein be oriented? The orientation of a protein in the bilayer is driven by a number of biophysical factors, such as burying nonpolar side chains in the hydrophobic membrane. For RosettaMP, there are three ways to choose the initial orientation. The choice is up to you, and often depends on how much information you have about your protein beforehand.

In [0]:
# Notebook setup
import sys
!pip install pyrosettacolabsetup
import pyrosettacolabsetup
pyrosettacolabsetup.mount_pyrosetta_install()
print ("Notebook is set for PyRosetta use in Colab.  Have fun!")

Collecting pyrosettacolabsetup
Building wheels for collected packages: pyrosettacolabsetup
Building wheel for pyrosettacolabsetup (setup.py) ... done
Created wheel for pyrosettacolabsetup: filename=pyrosettacolabsetup-0.1-cp36-none-any.whl size=1694 sha256=a57372333cb399736ec8ad414460dc3e88fbb9e6e8106d86585dc13d4cdb67a0
Stored in directory: /root/.cache/pip/wheels/3a/2d/68/2a5b479b424b3df2b96d725177e1f42c9b85c446965d566c6c
Successfully built pyrosettacolabsetup
Installing collected packages: pyrosettacolabsetup
Successfully installed pyrosettacolabsetup-0.1

Notebook is set for PyRosetta use in Colab.  Have fun!

In [0]:
from pyrosetta import *
pyrosetta.init()

PyRosetta-4 2019 [Rosetta PyRosetta4.MinSizeRel.python36.linux 2019.50+release.91b7a940f06ab065a81d9ce3046b08eef0de0b31 2019-12-12T23:03:24] retrieved from: http://www.pyrosetta.org
(C) Copyright Rosetta Commons Member Institutions. Created in JHU by Sergey Lyskov and PyRosetta Team.
core.init: Checking for fconfig files in pwd and ./rosetta/flags
core.init: Rosetta version: PyRosetta4.MinSizeRel.python36.linux r240 2019.50+release.91b7a94 91b7a940f06ab065a81d9ce3046b08eef0de0b31 http://www.pyrosetta.org 2019-12-12T23:03:24
core.init: command: PyRosetta -ex1 -ex2aro -database /content/prefix/pyrosetta-2019.50+release.91b7a94-py3.6-linux-x86_64.egg/pyrosetta/database
basic.random.init_random_generator: 'RNG device' seed mode, using '/dev/urandom', seed=1541953743 seed_offset=0 real_seed=1541953743
basic.random.init_random_generator: RandomGenerator:init: Normal mode, seed=1541953743 RG_type=mt19937


Make sure you are in the right directory for accessing the .pdb files:

cd google_drive/My\ Drive/student-notebooks/

In [0]:
#cd google_drive/My\ Drive/student-notebooks/

/content/google_drive/My Drive/student-notebooks


In [0]:
from pyrosetta.toolbox import cleanATOM
cleanATOM("inputs/1afo.pdb")
pose = pose_from_pdb("inputs/1afo.clean.pdb")


Then, initialize RosettaMP using AddMembraneMover. In this option, the orientation is known and you can estimate the transmembrane spans from the orientation. Therefore, we tell RosettaMP to estimate the spanning topology from structure:

In [0]:
from pyrosetta.rosetta.protocols.membrane import *
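
The cell above stops after the import. As a minimal sketch of the step described in the text, the helper below cleans a PDB file, loads it, and applies AddMembraneMover with the "from_structure" keyword. The function names `cleaned_pdb_path` and `load_pose_with_membrane` are hypothetical, and the sketch assumes that `pyrosetta.init()` has already been called and that cleanATOM writes its output next to the input as `<name>.clean.pdb`, which is the file the earlier cell loads.

```python
import os


def cleaned_pdb_path(pdb_path):
    """Return the file name cleanATOM is assumed to write for a .pdb input."""
    root, _ext = os.path.splitext(pdb_path)
    return root + ".clean.pdb"


def load_pose_with_membrane(pdb_path):
    """Clean and load a PDB file, then add the membrane representation."""
    # Imported lazily so this helper can be defined before PyRosetta is set up.
    from pyrosetta import pose_from_pdb
    from pyrosetta.toolbox import cleanATOM
    from pyrosetta.rosetta.protocols.membrane import AddMembraneMover

    cleanATOM(pdb_path)  # keeps only ATOM records
    pose = pose_from_pdb(cleaned_pdb_path(pdb_path))

    # "from_structure" asks RosettaMP to estimate the spans from the coordinates.
    add_memb = AddMembraneMover("from_structure")
    add_memb.apply(pose)
    return pose
```

With the inputs used in this notebook, the call would be `pose = load_pose_with_membrane("inputs/1afo.pdb")`.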


### Option 2: Estimate the transmembrane spans and use this information to choose an orientation

In this option, you will need to figure out what the transmembrane spans are. For this, you can use a sequence-based server such as OCTOPUS (http://octopus.cbr.su.se). You will need to find the sequence of 1AFO on the PDB, copy and paste the sequence of one of the chains into OCTOPUS, and then save the output as a text file. Then, you will need to convert the output from OCTOPUS to the Rosetta format using the octopus2memb script.

Next, initialize RosettaMP with AddMembraneMover. Here, instead of specifying “from_structure”, you will specify the path to your spanning topology file:

In [ ]:
import os
from pyrosetta.rosetta.protocols.membrane import *
if not os.getenv("DEBUG"):
    pass  # call AddMembraneMover with the path to your spanning topology file here
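
As a hedged sketch of this step, the helper below passes a span file path to AddMembraneMover instead of the "from_structure" keyword. The function name `add_membrane_from_spanfile` is hypothetical, and so is any path you pass in: use the file you produced with octopus2memb. Checking that the file exists first gives a clearer error than letting the mover fail on a missing path.

```python
import os


def add_membrane_from_spanfile(pose, spanfile):
    """Add the membrane representation using spans read from a span file."""
    if not os.path.isfile(spanfile):
        raise FileNotFoundError(f"Span file not found: {spanfile}")

    # Imported lazily so the path check runs even without PyRosetta loaded.
    from pyrosetta.rosetta.protocols.membrane import AddMembraneMover

    add_memb = AddMembraneMover(spanfile)
    add_memb.apply(pose)
    return pose
```

The pose should be a freshly loaded one; this sketch does not check whether a membrane residue was already added by an earlier cell.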


## Key Concepts for the membrane representation

1. AddMembraneMover adds an extra residue to the pose called the Membrane (MEM) residue. It is not a physical residue; it stores information about the membrane. In Rosetta’s representation, the MEM residue is attached to the protein through a special kind of connection called a “jump edge”, whereas connections between actual residues are called “peptide edges” (more on that in the fold tree section).
2. The spanning information is stored in a SpanningTopology object.

Let’s check some information about our current pose:

print(pose.conformation())
print(pose.conformation().membrane_info())



pose.conformation() shows information about all residues in the pose, fold_tree() shows information about the Edges of the FoldTree, and membrane_info() shows information about the membrane residue.

In [0]:
if not os.getenv("DEBUG"):
    ### BEGIN SOLUTION
    print(pose.conformation())
    print(pose.conformation().membrane_info())
    ### END SOLUTION


Questions: How many residues compose 1AFO? Which residue is the Membrane residue? How many transmembrane spans does membrane_info() say there are?
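
Besides reading the printed output, the same facts can be queried directly. The sketch below assumes a pose that already has a membrane residue and uses `pose.size()`, `membrane_rsd_num()`, `membrane_jump()`, and `spanning_topology().nspans()`; the helper name `summarize_membrane` is hypothetical. Note that the total residue count includes the MEM residue.

```python
def summarize_membrane(pose):
    """Collect basic membrane facts from a pose set up with AddMembraneMover."""
    info = pose.conformation().membrane_info()
    return {
        # Counts every residue in the pose, including the MEM residue.
        "total_residues": pose.size(),
        # Index of the MEM residue in the pose.
        "membrane_residue": info.membrane_rsd_num(),
        # Jump number of the edge that connects the MEM residue to the protein.
        "membrane_jump": info.membrane_jump(),
        # Number of transmembrane spans stored in the SpanningTopology.
        "n_spans": info.spanning_topology().nspans(),
    }
```

Calling `summarize_membrane(pose)` in a notebook cell returns a small dictionary you can compare against your answers.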

In [0]:



## Fold Tree

Understanding the fold tree is necessary to use movers that move parts of the protein with respect to other parts of the protein. For example, TiltMover requires a jump number and tilts the section after the jump number by a specified amount. SpinAroundPartnerMover spins one partner with respect to another, which also requires a jump number. We will explain the terminology shortly! Enter this code in the Python command line: print(pose.conformation().fold_tree())

In [0]:
if not os.getenv("DEBUG"):
    ### BEGIN SOLUTION
    print(pose.conformation().fold_tree())
    ### END SOLUTION


1AFO is a relatively simple protein with 2 chains; however, PyMOL shows 3 chains. Next to the “1AFO_AB.pdb” line in PyMOL, click “label” and then “chains”. Select Chain C, then select “label” and then “residue name”. What is the only residue in Chain C, and therefore what does the third chain represent? Does it make sense that Chain C is the membrane representation and not physically part of the protein?

This information is shown in the fold tree data above, where we see one jump edge between residues 1 and 41, and a second jump edge for the membrane representation connecting MEM “residue” 81 to residue 1. Jump edges have a positive final number, which increments for each jump. Edges with a negative final number are peptide edges. Jump edges connect parts of the pose that are not physically bonded to each other, and peptide edges connect parts that are.

Edge 1 40 -1 means that the edge connects residue 1 to residue 40, and it’s a physical connection. Therefore what does this Edge represent? It represents Chain A.

Edge 1 41 1 means that there is a physical separation between residues 1 and 41. Therefore what does this Edge represent? It represents the separation between Chain A and Chain B.
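
The sign convention can be checked with a few lines of plain Python. The sketch below parses fold tree text of the form `EDGE start stop label` and tags each edge as a jump or a peptide edge. The helper name `classify_edges` and the example string are illustrative assumptions: the string is built from the edges discussed here, with `EDGE 41 80 -1` for Chain B inferred rather than copied from real output.

```python
def classify_edges(fold_tree_text):
    """Parse 'EDGE start stop label' groups and tag each as jump or peptide."""
    tokens = fold_tree_text.split()
    edges = []
    i = 0
    while i < len(tokens):
        if tokens[i] == "EDGE" and i + 3 < len(tokens):
            start, stop, label = (int(t) for t in tokens[i + 1:i + 4])
            # Positive label: jump number. Negative label: peptide edge.
            kind = "jump" if label > 0 else "peptide"
            edges.append((start, stop, label, kind))
            i += 4
        else:
            i += 1
    return edges


# Illustrative text only; compare it with what your own pose prints.
EXAMPLE = "FOLD_TREE  EDGE 1 40 -1  EDGE 1 41 1  EDGE 41 80 -1  EDGE 1 81 2"

for start, stop, label, kind in classify_edges(EXAMPLE):
    print(f"{start:>3} -> {stop:<3} label {label:>2}: {kind}")
```

To try it on a real pose, pass `str(pose.conformation().fold_tree())` instead of the example string.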

For a more in-depth review of fold trees, look at Rosetta documentation (https://www.rosettacommons.org/demos/latest/tutorials/fold_tree/fold_tree).

The key takeaway is that if we wanted to tilt one part of the protein with respect to another part of the protein, it doesn’t make sense to give TiltMover jump number 2, which is the membrane jump. It does make sense to give TiltMover jump number 1, because then we’re asking TiltMover to tilt Chain B with respect to Chain A.
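
One way to avoid passing the membrane jump by mistake is to look it up and exclude it. The sketch below is written under assumptions: the helper names are hypothetical, `pose.num_jump()` and `membrane_info().membrane_jump()` are used to list the jumps, and TiltMover is assumed to accept a jump number and an angle in its constructor, matching the description earlier in this section. Check the constructor against your PyRosetta version before relying on it.

```python
def protein_jumps(pose):
    """Return jump numbers between protein parts, skipping the membrane jump."""
    membrane_jump = pose.conformation().membrane_info().membrane_jump()
    return [j for j in range(1, pose.num_jump() + 1) if j != membrane_jump]


def tilt_first_protein_jump(pose, angle):
    """Tilt the part of the pose downstream of the first non-membrane jump."""
    # Imported lazily so protein_jumps can be used without TiltMover.
    from pyrosetta.rosetta.protocols.membrane import TiltMover

    jumps = protein_jumps(pose)
    if not jumps:
        raise ValueError("Pose has no jump besides the membrane jump")

    # Assumed constructor arguments: (jump number, angle).
    tilt = TiltMover(jumps[0], float(angle))
    tilt.apply(pose)
    return jumps[0]
```

For the fold tree described above, `protein_jumps(pose)` would be expected to return only jump 1, the jump between Chain A and Chain B.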

In [0]: