
The Logic of ResidueSelectors

@Author: 槐喆 @email: zhe.huai@xtalpi.com

@Proofread: 吴炜坤 @email: weikun.wu@xtalpi.com

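The ResidueSelector is one of the most useful tools in Rosetta: it picks a subset of residues out of a protein structure (Pose). These subsets drive the logic of later modeling steps. They can define the degrees of freedom for design or sampling (for example, a ResidueSelector can pick the residues within 5 Å of the center of the protein core, which are then passed to side-chain energy minimization or another structure optimization step), and they can be combined with SimpleMetrics, Filters, and similar components to collect statistics on protein properties or parameters.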

Note: the concept behind ResidueSelectors is simple and beginner-friendly, so this chapter is relatively easy to follow.

1. ResidueSelector and vector1_bool

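In PyRosetta, once a ResidueSelector is defined, calling its apply method on a Pose (that is, running the selection) returns the selected subset of residues. The result is stored in a vector1_bool object, as the following example shows: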

In [1]:

# Import the chain selector
from pyrosetta import pose_from_pdb, init
from pyrosetta.rosetta.core.select.residue_selector import ChainSelector
init()
# Build a Pose object from the PDB file (hepatocyte growth factor antibody, PDB: 6LZ9)
pose = pose_from_pdb('./data/6LZ9_H_L.pdb')

PyRosetta-4 2021 [Rosetta PyRosetta4.conda.mac.cxx11thread.serialization.python37.Release 2021.31+release.c7009b3115c22daa9efe2805d9d1ebba08426a54 2021-08-07T10:04:12] retrieved from: http://www.pyrosetta.org
(C) Copyright Rosetta Commons Member Institutions. Created in JHU by Sergey Lyskov and PyRosetta Team.
core.init: {0} Checking for fconfig files in pwd and ./rosetta/flags
core.init: {0} Rosetta version: PyRosetta4.conda.mac.cxx11thread.serialization.python37.Release r292 2021.31+release.c7009b3115c c7009b3115c22daa9efe2805d9d1ebba08426a54 http://www.pyrosetta.org 2021-08-07T10:04:12
core.init: {0} command: PyRosetta -ex1 -ex2aro -database /opt/miniconda3/lib/python3.7/site-packages/pyrosetta/database
basic.random.init_random_generator: {0} 'RNG device' seed mode, using '/dev/urandom', seed=1885576095 seed_offset=0 real_seed=1885576095 thread_index=0
basic.random.init_random_generator: {0} RandomGenerator:init: Normal mode, seed=1885576095 RG_type=mt19937
core.chemical.GlobalResidueTypeSet: {0} Finished initializing fa_standard residue type set.  Created 983 residue types
core.chemical.GlobalResidueTypeSet: {0} Total time to initialize 0.651952 seconds.
core.import_pose.import_pose: {0} File './data/6LZ9_H_L.pdb' automatically determined to be of type PDB
core.conformation.Conformation: {0} Found disulfide between residues 21 94
core.conformation.Conformation: {0} current variant for 21 CYS
core.conformation.Conformation: {0} current variant for 94 CYS
core.conformation.Conformation: {0} current variant for 21 CYD
core.conformation.Conformation: {0} current variant for 94 CYD
core.conformation.Conformation: {0} Found disulfide between residues 141 206
core.conformation.Conformation: {0} current variant for 141 CYS
core.conformation.Conformation: {0} current variant for 206 CYS
core.conformation.Conformation: {0} current variant for 141 CYD
core.conformation.Conformation: {0} current variant for 206 CYD

(Image credit: XtalPi team)

In [2]:

# First, look at the basic residue information of the antibody:
print(pose.pdb_info())

PDB file name: ./data/6LZ9_H_L.pdb
 Pose Range  Chain    PDB Range  |   #Residues         #Atoms

0001 -- 0081    H 0002  -- 0082  |   0081 residues;    01283 atoms
0082 -- 0082    H 0082A -- 0082A |   0001 residues;    00011 atoms
0083 -- 0083    H 0082B -- 0082B |   0001 residues;    00011 atoms
0084 -- 0084    H 0082C -- 0082C |   0001 residues;    00019 atoms
0085 -- 0102    H 0083  -- 0100  |   0018 residues;    00271 atoms
0103 -- 0103    H 0100A -- 0100A |   0001 residues;    00010 atoms
0104 -- 0104    H 0100B -- 0100B |   0001 residues;    00021 atoms
0105 -- 0105    H 0100C -- 0100C |   0001 residues;    00021 atoms
0106 -- 0106    H 0100D -- 0100D |   0001 residues;    00010 atoms
0107 -- 0107    H 0100E -- 0100E |   0001 residues;    00017 atoms
0108 -- 0118    H 0101  -- 0111  |   0011 residues;    00160 atoms
0119 -- 0223    L 0001  -- 0105  |   0105 residues;    01600 atoms
                           TOTAL |   0223 residues;    03434 atoms

In [3]:

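print(f'Number of chains in the antibody: {pose.num_chains()}')
print(f'Number of residues in the antibody: {pose.total_residue()}')

Number of chains in the antibody: 2
Number of residues in the antibody: 223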

The antibody has two chains: chain H spans Pose residues 1-118 and chain L spans Pose residues 119-223.
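The table printed by `pose.pdb_info()` can be read as a lookup from Pose numbering to PDB numbering. The plain-Python sketch below hard-codes the segments from that table for this structure; it does not call PyRosetta, and `SEGMENTS` and `pose_to_pdb` are hypothetical names introduced only for illustration.

```python
# Segments copied from the pdb_info() table for 6LZ9_H_L.pdb:
# (pose_start, pose_end, chain, pdb_start, insertion_code)
SEGMENTS = [
    (1, 81, "H", 2, ""),
    (82, 82, "H", 82, "A"),
    (83, 83, "H", 82, "B"),
    (84, 84, "H", 82, "C"),
    (85, 102, "H", 83, ""),
    (103, 103, "H", 100, "A"),
    (104, 104, "H", 100, "B"),
    (105, 105, "H", 100, "C"),
    (106, 106, "H", 100, "D"),
    (107, 107, "H", 100, "E"),
    (108, 118, "H", 101, ""),
    (119, 223, "L", 1, ""),
]


def pose_to_pdb(pose_index):
    """Return (chain, pdb_number, insertion_code) for a 1-based Pose index."""
    for pose_start, pose_end, chain, pdb_start, icode in SEGMENTS:
        if pose_start <= pose_index <= pose_end:
            return chain, pdb_start + (pose_index - pose_start), icode
    raise ValueError(f"Pose index {pose_index} is outside 1-223")


print(pose_to_pdb(1))    # first heavy-chain residue
print(pose_to_pdb(83))   # a residue with an insertion code
print(pose_to_pdb(119))  # first light-chain residue
```

On a real Pose the same lookup is available through `pose.pdb_info().pose2pdb(i)`, and the reverse direction through `pose.pdb_info().pdb2pose(chain, resid)`.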

In [4]:

# Select the antibody heavy chain, whose PDB chain ID is "H":
select_heavy_chain = ChainSelector('H')
selected = select_heavy_chain.apply(pose)
print(selected)

vector1_bool[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

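Interpreting the result
Key point 1: in the vector1_bool, selected residues are reported as "1" and unselected residues as "0".
Key point 2: the vector1_bool is indexed by Pose numbering (starting at 1), so the heavy chain occupies positions 1 to n and the light chain positions n+1 to 223 (here n = 118).

Check that the selector picked the right residues: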

In [5]:

index_list = [index+1 for index, i in enumerate(selected) if i == 1]
print(index_list)

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118]

The selector correctly picked every residue of the heavy chain.
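The conversion from a 1-indexed mask to a list of Pose indices can be wrapped in a small helper. This is a minimal sketch in plain Python: `selected_pose_indices` is a hypothetical name, and `heavy_mask` is a hand-built stand-in for the heavy-chain result (118 selected residues followed by 105 unselected ones), not an actual vector1_bool.

```python
def selected_pose_indices(mask):
    """Return the 1-based positions whose mask entry is truthy."""
    return [position for position, flag in enumerate(mask, start=1) if flag]


# Stand-in for the heavy-chain selection: 118 ones, then 105 zeros.
heavy_mask = [1] * 118 + [0] * 105
heavy_indices = selected_pose_indices(heavy_mask)

# First index, last index, and number of selected residues.
print(heavy_indices[0], heavy_indices[-1], len(heavy_indices))  # → 1 118 118
```

Using `enumerate(mask, start=1)` keeps the Pose convention (numbering from 1) explicit instead of adding 1 by hand. PyRosetta also provides `get_residues_from_subset` in `pyrosetta.rosetta.core.select`, which should return the same indices directly from a vector1_bool; it is worth confirming that it is available in your build.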

2. Visualizing a ResidueSelector

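PyRosetta ships the SelectedResiduesPyMOLMetric class, a SimpleMetric that produces a PyMOL selection command for the residues picked by a selector, so that they can be displayed directly.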

In [6]:

from pyrosetta.rosetta.core.simple_metrics.metrics import SelectedResiduesPyMOLMetric
pymol_selected = SelectedResiduesPyMOLMetric()
pymol_selected.set_residue_selector(select_heavy_chain)
prefix = 'heavy_chain_'
pymol_selected.apply(pose, prefix)

In [7]:

from pyrosetta.rosetta.core.simple_metrics import get_sm_data
sm_data = get_sm_data(pose)
string_metric = sm_data.get_string_metric_data()
string_metric['heavy_chain_pymol_selection']

Out[7]:

'select rosetta_sele, (chain H and resid 2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,82A,82B,82C,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,100A,100B,100C,100D,100E,101,102,103,104,105,106,107,108,109,110,111)'

Step 1: paste the selection command above into the PyMOL command line;

Step 2: display the selection as sticks with show sticks, rosetta_sele

(Image credit: XtalPi team)
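The string stored by the metric is an ordinary PyMOL `select` command in which residues are grouped by chain. As an illustration of that format only (this is not how SelectedResiduesPyMOLMetric is implemented, and `pymol_select_command` is a hypothetical helper), the sketch below rebuilds such a command from `(chain, residue label)` pairs.

```python
def pymol_select_command(residues, name="rosetta_sele"):
    """Build a PyMOL select command from (chain, resid_label) pairs."""
    by_chain = {}
    for chain, label in residues:
        # Group residue labels by chain, preserving input order.
        by_chain.setdefault(chain, []).append(str(label))
    groups = [
        f"(chain {chain} and resid {','.join(labels)})"
        for chain, labels in by_chain.items()
    ]
    return f"select {name}, " + " or ".join(groups)


print(pymol_select_command([("H", 41), ("H", 43), ("H", 45)]))
print(pymol_select_command([("H", "82A"), ("L", 23), ("L", 88)]))
print(pymol_select_command([]))  # an empty selection leaves the expression blank
```

Labels are kept as strings so that PDB insertion codes such as 82A pass through unchanged.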

3. ResidueSelector Examples

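By function, residue selectors fall into three broad categories:

Logical selectors
Conformation-independent selectors
Conformation-dependent selectors

The following subsections walk through the selectors that are most useful in practice.
This section gives only brief examples; the next section covers the individual APIs in detail.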

3.1 Logical Selectors

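Logical selectors are the easiest to understand: they apply the Not, And, and Or operations to existing selectors to produce a new selection. And and Or combine selectors (two in the examples here), while Not inverts a single selector. In Rosetta these are implemented by NotResidueSelector, AndResidueSelector, and OrResidueSelector, as the examples below show: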
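Conceptually, each logical selector acts position by position on the boolean masks of its inputs. The following plain-Python sketch mimics that behavior on hand-built masks for a 223-residue pose (118 heavy-chain positions followed by 105 light-chain positions). The helper names are hypothetical, and the code is an analogy for the logic rather than Rosetta's implementation.

```python
def mask_or(a, b):
    """Position-wise OR of two equal-length masks."""
    return [x or y for x, y in zip(a, b)]


def mask_and(a, b):
    """Position-wise AND of two equal-length masks."""
    return [x and y for x, y in zip(a, b)]


def mask_not(a):
    """Position-wise negation of a mask."""
    return [not x for x in a]


heavy = [True] * 118 + [False] * 105
light = [False] * 118 + [True] * 105

print(sum(mask_or(heavy, light)))   # → 223 (every residue)
print(sum(mask_and(heavy, light)))  # → 0 (no residue is in both chains)
print(mask_not(heavy) == light)     # → True (not-heavy equals light here)
```

Because the two chains do not overlap, OR covers the whole pose, AND is empty, and NOT of one chain equals the other.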

In [8]:

# Continue with the antibody pose loaded earlier.
# First define a selector for each chain:
select_heavy_chain = ChainSelector('H')
select_light_chain = ChainSelector('L')
select_light_chain.apply(pose)

Out[8]:

vector1_bool[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

In [9]:

# Visualization: select the light chain
pymol_selected = SelectedResiduesPyMOLMetric()
pymol_selected.set_residue_selector(select_light_chain)
prefix = 'light_chain_'
pymol_selected.apply(pose, prefix)

In [10]:

string_metric = sm_data.get_string_metric_data()
string_metric['light_chain_pymol_selection']

Out[10]:

'select rosetta_sele, (chain L and resid 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105)'

(Image credit: XtalPi team)

In [11]:

# Example 1: select the light chain OR the heavy chain
from pyrosetta.rosetta.core.select.residue_selector import OrResidueSelector
light_or_heavy = OrResidueSelector(select_heavy_chain, select_light_chain)
residue_selector = light_or_heavy.apply(pose)
print(residue_selector)

vector1_bool[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

In [12]:

# Visualization: light chain OR heavy chain
pymol_selected = SelectedResiduesPyMOLMetric()
pymol_selected.set_residue_selector(light_or_heavy)
prefix = 'light_or_heavy_'
pymol_selected.apply(pose, prefix)

In [13]:

string_metric = sm_data.get_string_metric_data()
string_metric['light_or_heavy_pymol_selection']

Out[13]:

'select rosetta_sele, (chain H and resid 2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,82A,82B,82C,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,100A,100B,100C,100D,100E,101,102,103,104,105,106,107,108,109,110,111) or (chain L and resid 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105)'

(Image credit: XtalPi team)

In [14]:

# Example 2: select the heavy chain AND the light chain
from pyrosetta.rosetta.core.select.residue_selector import AndResidueSelector
light_and_heavy = AndResidueSelector(select_heavy_chain, select_light_chain)
residue_selector = light_and_heavy.apply(pose)
print(residue_selector)

vector1_bool[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

In [15]:

# Visualization: heavy chain AND light chain
pymol_selected = SelectedResiduesPyMOLMetric()
pymol_selected.set_residue_selector(light_and_heavy)
prefix = 'light_and_heavy_'
pymol_selected.apply(pose, prefix)

In [16]:

string_metric = sm_data.get_string_metric_data()
string_metric['light_and_heavy_pymol_selection']

Out[16]:

'select rosetta_sele, '

(Image credit: XtalPi team)

The heavy and light chains share no residues, so the selection is empty.

In [17]:

# Example 3: the NOT selector
from pyrosetta.rosetta.core.select.residue_selector import NotResidueSelector
not_heavy = NotResidueSelector(select_heavy_chain)
residue_selector = not_heavy.apply(pose)
print(residue_selector)

vector1_bool[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

In [18]:

# Visualization: everything except the heavy chain
pymol_selected = SelectedResiduesPyMOLMetric()
pymol_selected.set_residue_selector(not_heavy)
prefix = 'not_heavy_'
pymol_selected.apply(pose, prefix)

In [19]:

string_metric = sm_data.get_string_metric_data()
string_metric['not_heavy_pymol_selection']

Out[19]:

'select rosetta_sele, (chain L and resid 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105)'

(Image credit: XtalPi team)

In [20]:

# Example 4: select the entire Pose
from pyrosetta.rosetta.core.select.residue_selector import TrueResidueSelector
true = TrueResidueSelector()
residue_selector = true.apply(pose)
print(residue_selector)

vector1_bool[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

In [21]:

# Visualization: the entire Pose
pymol_selected = SelectedResiduesPyMOLMetric()
pymol_selected.set_residue_selector(true)
prefix = 'entire_pose_'
pymol_selected.apply(pose, prefix)

In [22]:

string_metric = sm_data.get_string_metric_data()
string_metric['entire_pose_pymol_selection']

Out[22]:

'select rosetta_sele, (chain H and resid 2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,82A,82B,82C,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,100A,100B,100C,100D,100E,101,102,103,104,105,106,107,108,109,110,111) or (chain L and resid 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105)'

(Image credit: XtalPi team)

3.2 Conformation-Independent Selectors

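These selectors do not depend on the conformation; they are defined purely by residue attributes such as the residue index or the residue name. Two brief examples follow.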

3.2.1 ResidueIndexSelector

This selector is defined by explicit residue numbers. It accepts either PDB numbering or Pose numbering, and it also accepts ranges of residues.
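The index string passed to ResidueIndexSelector in this section takes three forms: a bare number for Pose numbering, a number followed by a chain ID for PDB numbering, and two such entries joined by a hyphen for a range. The sketch below classifies entries of those three forms. It is not Rosetta's parser, the function names are hypothetical, and it deliberately ignores anything beyond these forms (for example insertion codes or negative PDB numbers).

```python
import re

# A number optionally followed by a single chain letter, e.g. "40" or "62H".
TOKEN = re.compile(r"(\d+)([A-Za-z]?)")


def describe_entry(entry):
    """Classify one comma-separated entry of an index string."""
    if "-" in entry:
        start, end = entry.split("-")
        return ("range", describe_entry(start), describe_entry(end))
    match = TOKEN.fullmatch(entry.strip())
    if match is None:
        raise ValueError(f"Unrecognized entry: {entry!r}")
    number, chain = int(match.group(1)), match.group(2)
    return ("pdb", number, chain) if chain else ("pose", number)


def describe_index_string(index_string):
    """Classify every entry of a comma-separated index string."""
    return [describe_entry(entry) for entry in index_string.split(",")]


print(describe_index_string("40,42,44"))
print(describe_index_string("62H,63H,64H"))
print(describe_index_string("42H-60H"))
```

The point of the sketch is the distinction itself: an entry without a chain ID is read as a Pose index, so 40 and 40H generally refer to different residues.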

In [23]:

from pyrosetta.rosetta.core.select.residue_selector import ResidueIndexSelector
# Select by Pose numbering:
pose_index_selector = ResidueIndexSelector('40,42,44')
residue_selector = pose_index_selector.apply(pose)
print(residue_selector)

vector1_bool[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

In [24]:

# Visualization: specific residue positions
pymol_selected = SelectedResiduesPyMOLMetric()
pymol_selected.set_residue_selector(pose_index_selector)
prefix = 'index_select_'
pymol_selected.apply(pose, prefix)

In [25]:

string_metric = sm_data.get_string_metric_data()
string_metric['index_select_pymol_selection']

Out[25]:

'select rosetta_sele, (chain H and resid 41,43,45)'

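(Image credit: XtalPi team)

Note that the PyMOL command reports PDB numbering: Pose residues 40, 42, and 44 correspond to PDB residues 41, 43, and 45 of chain H.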

In [26]:

# Example 1: select by PDB numbering; note that the PDB chain ID must be appended.
pdb_index_selector = ResidueIndexSelector('62H,63H,64H')
residue_selector = pdb_index_selector.apply(pose)
print(residue_selector)

vector1_bool[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

In [27]:

# Visualization: residues selected by PDB numbering
pymol_selected = SelectedResiduesPyMOLMetric()
pymol_selected.set_residue_selector(pdb_index_selector)
prefix = 'pdb_index_select_'
pymol_selected.apply(pose, prefix)

In [28]:

string_metric = sm_data.get_string_metric_data()
string_metric['pdb_index_select_pymol_selection']

Out[28]:

'select rosetta_sele, (chain H and resid 62,63,64)'

(Image credit: XtalPi team)

In [29]:

# Example 2: select a range given in PDB numbering.
range_selector = ResidueIndexSelector('42H-60H')
residue_selector = range_selector.apply(pose)
print(residue_selector)

vector1_bool[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

In [30]:

# Visualization: a range of residues
pymol_selected = SelectedResiduesPyMOLMetric()
pymol_selected.set_residue_selector(range_selector)
prefix = 'range_select_'
pymol_selected.apply(pose, prefix)

In [31]:

string_metric = sm_data.get_string_metric_data()
string_metric['range_select_pymol_selection']

Out[31]:

'select rosetta_sele, (chain H and resid 42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60)'

(Image credit: XtalPi team)

3.2.2 ResidueNameSelector

This selector is defined by residue names:

In [32]:

# Example 1: select by a single residue name:
from pyrosetta.rosetta.core.select.residue_selector import ResidueNameSelector
resname_selector = ResidueNameSelector('PHE')
residue_selector = resname_selector.apply(pose)
print(residue_selector)

vector1_bool[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

In [33]:

# Visualization: residues selected by residue name
pymol_selected = SelectedResiduesPyMOLMetric()
pymol_selected.set_residue_selector(resname_selector)
prefix = 'resname_select_'
pymol_selected.apply(pose, prefix)

In [34]:

string_metric = sm_data.get_string_metric_data()
string_metric['resname_select_pymol_selection']

Out[34]:

'select rosetta_sele, (chain H and resid 27,79,100) or (chain L and resid 21,49,62,83,87,96,98)'

(Image credit: XtalPi team)

In [35]:

# Example 2: select by several residue names:
resname_selector = ResidueNameSelector('PHE,ASN')
residue_selector = resname_selector.apply(pose)
print(residue_selector)

vector1_bool[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

In [36]:

# Visualization: residues matching several residue names
pymol_selected = SelectedResiduesPyMOLMetric()
pymol_selected.set_residue_selector(resname_selector)
prefix = 'multi_resname_select_'
pymol_selected.apply(pose, prefix)

In [37]:

string_metric = sm_data.get_string_metric_data()
string_metric['multi_resname_select_pymol_selection']

Out[37]:

'select rosetta_sele, (chain H and resid 27,54,60,76,79,100) or (chain L and resid 21,31,34,49,62,77,83,87,96,98)'

(Image credit: XtalPi team)

In [38]:

# Example 3: select residues that carry a modification (variant)
resname_selector = ResidueNameSelector('CYS')
residue_selector = resname_selector.apply(pose)
print(residue_selector)

vector1_bool[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

Something seems to have gone wrong: the selector did not pick the disulfide-bonded cysteines we were after. Print residue 21 to see what happened:

In [39]:

print(pose.residue(21))

Residue 21: CYS:disulfide (CYS, C):
Base: CYS
 Properties: POLYMER PROTEIN CANONICAL_AA SC_ORBITALS METALBINDING DISULFIDE_BONDED ALPHA_AA L_AA
 Variant types: DISULFIDE
 Main-chain atoms:  N    CA   C  
 Backbone atoms:    N    CA   C    O    H    HA 
 Side-chain atoms:  CB   SG  1HB  2HB 
Atom Coordinates:
   N  : 39.126, 55.553, 42.324
   CA : 37.869, 55.182, 41.689
   C  : 37.774, 53.665, 41.73
   O  : 38.654, 52.976, 41.209
   CB : 37.81, 55.713, 40.253
   SG : 36.265, 55.41, 39.34
   H  : 39.995, 55.343, 41.854
   HA : 37.051, 55.626, 42.256
  1HB : 37.967, 56.792, 40.257
  2HB : 38.614, 55.268, 39.667
Mirrored relative to coordinates in ResidueType: FALSE

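Interpreting the result
Selecting by the residue name CYS did not pick the disulfide-bonded residues, because in Rosetta a cysteine that forms a disulfide bond is named CYS:disulfide. Next, try selecting with that name instead.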
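The difference comes down to matching on the full residue type name versus the base name before the colon. The sketch below illustrates this with a made-up list of names; `residue_names` and both helper functions are hypothetical, and the list is not taken from the antibody.

```python
# Full residue type names of the kind Rosetta reports (illustrative values only).
residue_names = ["GLU", "CYS:disulfide", "PHE", "CYS", "CYS:disulfide"]


def match_full_name(names, query):
    """Exact match on the full name, variants included."""
    return [name == query for name in names]


def match_base_name(names, query):
    """Match on the base name, ignoring anything after the first colon."""
    return [name.split(":")[0] == query for name in names]


print(match_full_name(residue_names, "CYS"))            # only the free cysteine
print(match_full_name(residue_names, "CYS:disulfide"))  # only disulfide-bonded ones
print(match_base_name(residue_names, "CYS"))            # every cysteine
```

On a real Pose, `pose.residue(i).name()` returns the full name (CYS:disulfide for residue 21 above) and `pose.residue(i).name3()` returns the three-letter code. RosettaScripts also documents a `residue_name3` option for this selector that matches on the three-letter code; whether and how your PyRosetta build exposes it should be checked before relying on it.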

In [40]:

resname_selector = ResidueNameSelector('CYS:disulfide')
residue_selector = resname_selector.apply(pose)
print(residue_selector)

vector1_bool[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

In [41]:

# Visualization: residues that carry a modification
pymol_selected = SelectedResiduesPyMOLMetric()
pymol_selected.set_residue_selector(resname_selector)
prefix = 'ss_select_'
pymol_selected.apply(pose, prefix)

In [42]:

string_metric = sm_data.get_string_metric_data()
string_metric['ss_select_pymol_selection']

Out[42]:

'select rosetta_sele, (chain H and resid 22,92) or (chain L and resid 23,88)'

(Image credit: XtalPi team)

Interpreting the result
The disulfide-bonded residues are now selected correctly. In PDB numbering they are 22H, 92H, 23L, and 88L.

3.3 Conformation-Dependent Selectors

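As the name suggests, these selectors depend on the actual conformation of the structure. They are defined in terms of properties such as dihedral angles, secondary structure, hydrogen bonds, number of neighbors, interaction interfaces, and symmetry. NeighborhoodResidueSelector serves as a brief example here.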

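3.3.1 NeighborhoodResidueSelector

This selector picks the residues near a set of focus residues; the default cutoff is 10 Å. It can be used in two ways: the selection either includes the focus residues themselves or contains only their neighbors.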
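The two modes can be illustrated with a toy distance check. In the sketch below each residue is reduced to a single point, and the coordinates are invented; Rosetta works on the real structure with its own choice of representative atoms and its own neighbor graph, so this shows only the include/exclude logic. `neighborhood_mask` and `toy_coords` are hypothetical names.

```python
import math


def neighborhood_mask(coords, focus, distance=10.0, include_focus=True):
    """Mark residues whose point lies within `distance` of any focus residue.

    coords: list of (x, y, z) points, one per residue (list index 0 is residue 1).
    focus:  set of 1-based residue indices.
    """
    def separation(p, q):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

    mask = []
    for index, point in enumerate(coords, start=1):
        if index in focus:
            # Focus residues are kept or dropped according to the flag.
            mask.append(include_focus)
            continue
        mask.append(any(separation(point, coords[f - 1]) <= distance for f in focus))
    return mask


# Toy coordinates: four residues spaced 6 Å apart along the x axis.
toy_coords = [(0.0, 0.0, 0.0), (6.0, 0.0, 0.0), (12.0, 0.0, 0.0), (18.0, 0.0, 0.0)]

print(neighborhood_mask(toy_coords, {2}, 10.0, True))   # → [True, True, True, False]
print(neighborhood_mask(toy_coords, {2}, 10.0, False))  # → [True, False, True, False]
```

The only difference between the two calls is the entry for the focus residue itself, which mirrors the third constructor argument used in the cells that follow.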

In [43]:

# For example, select all residues within 10 Å of residue 42 of chain H (PDB numbering), including residue 42 itself:
from pyrosetta.rosetta.core.select.residue_selector import NeighborhoodResidueSelector, ResidueIndexSelector
residue1_selector = ResidueIndexSelector('42H')
nbr_selector = NeighborhoodResidueSelector(residue1_selector, 10.0, True)  # True: include residue 42 itself.
nbr_selector.apply(pose)

core.select.residue_selector.NeighborhoodResidueSelector: {0} [ WARNING ] ################ Cloning pose and building neighbor graph ################
core.select.residue_selector.NeighborhoodResidueSelector: {0} [ WARNING ] Ensure that pose is either scored or has update_residue_neighbors() called
core.select.residue_selector.NeighborhoodResidueSelector: {0} [ WARNING ] before using NeighborhoodResidueSelector for maximum performance!
core.select.residue_selector.NeighborhoodResidueSelector: {0} [ WARNING ] ##########################################################################

Out[43]:

vector1_bool[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

In [44]:

# Visualization: all residues within 10 Å of residue 42 of chain H (PDB numbering), including residue 42
pymol_selected = SelectedResiduesPyMOLMetric()
pymol_selected.set_residue_selector(nbr_selector)
prefix = 'nbr_select_'
pymol_selected.apply(pose, prefix)

core.select.residue_selector.NeighborhoodResidueSelector: {0} [ WARNING ] ################ Cloning pose and building neighbor graph ################
core.select.residue_selector.NeighborhoodResidueSelector: {0} [ WARNING ] Ensure that pose is either scored or has update_residue_neighbors() called
core.select.residue_selector.NeighborhoodResidueSelector: {0} [ WARNING ] before using NeighborhoodResidueSelector for maximum performance!
core.select.residue_selector.NeighborhoodResidueSelector: {0} [ WARNING ] ##########################################################################

In [45]:

string_metric = sm_data.get_string_metric_data()
string_metric['nbr_select_pymol_selection']

Out[45]:

'select rosetta_sele, (chain H and resid 39,40,41,42,43,44,88,89)'

(Image credit: XtalPi team)

In [46]:

# For example, select all residues within 10 Å of residue 42 of chain H (PDB numbering), excluding residue 42 itself:
nbr_selector = NeighborhoodResidueSelector(residue1_selector, 10.0, False)  # False: exclude residue 42 itself.
nbr_selector.apply(pose)

core.select.residue_selector.NeighborhoodResidueSelector: {0} [ WARNING ] ################ Cloning pose and building neighbor graph ################
core.select.residue_selector.NeighborhoodResidueSelector: {0} [ WARNING ] Ensure that pose is either scored or has update_residue_neighbors() called
core.select.residue_selector.NeighborhoodResidueSelector: {0} [ WARNING ] before using NeighborhoodResidueSelector for maximum performance!
core.select.residue_selector.NeighborhoodResidueSelector: {0} [ WARNING ] ##########################################################################

Out[46]:

vector1_bool[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

In [47]:

# Visualization: all residues within 10 Å of residue 42 of chain H (PDB numbering), excluding residue 42
pymol_selected = SelectedResiduesPyMOLMetric()
pymol_selected.set_residue_selector(nbr_selector)
prefix = 'nbr_noself_select_'
pymol_selected.apply(pose, prefix)

core.select.residue_selector.NeighborhoodResidueSelector: {0} [ WARNING ] ################ Cloning pose and building neighbor graph ################
core.select.residue_selector.NeighborhoodResidueSelector: {0} [ WARNING ] Ensure that pose is either scored or has update_residue_neighbors() called
core.select.residue_selector.NeighborhoodResidueSelector: {0} [ WARNING ] before using NeighborhoodResidueSelector for maximum performance!
core.select.residue_selector.NeighborhoodResidueSelector: {0} [ WARNING ] ##########################################################################

In [48]:

string_metric = sm_data.get_string_metric_data()
string_metric['nbr_noself_select_pymol_selection']

Out[48]:

'select rosetta_sele, (chain H and resid 39,40,41,43,44,88,89)'

(Image credit: XtalPi team)
