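In the previous section we saw that the Residue object is the basic unit Rosetta uses to describe a protein: many individual Residue objects together describe the protein's geometry and higher-order conformation. Subtle conformational changes are measured by concrete parameters such as bond lengths, bond angles, and dihedral angles between atoms. In PyRosetta, these geometric parameters of a Pose are recorded by its Conformation object.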
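Proteins are formed when amino acids are covalently linked by peptide bonds through dehydration condensation. For canonical amino acids, the two most important backbone dihedrals are therefore phi and psi. Because the peptide bond is planar, omega usually stays near 180° (trans) or, more rarely, near 0° (cis). Depending on its side chain, each Residue also has a number of chi angles. Together these dihedrals are the basic geometric parameters that Rosetta samples when exploring conformations.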
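To make the notion of a dihedral concrete, here is a minimal pure-Python sketch that computes the dihedral angle defined by four points. The function name `dihedral_deg` and its helpers are hypothetical and written only for illustration; this is not how Rosetta computes torsions internally.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Dihedral angle in degrees around the p1-p2 bond, in (-180, 180]."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(c / norm for c in b1)  # unit vector along the central bond
    # Components of b0 and b2 perpendicular to the central bond
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Four made-up points: the outer two bonds are perpendicular when viewed
# along the central bond.
quarter_turn = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
```

For phi, the four atoms would be C(i-1), N(i), CA(i), C(i); for psi, N(i), CA(i), C(i), N(i+1).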
# Load the PDB structure of the peptide
from pyrosetta import init, pose_from_pdb
init()
pose = pose_from_pdb('./data/4jfx_peptide.pdb')
PyRosetta-4 2021 [Rosetta PyRosetta4.conda.mac.cxx11thread.serialization.python37.Release 2021.31+release.c7009b3115c22daa9efe2805d9d1ebba08426a54 2021-08-07T10:04:12] retrieved from: http://www.pyrosetta.org
(C) Copyright Rosetta Commons Member Institutions. Created in JHU by Sergey Lyskov and PyRosetta Team.
core.init: {0} Checking for fconfig files in pwd and ./rosetta/flags
core.init: {0} Rosetta version: PyRosetta4.conda.mac.cxx11thread.serialization.python37.Release r292 2021.31+release.c7009b3115c c7009b3115c22daa9efe2805d9d1ebba08426a54 http://www.pyrosetta.org 2021-08-07T10:04:12
core.init: {0} command: PyRosetta -ex1 -ex2aro -database /opt/miniconda3/lib/python3.7/site-packages/pyrosetta/database
basic.random.init_random_generator: {0} 'RNG device' seed mode, using '/dev/urandom', seed=265248943 seed_offset=0 real_seed=265248943 thread_index=0
basic.random.init_random_generator: {0} RandomGenerator:init: Normal mode, seed=265248943 RG_type=mt19937
core.chemical.GlobalResidueTypeSet: {0} Finished initializing fa_standard residue type set. Created 983 residue types
core.chemical.GlobalResidueTypeSet: {0} Total time to initialize 0.67614 seconds.
core.import_pose.import_pose: {0} File './data/4jfx_peptide.pdb' automatically determined to be of type PDB
Retrieving these basic geometric parameters in PyRosetta is straightforward:
# Get the backbone dihedrals of residue 3:
phi = pose.phi(3)
psi = pose.psi(3)
omega = pose.omega(3)  # omega has its own accessor, pose.omega()
print(phi, psi)
-148.8025511852094 157.0624491251048
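The same accessors can be looped over every residue. The sketch below assumes only the Pose methods `total_residue()`, `phi()`, `psi()`, and `omega()`; the helper `backbone_torsions` and the `StubPose` class are hypothetical names added here so the snippet can be tried without a PyRosetta installation.

```python
def backbone_torsions(pose):
    """Return a list of (seqpos, phi, psi, omega) tuples, angles in degrees."""
    rows = []
    # Rosetta residue numbering starts at 1, not 0.
    for seqpos in range(1, pose.total_residue() + 1):
        rows.append((seqpos, pose.phi(seqpos), pose.psi(seqpos), pose.omega(seqpos)))
    return rows


class StubPose:
    """Hypothetical stand-in for a Pose, holding (phi, psi, omega) per residue."""

    def __init__(self, torsions):
        self._torsions = torsions

    def total_residue(self):
        return len(self._torsions)

    def phi(self, seqpos):
        return self._torsions[seqpos - 1][0]

    def psi(self, seqpos):
        return self._torsions[seqpos - 1][1]

    def omega(self, seqpos):
        return self._torsions[seqpos - 1][2]


# Made-up torsion values, used only to exercise the helper.
demo = backbone_torsions(StubPose([(-60.0, -45.0, 180.0), (-120.0, 130.0, 180.0)]))
```

With a real pose loaded as above, `backbone_torsions(pose)` should work the same way, since it only calls methods that Pose provides.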
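Side-chain dihedral angles can be read directly as well:
Call pose.chi(chi index, pose index of the Residue) to get the angle.
# Get the chi angle information for residue 3: pose.chi(chi_id:int, residue_id:int)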
chi_num = len(pose.residue(3).chi_atoms())
print(chi_num)
3
This residue has 3 χ dihedral angles in total.
# Print each chi dihedral angle
for chi_id in range(1, chi_num+1):
    chi_angle = pose.chi(chi_id, 3)
    print(chi_angle)
70.4233120899352
81.97278776078356
-2.023205011879734e-14
The Pose object has several built-in functions that make it easy to adjust the geometry: set_phi, set_psi, set_omega, and set_chi.
# Adjust the backbone dihedrals
pose.set_phi(seqpos=3, setting=-150)
pose.set_psi(seqpos=3, setting=170)
pose.dump_pdb('./data/4jfx_peptide_conf0.pdb')
True
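The adjusted peptide can be inspected visually by opening the saved PDB file in a molecular viewer.
# Adjust the side-chain chi1 angle of residue 3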
pose.set_chi(chino=1, seqpos=3, setting=60)
pose.dump_pdb('./data/4jfx_peptide_chi_conf0.pdb')
pose.set_chi(chino=1, seqpos=3, setting=-60)
pose.dump_pdb('./data/4jfx_peptide_chi_conf1.pdb')
pose.set_chi(chino=1, seqpos=3, setting=180)
pose.dump_pdb('./data/4jfx_peptide_chi_conf2.pdb')
True
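The three chi1 settings used above (60, -60, and 180) lie at the staggered positions that side-chain chi1 angles tend to cluster around. As a rough sketch, an observed chi value can be assigned to the nearest of these positions with periodic angle arithmetic. The helper names `angular_distance` and `nearest_chi_well`, and the choice of exactly these three reference values, are assumptions made for illustration.

```python
def angular_distance(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    diff = (a - b) % 360.0
    return min(diff, 360.0 - diff)


def nearest_chi_well(chi_deg, wells=(60.0, 180.0, -60.0)):
    """Return the reference angle closest to chi_deg, respecting periodicity."""
    return min(wells, key=lambda well: angular_distance(chi_deg, well))


# chi1 of residue 3 as printed earlier in this section
well_for_residue3 = nearest_chi_well(70.4233120899352)
```

Working with the periodic distance matters near the ±180° boundary: -175° and 180° are only 5° apart, not 355°.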
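Besides the dihedral angles, which have the largest effect on conformational change, local bond lengths and bond angles are also stored in the Conformation object. To locate an atom, first build an atom identifier object. It works like an ID card that tells Rosetta which residue the atom belongs to. An atom identifier is created with AtomID from an atom number and a residue number.
# Atom identifier objects are needed before bond lengths and bond angles can be queried
from pyrosetta.rosetta.core.id import AtomID
atom1 = AtomID(atomno_in=1, rsd_in=3) # first atom of residue 3 (backbone N in standard amino acids)
atom2 = AtomID(atomno_in=2, rsd_in=3) # second atom of residue 3 (CA)
atom3 = AtomID(atomno_in=3, rsd_in=3) # third atom of residue 3 (C)
atom4 = AtomID(atomno_in=4, rsd_in=3) # fourth atom of residue 3 (O)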
print(atom1)
print(atom2)
print(atom3)
print(atom4)
atomno= 1 rsd= 3
atomno= 2 rsd= 3
atomno= 3 rsd= 3
atomno= 4 rsd= 3
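Once the atom IDs are known, bond lengths, bond angles, and similar data can be read through the Conformation object. In Rosetta, bond lengths and bond angles are usually held at their ideal values, which greatly reduces the number of degrees of freedom that must be sampled. Note that bond lengths and bond angles can only be queried for atoms that are physically connected, that is, bonded to each other.
# Get the bond length through the conformation layer
bond_length = pose.conformation().bond_length(atom1, atom2)
# Get the bond angle through the conformation layer (in radians)
bond_angle = pose.conformation().bond_angle(atom1, atom2, atom3)
print(f'bond length: {bond_length}, bond angle: {bond_angle}')
bond length: 1.4632750937537378, bond angle: 1.9840915800459624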
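For intuition about what these two numbers mean, the sketch below computes a bond length and a bond angle directly from Cartesian coordinates and converts the angle reported above from radians to degrees. The function names and the example coordinates are made up for illustration; Rosetta derives these values from its own internal representation.

```python
import math


def bond_length(a, b):
    """Euclidean distance between two points given as (x, y, z)."""
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in range(3)))


def bond_angle_rad(a, b, c):
    """Angle at the middle atom b, in radians, between bonds b-a and b-c."""
    u = [a[i] - b[i] for i in range(3)]
    v = [c[i] - b[i] for i in range(3)]
    cos_theta = sum(u[i] * v[i] for i in range(3)) / (
        math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(x * x for x in v)))
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding error
    return math.acos(cos_theta)


# The bond angle printed above, converted to degrees (about 113.7)
angle_in_degrees = math.degrees(1.9840915800459624)

# Made-up coordinates forming a right angle at the origin
right_angle = bond_angle_rad((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0))
```

The conversion is a reminder that the Conformation layer reports bond angles in radians, whereas phi, psi, omega, and chi are handled in degrees.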
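Bond lengths and bond angles between atoms can be adjusted in the same way:
# Set new values:
pose.conformation().set_bond_length(atom1, atom2, setting=1.5) # set the bond length (in Å)
pose.conformation().set_bond_angle(atom1, atom2, atom3, setting=3.4) # set the bond angle in radians, not degrees
# Check the new values: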
new_bond_length = pose.conformation().bond_length(atom1, atom2)
new_bond_angle = pose.conformation().bond_angle(atom1, atom2, atom3)
print(new_bond_length, new_bond_angle)
1.5 3.4
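Because the setter expects radians, passing a value in degrees by mistake is an easy slip. The value 3.4 rad used above corresponds to roughly 194.8°, which is past a straight angle, so it is best read as a demonstration of the API rather than as realistic geometry. Below is a minimal sketch of a guard that converts degrees to radians and rejects values outside the 0° to 180° range. The helper name `degrees_to_bond_angle_rad` is hypothetical and not part of PyRosetta.

```python
import math


def degrees_to_bond_angle_rad(angle_deg):
    """Convert a bond angle from degrees to radians, rejecting values outside [0, 180]."""
    if not 0.0 <= angle_deg <= 180.0:
        raise ValueError(f"bond angle {angle_deg} deg is outside the range 0 to 180")
    return math.radians(angle_deg)


# A tetrahedral-like angle converted for use with a radian-based setter
tetrahedral_rad = degrees_to_bond_angle_rad(109.5)
```

The result could then be passed as `setting=` to `set_bond_angle`.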
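Modifying atomic coordinates requires the residue object and an atom ID (atom identifier object) for each atom. New xyz coordinates are assigned with pose.set_xyz, although users rarely need to modify coordinates explicitly unless they understand what the operation implies. Here we translate every atom by 3 Å along each of the three coordinate axes.
# Modify atomic coordinates (rarely needed in practice)
from pyrosetta.rosetta.numeric import xyzVector_double_t
# Translate every atom of every residue by +3 Å along x, y, and z:
for residue_id in range(1, pose.total_residue()+1):
    residue = pose.residue(residue_id) # get the residue object
    for atom_id, atom in enumerate(residue.atoms()):
        x, y, z = atom.xyz()
        # Build the translated coordinates:
        trans_xyz = xyzVector_double_t(x+3, y+3, z+3) # shift by +3 Å along each axis
        atom_index = AtomID(atom_id+1, residue_id) # ID of this atom (1-based) in the current residue
        pose.set_xyz(atom_index, trans_xyz) # assign the new xyz coordinates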
pose.dump_pdb('./data/trans.pdb')
True
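A rigid translation like the one above moves the whole molecule but leaves its internal geometry untouched. The pure-Python sketch below checks this on a few made-up points; the helper names `translate` and `distance` and the coordinates are illustrative only.

```python
import math


def translate(coords, shift):
    """Return a new list of (x, y, z) tuples shifted by the given vector."""
    return [(x + shift[0], y + shift[1], z + shift[2]) for x, y, z in coords]


def distance(a, b):
    """Euclidean distance between two (x, y, z) points."""
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in range(3)))


# Made-up coordinates standing in for three bonded atoms
coords = [(0.0, 0.0, 0.0), (1.458, 0.0, 0.0), (2.0, 1.4, 0.0)]
moved = translate(coords, (3.0, 3.0, 3.0))

# Interatomic distances before and after the shift
before = distance(coords[0], coords[1])
after = distance(moved[0], moved[1])
```

Since every atom receives the same shift, `before` and `after` agree up to floating-point rounding, and the same holds for bond angles and dihedrals.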
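Secondary-structure information in Rosetta comes from a DSSP calculation and can be obtained with the get_secstruct function.
# Get secondary-structure information via DSSP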
from pyrosetta.rosetta.protocols.membrane import get_secstruct
ss = ''.join(get_secstruct(pose))
print(ss)
protocols.DsspMover: {0} LLLLLLLL
LLLLLLLL
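A secondary-structure string becomes easier to read once it is grouped into consecutive segments. Here is a small sketch using only the standard library; the helper name `ss_segments` is hypothetical, and the second example string is made up to show a mixed case.

```python
from itertools import groupby


def ss_segments(ss):
    """Collapse a secondary-structure string into (label, run length) pairs."""
    return [(label, len(list(run))) for label, run in groupby(ss)]


# The 8-residue peptide above is all loop
peptide_segments = ss_segments('LLLLLLLL')

# A made-up string mixing loop (L), helix (H), and strand (E)
mixed_segments = ss_segments('LLHHHHLEEL')
```

For the peptide in this section the result is a single loop segment of length 8, which is expected for such a short chain.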
Crystal structures may contain non-ideal dihedral angles, bond lengths, and bond angles. These can be repaired with IdealizeMover.
from pyrosetta.rosetta.protocols.idealize import IdealizeMover
# idealized
idm = IdealizeMover()
idm.apply(pose)
protocols.idealize.IdealizeMover: {0} total atompairs: 0
protocols.idealize: {0} lastjumpmin: 1
protocols.idealize: {0} premin: (pos,rmsd,avg-bb,max-bb,avg-chi,max-chi,score) 2 N 0.075 0.000 0.000 0.000 0.000 1.416
protocols.idealize: {0} postmin: (pos,rmsd,avg-bb,max-bb,avg-chi,max-chi,score) 2 N 0.039 4.106 5.979 2.375 4.050 0.283
protocols.idealize: {0} premin: (pos,rmsd,avg-bb,max-bb,avg-chi,max-chi,score) 3 Y 2.619 2.531 5.979 0.950 4.050 1095.920
protocols.idealize: {0} postmin: (pos,rmsd,avg-bb,max-bb,avg-chi,max-chi,score) 3 Y 0.656 46.121 77.515 28.485 109.577 56.651
protocols.idealize: {0} premin: (pos,rmsd,avg-bb,max-bb,avg-chi,max-chi,score) 4 V 0.657 44.944 118.620 23.737 109.577 56.780
protocols.idealize: {0} postmin: (pos,rmsd,avg-bb,max-bb,avg-chi,max-chi,score) 4 V 0.657 44.944 118.629 23.737 109.577 56.747
protocols.idealize: {0} premin: (pos,rmsd,avg-bb,max-bb,avg-chi,max-chi,score) 5 V 0.654 34.395 118.629 20.346 109.577 57.267
protocols.idealize: {0} postmin: (pos,rmsd,avg-bb,max-bb,avg-chi,max-chi,score) 5 V 0.656 34.377 118.361 20.353 109.577 56.568
protocols.idealize: {0} premin: (pos,rmsd,avg-bb,max-bb,avg-chi,max-chi,score) 6 T 0.667 27.842 118.361 15.830 109.577 60.651
protocols.idealize: {0} postmin: (pos,rmsd,avg-bb,max-bb,avg-chi,max-chi,score) 6 T 0.655 27.895 118.453 15.832 109.577 56.600
protocols.idealize: {0} premin: (pos,rmsd,avg-bb,max-bb,avg-chi,max-chi,score) 8 A 0.670 26.155 118.453 15.832 109.577 57.959
protocols.idealize: {0} postmin: (pos,rmsd,avg-bb,max-bb,avg-chi,max-chi,score) 8 A 0.670 26.155 118.453 15.832 109.577 57.957
protocols.idealize: {0} premin: (pos,rmsd,avg-bb,max-bb,avg-chi,max-chi,score) 7 Y 0.796 22.061 118.453 11.874 109.577 71.358
protocols.idealize: {0} postmin: (pos,rmsd,avg-bb,max-bb,avg-chi,max-chi,score) 7 Y 0.796 22.060 118.455 11.875 109.577 71.337
protocols.idealize: {0} premin: (pos,rmsd,avg-bb,max-bb,avg-chi,max-chi,score) 1 G 0.796 21.383 118.455 11.875 109.577 71.315
protocols.idealize: {0} postmin: (pos,rmsd,avg-bb,max-bb,avg-chi,max-chi,score) 1 G 0.796 21.383 118.457 11.875 109.577 71.307
[per-residue score tables omitted]
protocols.idealize: {0} pre-finalmin: (pos,rmsd,avg-bb,max-bb,avg-chi,max-chi,score) 0.796 21.383 118.457 11.875 109.577 71.307
protocols.idealize: {0} post-finalmin: (pos,rmsd,avg-bb,max-bb,avg-chi,max-chi,score) 0.796 21.383 118.459 11.875 109.576 71.292
protocols.idealize: {0} ------------------------------------------------------------
 Scores                       Weight   Raw Score Wghtd.Score
------------------------------------------------------------
 pro_close                    0.500       0.000       0.000
 dslf_ss_dst                  0.500       0.000       0.000
 dslf_cs_ang                  2.000       0.000       0.000
 coordinate_constraint        0.010    7129.185      71.292
---------------------------------------------------
 Total weighted score:                       71.292
protocols.idealize: {0}
protocols.idealize.IdealizeMover: {0} RMS between original pose and idealised pose: 0.524355 CA RMSD, 0.811604 All-Atom RMSD,
After idealization, the pose differs only slightly from the original: about 0.52 Å CA RMSD and 0.81 Å all-atom RMSD.
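To illustrate what an RMSD value measures, the sketch below computes the root-mean-square deviation between two equally sized coordinate lists. It performs no superposition, and whether IdealizeMover superimposes the structures before reporting its RMS is not shown in the log, so treat this as a conceptual sketch only. The function name `rmsd` and the coordinates are made up.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired (x, y, z) points, without superposition."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Two made-up structures that differ by a 1 Å shift along z
original = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shifted = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
shift_rmsd = rmsd(original, shifted)
```

A CA RMSD restricts the sum to alpha-carbon atoms, while an all-atom RMSD includes every atom, which is why the two numbers in the log differ.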