Welcome to the tutorial for ChemCoord (http://chemcoord.readthedocs.org/).
import chemcoord as cc
from chemcoord.xyz_functions import get_rotation_matrix
import numpy as np
import time
water = cc.Cartesian.read_xyz('water_dimer.xyz', start_index=1)
small = cc.Cartesian.read_xyz('MIL53_small.xyz', start_index=1)
middle = cc.Cartesian.read_xyz('MIL53_middle.xyz', start_index=1)
Let's have a look at it:
water
atom | x | y | z | |
---|---|---|---|---|
1 | O | 0.000000 | 0.0 | 0.000000 |
2 | H | 0.758602 | 0.0 | 0.504284 |
3 | H | 0.260455 | 0.0 | -0.872893 |
4 | O | 3.000000 | 0.5 | 0.000000 |
5 | H | 3.758602 | 0.5 | 0.504284 |
6 | H | 3.260455 | 0.5 | -0.872893 |
It is also possible to open it with an external viewer. I use Molcas gv.exe
so you have to change it accordingly to your program of choice.
water.view(viewer='gv.exe')
To make this setting permament, execute:
cc.settings['defaults']['viewer'] = 'gv.exe' # replace by your viewer of choice
The slicing operations are the same as for pandas.DataFrames
. (http://pandas.pydata.org/pandas-docs/stable/indexing.html)
If the 'x'
axis is of particular interest you can slice it out with:
water['x']
# or explicit label based indexing
water.loc[:, 'x']
# or explicit integer based indexing
water.iloc[:, 1]
# or selection directly as an attribute
# this is only possible, if no other conflicting method/attribute exists
water.x
1 0.000000 2 0.758602 3 0.260455 4 3.000000 5 3.758602 6 3.260455 Name: x, dtype: float64
With boolean slicing it is very easy to cut all the oxygens away:
water[water.atom != 'O'].view()
This can be combined with other selections:
water[(water.atom != 'O') & (water.x < 1)].view()
The indexing behaves like Indexing and Selecting data in
Pandas.
You can slice with Cartesian.loc[key]
, Cartesian.iloc[keys]
, and Cartesian[key]
.
The only question is about the return type.
If the information in the columns is enough to draw a molecule,
an instance of the own class (e.g. Cartesian
)
is returned.
If the information in the columns is not enough to draw a molecule, there
are two cases to consider:
A pandas.Series
instance is returned for one dimensional slices
A pandas.DataFrame
instance is returned in all other cases:
molecule.loc[:, ['atom', 'x', 'y', 'z']] returns a `Cartesian`.
molecule.loc[:, ['atom', 'x']]`` returns a `pandas.DataFrame`.
molecule.loc[:, 'atom']`` returns a `pandas.Series`.
Two general rules are:
Have a look at the unmodified molecule
middle.view()
Chain the methods:
middle.cut_sphere(radius=5, preserve_bonds=False).view()
The molecule itself remains unchanged.
middle.view()
One really important method is get_bonds()
.
It returns a connectivity table, which is represented by a dictionary.
Each index points to set of indices, that are connected to it.
water.get_bonds()
{1: {2, 3}, 2: {1}, 3: {1}, 4: {5, 6}, 5: {4}, 6: {4}}
Now the focus switches to another molecule (MIL53_middle)
Let's explore the coordinationsphere of the Cr atom with the index 7.
for i in range(3):
middle.get_coordination_sphere(13, n_sphere=i, only_surface=False).view()
time.sleep(1)
Binary operators are supported in the logic of the scipy stack, but you need
python3.x for using the matrix multiplication operator @
.
The general rule is that mathematical operations using the binary operators
(+ - * / @)
and the unary operatos (+ - abs)
are only applied to the ['x', 'y', 'z']
columns.
Addition/Subtraction/Multiplication/Division:
If you add a scalar to a Cartesian it is added elementwise onto the
['x', 'y', 'z']
columns.
If you add a 3-dimensional vector, list, tuple... the first element of this
vector is added elementwise to the 'x'
column of the
Cartesian instance and so on.
The last possibility is to add a matrix with
shape=(len(Cartesian), 3)
which is again added elementwise.
The same rules are true for subtraction, division and multiplication.
Matrixmultiplication:
Only leftsided multiplication with a matrix of shape=(n, 3)
,
where n
is a natural number, is supported.
The usual usecase is for example
np.diag([1, 1, -1]) @ cartesian_instance
to mirror on the x-y plane.
Note that if A
is the matrix which is multiplied from the left, and X
is the shape=(n, 3)
-matrix
consisting of the ['x', 'y', 'z']
columns. The actual calculation is:
(A @ X.T).T
(water + 3).view()
(get_rotation_matrix([1, 0, 0], np.radians(90)) @ water).view()
The comparison operators ==
and !=
are supported and require molecules indexed in the same way:
In some cases it is better to test for numerical equality $ |a - b| < \epsilon$. This is done using
allclose
or isclose
(elementwise)
water == water + 1e-15
atom | x | y | z | |
---|---|---|---|---|
1 | True | False | False | False |
2 | True | False | False | False |
3 | True | False | False | False |
4 | True | False | False | False |
5 | True | False | False | False |
6 | True | False | False | False |
cc.xyz_functions.isclose(water, water + 1e-15)
atom | x | y | z | |
---|---|---|---|---|
1 | True | True | True | True |
2 | True | True | True | True |
3 | True | True | True | True |
4 | True | True | True | True |
5 | True | True | True | True |
6 | True | True | True | True |
cc.xyz_functions.allclose(water, water + 1e-15)
True
It is possible to use symbolic expressions from sympy.
import sympy
sympy.init_printing()
x = sympy.Symbol('x')
symb_water = water.copy()
symb_water['x'] = [x + i for i in range(len(symb_water))]
symb_water
atom | x | y | z | |
---|---|---|---|---|
1 | O | $x$ | 0.0 | 0.000000 |
2 | H | $x + 1$ | 0.0 | 0.504284 |
3 | H | $x + 2$ | 0.0 | -0.872893 |
4 | O | $x + 3$ | 0.5 | 0.000000 |
5 | H | $x + 4$ | 0.5 | 0.504284 |
6 | H | $x + 5$ | 0.5 | -0.872893 |
symb_water.subs(x, 2)
atom | x | y | z | |
---|---|---|---|---|
1 | O | 2.0 | 0.0 | 0.000000 |
2 | H | 3.0 | 0.0 | 0.504284 |
3 | H | 4.0 | 0.0 | -0.872893 |
4 | O | 5.0 | 0.5 | 0.000000 |
5 | H | 6.0 | 0.5 | 0.504284 |
6 | H | 7.0 | 0.5 | -0.872893 |
symb_water.subs(x, 2).view()
moved = get_rotation_matrix([1, 2, 3], 1.1) @ middle + 15
moved.view()
m1, m2 = middle.align(moved)
cc.xyz_functions.view([m1, m2])
# If your viewer of choice does not support molden files, you have to call separately:
# m1.view()
# m2.view()
It is possible to detect the point group and symmetrize a molecule. Let's distort a $C_{2,v}$ symmetric molecule and symmetrize it back:
np.random.seed(77)
dist_molecule = small.copy()
dist_molecule += np.random.randn(len(dist_molecule), 3) / 25
dist_molecule.get_pointgroup(tolerance=0.1)
C1
eq = dist_molecule.symmetrize(max_n=25, tolerance=0.3, epsilon=1e-5)
eq['sym_mol'].get_pointgroup(tolerance=0.1)
C2v
a, b = small.align(dist_molecule)
a, c = small.align(eq['sym_mol'])
d1 = (a - b).get_distance_to()
d2 = (a - c).get_distance_to()
cc.xyz_functions.view([a, b, c])
# If your viewer of choice does not support molden files, you have to call separately:
# a.view()
# b.view()
# c.view()
As we can see, the symmetrised molecule is a lot more similar to the original molecule. The average deviation from the original positions decreased by 35 %.
(d1['distance'].sum() - d2['distance'].sum()) / d1['distance'].sum()