Notebook

Gaussian input/output treatments¶

using pymatgen library¶

Germain Salvato Vallverdu, March 23, 2023

import used modules and objects¶

Load some pymatgen specific classes and module

In [1]:

import pymatgen.core

In [2]:

pymatgen.core.__version__

Out[2]:

'2023.3.23'

In [3]:

from pymatgen.core import Molecule, Site
from pymatgen.io.gaussian import GaussianInput, GaussianOutput

Image class of IPython for rendering pictures.

In [4]:

from IPython.display import Image

Generate gaussian inputs¶

Hereafter is the docstring of the GaussianInput class.

In [3]:

help(GaussianInput)

Help on class GaussianInput in module pymatgen.io.gaussian:

class GaussianInput(builtins.object)
 |  GaussianInput(mol, charge=None, spin_multiplicity=None, title=None, functional='HF', basis_set='6-31G(d)', route_parameters=None, input_parameters=None, link0_parameters=None, dieze_tag='#P', gen_basis=None)
 |  
 |  An object representing a Gaussian input file.
 |  
 |  Methods defined here:
 |  
 |  __init__(self, mol, charge=None, spin_multiplicity=None, title=None, functional='HF', basis_set='6-31G(d)', route_parameters=None, input_parameters=None, link0_parameters=None, dieze_tag='#P', gen_basis=None)
 |      Args:
 |          mol: Input molecule. It can either be a Molecule object,
 |              a string giving the geometry in a format supported by Gaussian,
 |              or ``None``. If the molecule is ``None``, you will need to use
 |              read it in from a checkpoint. Consider adding ``CHK`` to the
 |              ``link0_parameters``.
 |          charge: Charge of the molecule. If None, charge on molecule is used.
 |              Defaults to None. This allows the input file to be set a
 |              charge independently from the molecule itself.
 |              If ``mol`` is not a Molecule object, then you must specify a charge.
 |          spin_multiplicity: Spin multiplicity of molecule. Defaults to None,
 |              which means that the spin multiplicity is set to 1 if the
 |              molecule has no unpaired electrons and to 2 if there are
 |              unpaired electrons. If ``mol`` is not a Molecule object, then you
 |               must specify the multiplicity
 |          title: Title for run. Defaults to formula of molecule if None.
 |          functional: Functional for run.
 |          basis_set: Basis set for run.
 |          route_parameters: Additional route parameters as a dict. For example,
 |              {'SP':"", "SCF":"Tight"}
 |          input_parameters: Additional input parameters for run as a dict. Used
 |              for example, in PCM calculations. E.g., {"EPS":12}
 |          link0_parameters: Link0 parameters as a dict. E.g., {"%mem": "1000MW"}
 |          dieze_tag: # preceding the route line. E.g. "#p"
 |          gen_basis: allows a user-specified basis set to be used in a Gaussian
 |              calculation. If this is not None, the attribute ``basis_set`` will
 |              be set to "Gen".
 |  
 |  __str__(self)
 |      Return str(self).
 |  
 |  as_dict(self)
 |      :return: MSONable dict
 |  
 |  get_cart_coords(self)
 |      Return the Cartesian coordinates of the molecule
 |  
 |  get_zmatrix(self)
 |      Returns a z-matrix representation of the molecule.
 |  
 |  to_string(self, cart_coords=False)
 |      Return GaussianInput string
 |      
 |      Option: when cart_coords is set to True return the Cartesian coordinates
 |              instead of the z-matrix
 |  
 |  write_file(self, filename, cart_coords=False)
 |      Write the input string into a file
 |      
 |      Option: see __str__ method
 |  
 |  ----------------------------------------------------------------------
 |  Class methods defined here:
 |  
 |  from_dict(d) from builtins.type
 |      :param d: dict
 |      :return: GaussianInput
 |  
 |  ----------------------------------------------------------------------
 |  Static methods defined here:
 |  
 |  from_file(filename)
 |      Creates GaussianInput from a file.
 |      
 |      Args:
 |          filename: Gaussian input filename
 |      
 |      Returns:
 |          GaussianInput object
 |  
 |  from_string(contents)
 |      Creates GaussianInput from a string.
 |      
 |      Args:
 |          contents: String representing an Gaussian input file.
 |      
 |      Returns:
 |          GaussianInput object
 |  
 |  ----------------------------------------------------------------------
 |  Readonly properties defined here:
 |  
 |  molecule
 |      Returns molecule associated with this GaussianInput.
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)

Examples¶

In this example, we get a $PO_2N^{2-}$ molecule and look for the best place for two Li atoms in the plane of the molecule. This example shows how to set up a Gaussian input file for all possible combinations. Coordinates of this molecule are available in xyz format :

In [5]:

mol = Molecule.from_file("data_nglview/PO2N.xyz")
print(mol.to(fmt="xyz"))
Image("data_nglview/PO2N.png")

4
P1 N1 O2
P -2.714301 0.000003 0.056765
O -1.926896 -1.284476 0.128633
O -1.925428 1.283516 0.129485
N -4.181288 0.000879 -0.079861

Out[5]:

First we set up the three guessed positions for Li atoms in the plane of the molecule. Each position is along the bissector of two PO or PN bonds. We build a list of pymatgen.Site object which represent an atom with its symbol, coordinates ....

In [6]:

posLi = [Site("Li", [-0.424, -0.001,  0.267]),
         Site("Li", [-3.789, -2.031, -0.046]),
         Site("Li", [-3.787,  2.032, -0.044])]
print(posLi)

[Site: Li (-0.4240, -0.0010, 0.2670), Site: Li (-3.7890, -2.0310, -0.0460), Site: Li (-3.7870, 2.0320, -0.0440)]

Now for each combinaison of two Li atoms among three possibilities we will write a gaussian input file. Look at the combinations method of itertools module.

In [7]:

from itertools import combinations

In [8]:

for sites in combinations(posLi, 2):
    print(sites, "\t indices : ", [posLi.index(site) for site in sites])

(Site: Li (-0.4240, -0.0010, 0.2670), Site: Li (-3.7890, -2.0310, -0.0460)) 	 indices :  [0, 1]
(Site: Li (-0.4240, -0.0010, 0.2670), Site: Li (-3.7870, 2.0320, -0.0440)) 	 indices :  [0, 2]
(Site: Li (-3.7890, -2.0310, -0.0460), Site: Li (-3.7870, 2.0320, -0.0440)) 	 indices :  [1, 2]

Hereafter is the loop over combinations which will write all input files.

In [9]:

# Gaussian keywords
route_parms = {"opt": "tight", "Freq": ""}
link0 = {"%Nproc": "5"}
DFT = "B3LYP"

# the loop
for sites in combinations(posLi, 2):
    # load the molecule
    mol = Molecule.from_file("data_nglview/PO2N.xyz")
    
    # set up a label according to the combination
    title = "combination_" + "".join([str(posLi.index(site)) for site in sites])
    print(title)
    
    # add the two Li atoms to the molecule
    for site in sites:
        mol.append(site.specie, site.coords)
        
    # set up the calculation
    gau = GaussianInput(mol, charge=0, spin_multiplicity=1, title=title, 
                        functional=DFT, route_parameters=route_parms, link0_parameters=link0)
    gau.write_file(title + ".com", cart_coords=True)

combination_01
combination_02
combination_12

We can also load an input file and get some information.

In [10]:

gau = GaussianInput.from_file("combination_01.com")
print(gau.basis_set, gau.functional)

6-31G(d) B3LYP

In [11]:

print(gau.to_string(cart_coords=True))

%Nproc=5
#P B3LYP/6-31G(d) Freq opt=tight

combination_01

0 1
P -2.714301 0.000003 0.056765
O -1.926896 -1.284476 0.128633
O -1.925428 1.283516 0.129485
N -4.181288 0.000879 -0.079861
Li -0.424000 -0.001000 0.267000
Li -3.789000 -2.031000 -0.046000

Post-treatments of Gaussian outputs¶

Docstring of the `GaussianOutput` class¶

In [11]:

help(GaussianOutput)

Help on class GaussianOutput in module pymatgen.io.gaussian:

class GaussianOutput(builtins.object)
 |  GaussianOutput(filename)
 |  
 |  Parser for Gaussian output files.
 |  
 |  .. note::
 |  
 |      Still in early beta.
 |  
 |  Attributes:
 |  .. attribute:: structures
 |  
 |      All structures from the calculation in the standard orientation. If the
 |      symmetry is not considered, the standard orientation is not printed out
 |      and the input orientation is used instead. Check the `standard_orientation`
 |      attribute.
 |  
 |  .. attribute:: structures_input_orientation
 |  
 |      All structures from the calculation in the input orientation or the
 |      Z-matrix orientation (if an opt=z-matrix was requested).
 |  
 |  .. attribute:: opt_structures
 |  
 |      All optimized structures from the calculation in the standard orientation,
 |      if the attribute 'standard_orientation' is True, otherwise in the input
 |      or the Z-matrix orientation.
 |  
 |  .. attribute:: energies
 |  
 |      All energies from the calculation.
 |  
 |  .. attribute:: eigenvalues
 |  
 |      List of eigenvalues for the last geometry
 |  
 |  .. attribute:: MO_coefficients
 |  
 |      Matrix of MO coefficients for the last geometry
 |  
 |  .. attribute:: cart_forces
 |  
 |      All Cartesian forces from the calculation.
 |  
 |  .. attribute:: frequencies
 |  
 |      A list for each freq calculation and for each mode of a dict with
 |      {
 |          "frequency": freq in cm-1,
 |          "symmetry": symmetry tag
 |          "r_mass": Reduce mass,
 |          "f_constant": force constant,
 |          "IR_intensity": IR Intensity,
 |          "mode": normal mode
 |       }
 |  
 |      The normal mode is a 1D vector of dx, dy dz of each atom.
 |  
 |  .. attribute:: hessian
 |  
 |      Matrix of second derivatives of the energy with respect to cartesian
 |      coordinates in the **input orientation** frame. Need #P in the
 |      route section in order to be in the output.
 |  
 |  .. attribute:: properly_terminated
 |  
 |      True if run has properly terminated
 |  
 |  .. attribute:: is_pcm
 |  
 |      True if run is a PCM run.
 |  
 |  .. attribute:: is_spin
 |  
 |      True if it is an unrestricted run
 |  
 |  .. attribute:: stationary_type
 |  
 |      If it is a relaxation run, indicates whether it is a minimum (Minimum)
 |      or a saddle point ("Saddle").
 |  
 |  .. attribute:: corrections
 |  
 |      Thermochemical corrections if this run is a Freq run as a dict. Keys
 |      are "Zero-point", "Thermal", "Enthalpy" and "Gibbs Free Energy"
 |  
 |  .. attribute:: functional
 |  
 |      Functional used in the run.
 |  
 |  .. attribute:: basis_set
 |  
 |      Basis set used in the run
 |  
 |  .. attribute:: route
 |  
 |      Additional route parameters as a dict. For example,
 |          {'SP':"", "SCF":"Tight"}
 |  
 |  .. attribute:: dieze_tag
 |  
 |      # preceding the route line, e.g. "#P"
 |  
 |  .. attribute:: link0
 |  
 |      Link0 parameters as a dict. E.g., {"%mem": "1000MW"}
 |  
 |  .. attribute:: charge
 |  
 |      Charge for structure
 |  
 |  .. attribute:: spin_multiplicity
 |  
 |      Spin multiplicity for structure
 |  
 |  .. attribute:: num_basis_func
 |  
 |      Number of basis functions in the run.
 |  
 |  .. attribute:: electrons
 |  
 |      number of alpha and beta electrons as (N alpha, N beta)
 |  
 |  .. attribute:: pcm
 |  
 |      PCM parameters and output if available.
 |  
 |  .. attribute:: errors
 |  
 |      error if not properly terminated (list to be completed in error_defs)
 |  
 |  .. attribute:: Mulliken_charges
 |  
 |      Mulliken atomic charges
 |  
 |  .. attribute:: eigenvectors
 |  
 |      Matrix of shape (num_basis_func, num_basis_func). Each column is an
 |      eigenvectors and contains AO coefficients of an MO.
 |  
 |      eigenvectors[Spin] = mat(num_basis_func, num_basis_func)
 |  
 |  .. attribute:: molecular_orbital
 |  
 |      MO development coefficients on AO in a more convenient array dict
 |      for each atom and basis set label.
 |  
 |      mo[Spin][OM j][atom i] = {AO_k: coeff, AO_k: coeff ... }
 |  
 |  .. attribute:: atom_basis_labels
 |  
 |      Labels of AO for each atoms. These labels are those used in the output
 |      of molecular orbital coefficients (POP=Full) and in the
 |      molecular_orbital array dict.
 |  
 |      atom_basis_labels[iatom] = [AO_k, AO_k, ...]
 |  
 |  .. attribute:: resumes
 |  
 |      List of gaussian data resume given at the end of the output file before
 |      the quotation. The resumes are given as string.
 |  
 |  .. attribute:: title
 |  
 |      Title of the gaussian run.
 |  
 |  .. attribute:: standard_orientation
 |  
 |      If True, the geometries stored in the structures are in the standard
 |      orientation. Else, the geometries are in the input orientation.
 |  
 |  .. attribute:: bond_orders
 |  
 |      Dict of bond order values read in the output file such as:
 |      {(0, 1): 0.8709, (1, 6): 1.234, ...}
 |  
 |      The keys are the atom indexes and the values are the Wiberg bond indexes
 |      that are printed using `pop=NBOREAD` and `$nbo bndidx $end`.
 |  
 |  Methods:
 |  .. method:: to_input()
 |  
 |      Return a GaussianInput object using the last geometry and the same
 |      calculation parameters.
 |  
 |  .. method:: read_scan()
 |  
 |      Read a potential energy surface from a gaussian scan calculation.
 |  
 |  .. method:: get_scan_plot()
 |  
 |      Get a matplotlib plot of the potential energy surface
 |  
 |  .. method:: save_scan_plot()
 |  
 |      Save a matplotlib plot of the potential energy surface to a file
 |  
 |  Methods defined here:
 |  
 |  __init__(self, filename)
 |      Args:
 |          filename: Filename of Gaussian output file.
 |  
 |  as_dict(self)
 |      JSON-serializable dict representation.
 |  
 |  get_scan_plot(self, coords=None)
 |      Get a matplotlib plot of the potential energy surface.
 |      
 |      Args:
 |          coords: internal coordinate name to use as abscissa.
 |  
 |  get_spectre_plot(self, sigma=0.05, step=0.01)
 |      Get a matplotlib plot of the UV-visible xas. Transitions are plotted
 |      as vertical lines and as a sum of normal functions with sigma with. The
 |      broadening is applied in energy and the xas is plotted as a function
 |      of the wavelength.
 |      
 |      Args:
 |          sigma: Full width at half maximum in eV for normal functions.
 |          step: bin interval in eV
 |      
 |      Returns:
 |          A dict: {"energies": values, "lambda": values, "xas": values}
 |                  where values are lists of abscissa (energies, lamba) and
 |                  the sum of gaussian functions (xas).
 |          A matplotlib plot.
 |  
 |  read_excitation_energies(self)
 |      Read a excitation energies after a TD-DFT calculation.
 |      
 |      Returns:
 |          A list: A list of tuple for each transition such as
 |                  [(energie (eV), lambda (nm), oscillatory strength), ... ]
 |  
 |  read_scan(self)
 |      Read a potential energy surface from a gaussian scan calculation.
 |      
 |      Returns:
 |          A dict: {"energies": [ values ],
 |                   "coords": {"d1": [ values ], "A2", [ values ], ... }}
 |      
 |          "energies" are the energies of all points of the potential energy
 |          surface. "coords" are the internal coordinates used to compute the
 |          potential energy surface and the internal coordinates optimized,
 |          labelled by their name as defined in the calculation.
 |  
 |  save_scan_plot(self, filename='scan.pdf', img_format='pdf', coords=None)
 |      Save matplotlib plot of the potential energy surface to a file.
 |      
 |      Args:
 |          filename: Filename to write to.
 |          img_format: Image format to use. Defaults to EPS.
 |          coords: internal coordinate name to use as abcissa.
 |  
 |  save_spectre_plot(self, filename='spectre.pdf', img_format='pdf', sigma=0.05, step=0.01)
 |      Save matplotlib plot of the spectre to a file.
 |      
 |      Args:
 |          filename: Filename to write to.
 |          img_format: Image format to use. Defaults to EPS.
 |          sigma: Full width at half maximum in eV for normal functions.
 |          step: bin interval in eV
 |  
 |  to_input(self, mol=None, charge=None, spin_multiplicity=None, title=None, functional=None, basis_set=None, route_parameters=None, input_parameters=None, link0_parameters=None, dieze_tag=None, cart_coords=False)
 |      Create a new input object using by default the last geometry read in
 |      the output file and with the same calculation parameters. Arguments
 |      are the same as GaussianInput class.
 |      
 |      Returns:
 |          gaunip (GaussianInput) : the gaussian input object
 |  
 |  ----------------------------------------------------------------------
 |  Readonly properties defined here:
 |  
 |  final_energy
 |      :return: Final energy in Gaussian output.
 |  
 |  final_structure
 |      :return: Final structure in Gaussian output.
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)

Examples :¶

Used the GaussianOutput class to read a Gaussian output file.

In [12]:

logfile = GaussianOutput("data_nglview/config_0123.log")

Is termination Normal ?

In [13]:

logfile.properly_terminated

Out[13]:

True

Display final energy or mulliken charges :

In [14]:

logfile.final_energy

Out[14]:

-1239.01264675

In [15]:

logfile.Mulliken_charges

Out[15]:

{1: ['P', 1.751758],
 2: ['P', 1.203271],
 3: ['O', -0.722993],
 4: ['O', -0.771242],
 5: ['O', -0.804145],
 6: ['O', -0.771148],
 7: ['O', -0.633752],
 8: ['O', -0.633729],
 9: ['O', -0.519418],
 10: ['Li', 0.528585],
 11: ['Li', 0.528596],
 12: ['Li', 0.421974],
 13: ['Li', 0.422242]}

You can extract the coordinates of the final structure. All structures are pymatgen.Molecule objects, see here or the reference guide. You can do lot of stuff with this object such as : neighbors list, compute distances or the distance matrix, symmetry operation, atomic substitution ...

In [16]:

print(logfile.final_structure.to(fmt="xyz"))

13
Li4 P2 O7
P 1.803152 -0.000062 -0.074603
P -2.714301 0.000003 0.056765
O 0.711492 0.000020 -1.166577
O 1.626159 1.347642 0.712312
O 3.338973 -0.000234 -0.476901
O 1.625851 -1.347548 0.712612
O -1.926896 -1.284476 0.128633
O -1.925428 1.283516 0.129485
O -4.181288 0.000879 -0.079861
Li 3.387513 1.698635 0.310029
Li 3.387174 -1.698958 0.310641
Li -0.135176 -1.487921 -0.211882
Li -0.134065 1.489076 -0.212137

You would prefer a gaussian input format with a z-matrix of the 42th geometrical optimization step.

In [17]:

print(logfile.structures[41].to(fmt="gjf"))

#P HF/6-31G(d) 

Li4 P2 O7

0 1
P
P 1 B1
O 1 B2 2 A2
O 1 B3 3 A3 2 D3
O 1 B4 4 A4 3 D4
O 1 B5 5 A5 3 D5
O 2 B6 3 A6 6 D6
O 2 B7 7 A7 3 D7
O 2 B8 7 A8 8 D8
Li 4 B9 5 A9 1 D9
Li 6 B10 5 A10 1 D10
Li 7 B11 3 A11 6 D11
Li 8 B12 4 A12 3 D12

B1=4.376288
B2=1.543372
A2=36.067295
B3=1.572484
A3=105.757762
D3=-63.587256
B4=1.587877
A4=103.530048
D4=127.302261
B5=1.570339
A5=103.613837
D5=118.342854
B6=1.509599
A6=70.094274
D6=-30.800762
B7=1.509898
A7=116.635764
D7=-54.758827
B8=1.474749
A8=121.618553
D8=-179.642556
B9=1.844669
A9=48.464768
D9=-173.301681
B10=1.844418
A10=48.529403
D10=176.422774
B11=1.836204
A11=36.882001
D11=-26.625480
B12=1.836768
A12=14.624477
D12=-64.000507

You can plot the geometrical convergence (here using plotly).

In [18]:

import plotly.graph_objs as go

In [19]:

fig = go.Figure([
    go.Scatter(x=[i for i in range(len(logfile.energies))], y=logfile.energies)
],
    layout=go.Layout(
        xaxis=dict(title="step"), 
        yaxis=dict(title="Energy (ua)"),
        template="plotly_white",
        height=500,
    )
)
fig.show()

In [20]:

import matplotlib.pyplot as plt

In [21]:

plt.plot(logfile.energies, "o--")
plt.xlabel("step")
plt.ylabel("Energy (ua)")

Out[21]:

Text(0, 0.5, 'Energy (ua)')