Back to the main Index

First (basic) lesson with Abinit and AbiPy

The H2 molecule


This lesson aims at showing how to get the following physical properties:

  • the (pseudo) total energy
  • the bond length
  • the charge density
  • the atomisation energy

This tutorial is a complement to the standard ABINIT tutorial on H$_2$. Here, powerful flow and visualisation procedures will be demonstrated. Still, some basic understanding of the stand-alone working of ABINIT is a prerequisite. Also, in order to fully benefit from this Abipy tutorial, other more basic Abipy tutorials should have been followed, as suggested in the abitutorials index page.

There are three methodologies to compute the optimal distance between the two Hydrogen atoms. One could:

  • compute the total energy for different values of the interatomic distance, make a fit through the different points, and determine the minimum of the fitting function;
  • compute the forces for different values of the interatomic distance, make a fit through the different values, and determine the zero of the fitting function;
  • use an automatic algorithm for minimizing the energy (or finding the zero of forces).

In this AbiPy notebook, we will be focusing on the first approach. More specifically we will build an AbiPy Flow to compute the energy and the forces in the $H_2$ molecule for different values of the interatomic distance. This exercise will allow us to learn how to generate multiple input files using AbiPy and how to analyze multiple ground-state calculations with the AbiPy robots.

Table of Contents

[back to top]

Our first AbiPy function

[back to top]

In [2]:
# Use this at the beginning of your script so that your code will be compatible with python3
from __future__ import print_function, division, unicode_literals

import numpy as np  

import warnings 
warnings.filterwarnings("ignore")  # Ignore warnings

from abipy import abilab
abilab.enable_notebook() # This line tells AbiPy we are running inside a notebook

# This line configures matplotlib to show figures embedded in the notebook.
# Replace `inline` with `notebook` in classic notebook
%matplotlib inline   

# Option available in jupyterlab. See https://github.com/matplotlib/jupyter-matplotlib
#%matplotlib widget 

We need a function that generates an input file for a GS calculations for the $H_2$ molecule in a big box. Ideally a function that receives the distance x, the cutoff energy ecut and the size of the big box in input so that we can customize the output and generate multiple input objects easily.

Fortunately we already have such a function in the lesson_base1.py module. Let's import it and look at the code:

In [3]:
from lesson_base1 import gs_input
abilab.print_source(gs_input)
Out[3]:

def gs_input(x=0.7, ecut=10, acell=(10, 10, 10)):
    """
    This function builds an AbinitInput object to compute the total energy
    of the H2 molecule in a big box.

    Args:
        x: Position of the first Hydrogen along the x-axis in Cartesian coordinates.
           The second Hydrogen is located at [-x, 0, 0]
        ecut: Cutoff energy in Ha.
        acell: Lengths of the primitive vectors (in Bohr)

    Returns:
        AbinitInput object.
    """
    # Build structure from dictionary with input variables.
    structure = abilab.Structure.from_abivars(
        ntypat=1,                           # There is only one type of atom.
        znucl=1,                            # Atomic numbers of the type(s) of atom.
        natom=2,                            # There are two atoms.
        typat=(1, 1),                       # They both are of type 1, that is, Hydrogen.
        xcart=[-x, 0.0, 0.0,                # Cartesian coordinates of atom 1, in Bohr.
               +x, 0.0, 0.0],               # second atom.
        acell=acell,                        # Lengths of the primitive vectors (in Bohr).
        rprim=[1, 0, 0, 0, 1, 0, 0, 0, 1]   # Orthogonal primitive vectors (default).
    )

    # Build AbinitInput from structure and pseudo(s) taken from AbiPy package.
    inp = abilab.AbinitInput(structure=structure, pseudos=abidata.pseudos("01h.pspgth"))

    # Set value of other variables.
    inp.set_vars(
        ecut=ecut,
        nband=1,
        diemac=2.0,
        toldfe=1e-6,
        prtwf=-1,
        iomode=3
    )

    # Define k-point sampling.
    inp.set_kmesh(ngkpt=(1, 1, 1), shiftk=(0, 0, 0))

    return inp

If the function is called without arguments, the default values (specified in the prototype) are used. Let's try:

In [4]:
gsinp = gs_input()
print("The value of ecut is:", gsinp["ecut"])
The value of ecut is: 10

The AbinitInput is a dict-like object whose usage is documented in this notebook. Inside jupyter, we can get the HTML representation of the input with:

In [5]:
gsinp
Out[5]:
############################################################################################
# SECTION: basic
############################################################################################
ecut 10
nband 1
toldfe 1e-06
ngkpt 1 1 1
kptopt 1
nshiftk 1
shiftk 0 0 0
############################################################################################
# SECTION: dev
############################################################################################
iomode 3
############################################################################################
# SECTION: files
############################################################################################
prtwf -1
############################################################################################
# SECTION: gstate
############################################################################################
diemac 2.0
############################################################################################
# STRUCTURE
############################################################################################
natom 2
ntypat 1
typat 1 1
znucl 1
xred
-0.0700000000 0.0000000000 0.0000000000
0.0700000000 0.0000000000 0.0000000000
acell 1.0 1.0 1.0
rprim
10.0000000000 0.0000000000 0.0000000000
0.0000000000 10.0000000000 0.0000000000
0.0000000000 0.0000000000 10.0000000000

The input object can be converted into a string. More importantly, an AbinitInput has an AbiPy structure (see Structure notebook), a list of pseudopotential objects and provides several methods to facilitate the specification of input variables.

In [6]:
print(gsinp.structure)
print("The big box volume is:", gsinp.structure.volume)
Full Formula (H2)
Reduced Formula: H2
abc   :   5.291772   5.291772   5.291772
angles:  90.000000  90.000000  90.000000
Sites (2)
  #  SP        a    b    c
---  ----  -----  ---  ---
  0  H     -0.07    0    0
  1  H      0.07    0    0
The big box volume is: 148.18471127642286
In [7]:
gsinp.structure.plot();

Let's print some info about our pseudopotentials:

In [8]:
print(gsinp.pseudos[0])
<NcAbinitPseudo: 01h.pspgth>
  summary: Goedecker-Teter-Hutter  Wed May  8 14:27:44 EDT 1996
  number of valence electrons: 1.0
  maximum angular momentum: s
  angular momentum for local part: s
  XC correlation: LDA_XC_TETER93
  supports spin-orbit: False
  radius for non-linear core correction: 0.0
  hint for low accuracy: ecut: 0.0, pawecutdg: 0.0
  hint for normal accuracy: ecut: 0.0, pawecutdg: 0.0
  hint for high accuracy: ecut: 0.0, pawecutdg: 0.0

Computation of the interatomic distance

[back to top]

At this point, we can use gs_input to generate an Abinit Flow to compute the total energy and the forces of H-H with different interatomic distances. We have already prepared such a function in build_flow, let's have a look at the code:

In [9]:
from lesson_base1 import build_flow
abilab.print_source(build_flow)
Out[9]:

def build_flow(options):
    """
    Generate a flow to compute the total energy and forces for the H2 molecule in a big box
    as a function of the interatomic distance.

    Args:
        options: Command line options.

    Return:
        Flow object.
    """
    inputs = [gs_input(x=x) for x in np.linspace(0.5, 1.025, 21)]

    workdir = options.workdir if (options and options.workdir) else "flow_h2"

    return flowtk.Flow.from_inputs(workdir, inputs)

Note that we are working at fixed ecut and acell, only the H-H distance is modified. Let's call the function to build our flow:

In [10]:
flow = build_flow(options=None)
flow.get_graphviz()
Out[10]:
flow Flow, node_id=345716, workdir=flow_h2 clusterw0 Work (w0) w0_t0 w0_t0 ScfTask w0_t1 w0_t1 ScfTask w0_t2 w0_t2 ScfTask w0_t3 w0_t3 ScfTask w0_t4 w0_t4 ScfTask w0_t5 w0_t5 ScfTask w0_t6 w0_t6 ScfTask w0_t7 w0_t7 ScfTask w0_t8 w0_t8 ScfTask w0_t9 w0_t9 ScfTask w0_t10 w0_t10 ScfTask w0_t11 w0_t11 ScfTask w0_t12 w0_t12 ScfTask w0_t13 w0_t13 ScfTask w0_t14 w0_t14 ScfTask w0_t15 w0_t15 ScfTask w0_t16 w0_t16 ScfTask w0_t17 w0_t17 ScfTask w0_t18 w0_t18 ScfTask w0_t19 w0_t19 ScfTask w0_t20 w0_t20 ScfTask

Ok, so far so sood. With just three lines of codes and our gs_input function, we managed to construct an AbiPy flow for the $H_2$ molecule. Let's write some python code to check that we really obtained what we had in mind:

In [11]:
inputs = [task.input for task in flow.iflat_tasks()]

print("ecuts:\n", [inp["ecut"] for inp in inputs])

print("vols:\n", ["%.1f" % inp.structure.volume for inp in inputs])

def hh_dist(structure):
    return np.linalg.norm(structure.cart_coords[1] - structure.cart_coords[0])

from pprint import pprint
print("h-h [Ang]:\n", ["%.3f" % hh_dist(inp.structure) for inp in inputs])
ecuts:
 [10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10]
vols:
 ['148.2', '148.2', '148.2', '148.2', '148.2', '148.2', '148.2', '148.2', '148.2', '148.2', '148.2', '148.2', '148.2', '148.2', '148.2', '148.2', '148.2', '148.2', '148.2', '148.2', '148.2']
h-h [Ang]:
 ['0.529', '0.557', '0.585', '0.613', '0.640', '0.668', '0.696', '0.724', '0.751', '0.779', '0.807', '0.835', '0.863', '0.890', '0.918', '0.946', '0.974', '1.001', '1.029', '1.057', '1.085']

At this point, we could run the flow in the notebook by just calling:

flow.make_scheduler().start()

or, alternatively, execute the lesson_base1.py script to build the directory with the flow and then use:

abirun.py flow_h2 scheduler

inside the terminal.

Let's assume the flow has been already executed and let's focus on the analysis of the final results.

Analyzing the main output file

[back to top]

First of all, it is always a good idea to check whether the SCF cycle is converged. Obviously one could open the main output file, find the SCF iterations and look for warnings but there is a much faster (and better) way to do that with AbiPy:

In [12]:
abo = abilab.abiopen("flow_h2/w0/t0/run.abo")
print(abo)
ndtset: 1, completed: True
Full Formula (H2)
Reduced Formula: H2
abc   :   5.291772   5.291772   5.291772
angles:  90.000000  90.000000  90.000000
Sites (2)
  #  SP        a    b    c
---  ----  -----  ---  ---
  0  H     -0.05    0    0
  1  H      0.05    0    0

Abinit Spacegroup: spgid: 123, num_spatial_symmetries: 16, has_timerev: True, symmorphic: False

========================= Dimensions of calculation =========================
         intxc  ionmov  iscf  lmnmax  lnmax  mgfft  mpssoang  mqgrid  natom  \
dataset                                                                       
1            0       0     7       1      1     30         1    3001      2   

         nloc_mem  nspden  nspinor  nsppol  nsym  n1xccc  ntypat  occopt  \
dataset                                                                    
1               1       1        1       1    16       0       1       1   

         xclevel  mband  mffmem  mkmem  mpw   nfft  nkpt  mem_per_proc_mb  \
dataset                                                                     
1              1      1       1      1  752  27000     1            7.796   

         wfk_size_mb  denpot_size_mb spg_symbol  spg_number  \
dataset                                                       
1              0.013           0.208     P4/mmm         123   

                                bravais  
dataset                                  
1        Bravais tP (primitive tetrag.)  
 

To get the list of Warnings/Comments/Errors:

In [13]:
print(abo.events)
Events found in /Users/gmatteo/git_repos/abitutorials/abitutorials/base1/flow_h2/w0/t0/run.abo

num_errors: 0, num_warnings: 0, num_comments: 0, completed: False

To plot the SCF cycle, use:

In [14]:
abo.plot();
Loading all matplotlib figures before showing them. It may take some time...
All figures in memory, elapsed time: 0.872 s

Since this is not a structural relaxation, the initial and final structures must be equal:

In [15]:
abo.initial_structure == abo.final_structure
Out[15]:
True

The basic dimensions and parameters of the run can be extracted from the output file with:

In [16]:
abo.get_dims_spginfo_dataset()
Out[16]:
(OrderedDict([(1,
               OrderedDict([('intxc', 0),
                            ('ionmov', 0),
                            ('iscf', 7),
                            ('lmnmax', 1),
                            ('lnmax', 1),
                            ('mgfft', 30),
                            ('mpssoang', 1),
                            ('mqgrid', 3001),
                            ('natom', 2),
                            ('nloc_mem', 1),
                            ('nspden', 1),
                            ('nspinor', 1),
                            ('nsppol', 1),
                            ('nsym', 16),
                            ('n1xccc', 0),
                            ('ntypat', 1),
                            ('occopt', 1),
                            ('xclevel', 1),
                            ('mband', 1),
                            ('mffmem', 1),
                            ('mkmem', 1),
                            ('mpw', 752),
                            ('nfft', 27000),
                            ('nkpt', 1),
                            ('mem_per_proc_mb', 7.796),
                            ('wfk_size_mb', 0.013),
                            ('denpot_size_mb', 0.208)]))]),
 OrderedDict([(1,
               {'bravais': 'Bravais tP (primitive tetrag.)',
                'spg_number': 123,
                'spg_symbol': 'P4/mmm'})]))

Within the shell, one can use:

abiview.py abo flow_h2/w0/t0/run.abo

to plot the SCF cycle or

abiopen.py flow_h2/w0/t0/run.abo

to open the file and start the ipython terminal

Extracting results from the GSR files

[back to top]

The ground-state results are saved in the GSR.nc files whose API is extensively discussed in the GSR notebook.

Let's have a look at the results produced by the first task:

In [17]:
with abilab.abiopen("flow_h2/w0/t0/outdata/out_GSR.nc") as gsr:
    print(gsr)
================================= File Info =================================
Name: out_GSR.nc
Directory: /Users/gmatteo/git_repos/abitutorials/abitutorials/base1/flow_h2/w0/t0/outdata
Size: 8.20 kb
Access Time: Sun Aug 12 00:14:53 2018
Modification Time: Tue Oct 10 21:27:35 2017
Change Time: Tue Oct 10 21:27:35 2017

================================= Structure =================================
Full Formula (H2)
Reduced Formula: H2
abc   :   5.291772   5.291772   5.291772
angles:  90.000000  90.000000  90.000000
Sites (2)
  #  SP        a    b    c  cartesian_forces
---  ----  -----  ---  ---  --------------------------------------------------
  0  H     -0.05    0    0  [-19.54779666  -0.          -0.        ] eV ang^-1
  1  H      0.05    0    0  [19.54779666 -0.         -0.        ] eV ang^-1

Abinit Spacegroup: spgid: 123, num_spatial_symmetries: 16, has_timerev: True, symmorphic: False

Stress tensor (Cartesian coordinates in GPa):
[[-10.75762969   0.           0.        ]
 [  0.           1.60903288   0.        ]
 [  0.           0.           1.60903288]]
Pressure: 2.513 (GPa)
Energy: -28.21337426 (eV)

============================== Electronic Bands ==============================
Number of electrons: 2.0, Fermi level: -11.082 (eV)
nsppol: 1, nkpt: 1, mband: 1, nspinor: 1, nspden: 1
smearing scheme: none, tsmear_eV: 0.272, occopt: 1
Bandwidth: 0.000 (eV)
Valence maximum located at:
    spin=0, kpt=[+0.000, +0.000, +0.000], weight: 1.000, band=0, eig=-11.082, occ=2.000

As we can see from the previous output, the GSR file contains information about the crystalline structure, forces, stresses as well as the KS band structure. In the jargon of object-oriented programming, one says that a GSRFile has a Structure object:

    gsr.structure

and has an ElectronBands object:

    gsr.ebands

This means that if you learn how to use the methods provided by structure and ebands, then you can easily get these objects from the GSR file and use this API to post-process the results. This is a general philosophy of AbiPy: every netcdf file object returned by abiopen contains other objects (the structure is always available, while the presence of other objects depend of the particular file). Remember this point because we will use it in the other lessons.

Ok, now we know how to open and extract information from one GSR file. In this tutorial, however, we need to analyze multiple GSR files! If you are familiar with python, it should not be difficult to write a for loop that iterates over a list of GSR files, extracts the total energy with the corresponding volume and creates two lists that can be used to plot $E(d_{H-H})$. This kind of operations are, however, very common and AbiPy provides a high-level interface (robots) to operate on multiple files and post-process the data.

In the simplest case, the Robot finds all files of a particular type located within a directory tree, stores all the data in memory and exposes methods to extract/post-process the results.

In [18]:
robot = abilab.GsrRobot.from_dir("flow_h2")
robot
Out[18]:
  1. w0/t0/outdata/out_GSR.nc
  2. w0/t1/outdata/out_GSR.nc
  3. w0/t10/outdata/out_GSR.nc
  4. w0/t11/outdata/out_GSR.nc
  5. w0/t12/outdata/out_GSR.nc
  6. w0/t13/outdata/out_GSR.nc
  7. w0/t14/outdata/out_GSR.nc
  8. w0/t15/outdata/out_GSR.nc
  9. w0/t16/outdata/out_GSR.nc
  10. w0/t17/outdata/out_GSR.nc
  11. w0/t18/outdata/out_GSR.nc
  12. w0/t19/outdata/out_GSR.nc
  13. w0/t2/outdata/out_GSR.nc
  14. w0/t20/outdata/out_GSR.nc
  15. w0/t3/outdata/out_GSR.nc
  16. w0/t4/outdata/out_GSR.nc
  17. w0/t5/outdata/out_GSR.nc
  18. w0/t6/outdata/out_GSR.nc
  19. w0/t7/outdata/out_GSR.nc
  20. w0/t8/outdata/out_GSR.nc
  21. w0/t9/outdata/out_GSR.nc
In [19]:
table = robot.get_dataframe()

The table contains several columns:

In [20]:
table.keys()
Out[20]:
Index(['formula', 'natom', 'alpha', 'beta', 'gamma', 'a', 'b', 'c', 'volume',
       'abispg_num', 'spglib_symb', 'spglib_num', 'spglib_lattice_type',
       'energy', 'pressure', 'max_force', 'ecut', 'pawecutdg', 'tsmear',
       'nkpt', 'nsppol', 'nspinor', 'nspden'],
      dtype='object')

Inside the notebook, we can visualize the table with:

In [21]:
table
Out[21]:
formula natom alpha beta gamma a b c volume abispg_num ... energy pressure max_force ecut pawecutdg tsmear nkpt nsppol nspinor nspden
w0/t0/outdata/out_GSR.nc H2 2 90.0 90.0 90.0 5.291772 5.291772 5.291772 148.184711 123 ... -28.213374 2.513188 19.547797 10.0 -1.0 0.01 1 1 1 1
w0/t1/outdata/out_GSR.nc H2 2 90.0 90.0 90.0 5.291772 5.291772 5.291772 148.184711 123 ... -28.697681 1.960239 15.457585 10.0 -1.0 0.01 1 1 1 1
w0/t10/outdata/out_GSR.nc H2 2 90.0 90.0 90.0 5.291772 5.291772 5.291772 148.184711 123 ... -30.091282 -1.106892 0.050198 10.0 -1.0 0.01 1 1 1 1
w0/t11/outdata/out_GSR.nc H2 2 90.0 90.0 90.0 5.291772 5.291772 5.291772 148.184711 123 ... -30.081045 -1.306064 0.669857 10.0 -1.0 0.01 1 1 1 1
w0/t12/outdata/out_GSR.nc H2 2 90.0 90.0 90.0 5.291772 5.291772 5.291772 148.184711 123 ... -30.054897 -1.486643 1.198657 10.0 -1.0 0.01 1 1 1 1
w0/t13/outdata/out_GSR.nc H2 2 90.0 90.0 90.0 5.291772 5.291772 5.291772 148.184711 123 ... -30.015128 -1.650521 1.652942 10.0 -1.0 0.01 1 1 1 1
w0/t14/outdata/out_GSR.nc H2 2 90.0 90.0 90.0 5.291772 5.291772 5.291772 148.184711 123 ... -29.963621 -1.799429 2.045489 10.0 -1.0 0.01 1 1 1 1
w0/t15/outdata/out_GSR.nc H2 2 90.0 90.0 90.0 5.291772 5.291772 5.291772 148.184711 123 ... -29.901949 -1.934773 2.386232 10.0 -1.0 0.01 1 1 1 1
w0/t16/outdata/out_GSR.nc H2 2 90.0 90.0 90.0 5.291772 5.291772 5.291772 148.184711 123 ... -29.831440 -2.058169 2.682742 10.0 -1.0 0.01 1 1 1 1
w0/t17/outdata/out_GSR.nc H2 2 90.0 90.0 90.0 5.291772 5.291772 5.291772 148.184711 123 ... -29.753241 -2.170734 2.940714 10.0 -1.0 0.01 1 1 1 1
w0/t18/outdata/out_GSR.nc H2 2 90.0 90.0 90.0 5.291772 5.291772 5.291772 148.184711 123 ... -29.668361 -2.273618 3.164346 10.0 -1.0 0.01 1 1 1 1
w0/t19/outdata/out_GSR.nc H2 2 90.0 90.0 90.0 5.291772 5.291772 5.291772 148.184711 123 ... -29.577707 -2.367714 3.356837 10.0 -1.0 0.01 1 1 1 1
w0/t2/outdata/out_GSR.nc H2 2 90.0 90.0 90.0 5.291772 5.291772 5.291772 148.184711 123 ... -29.079272 1.463278 12.125932 10.0 -1.0 0.01 1 1 1 1
w0/t20/outdata/out_GSR.nc H2 2 90.0 90.0 90.0 5.291772 5.291772 5.291772 148.184711 123 ... -29.482112 -2.453928 3.520449 10.0 -1.0 0.01 1 1 1 1
w0/t3/outdata/out_GSR.nc H2 2 90.0 90.0 90.0 5.291772 5.291772 5.291772 148.184711 123 ... -29.377087 1.016114 9.404751 10.0 -1.0 0.01 1 1 1 1
w0/t4/outdata/out_GSR.nc H2 2 90.0 90.0 90.0 5.291772 5.291772 5.291772 148.184711 123 ... -29.606392 0.612993 7.176415 10.0 -1.0 0.01 1 1 1 1
w0/t5/outdata/out_GSR.nc H2 2 90.0 90.0 90.0 5.291772 5.291772 5.291772 148.184711 123 ... -29.779514 0.249295 5.346041 10.0 -1.0 0.01 1 1 1 1
w0/t6/outdata/out_GSR.nc H2 2 90.0 90.0 90.0 5.291772 5.291772 5.291772 148.184711 123 ... -29.906409 -0.079134 3.837290 10.0 -1.0 0.01 1 1 1 1
w0/t7/outdata/out_GSR.nc H2 2 90.0 90.0 90.0 5.291772 5.291772 5.291772 148.184711 123 ... -29.995127 -0.375908 2.588480 10.0 -1.0 0.01 1 1 1 1
w0/t8/outdata/out_GSR.nc H2 2 90.0 90.0 90.0 5.291772 5.291772 5.291772 148.184711 123 ... -30.052173 -0.644270 1.549818 10.0 -1.0 0.01 1 1 1 1
w0/t9/outdata/out_GSR.nc H2 2 90.0 90.0 90.0 5.291772 5.291772 5.291772 148.184711 123 ... -30.082807 -0.887078 0.681054 10.0 -1.0 0.01 1 1 1 1

21 rows × 23 columns

Great! We managed to get a nice table with lot of useful results with just 3 lines of code and the robot! There are however two problems:

  • The rows of the table are not ordered by increasing H-H distance (files are sorted alphabetically)

  • Our dataframe contains the energy of the different configurations but we would like to plot the energy as a function of the H-H distance

Well, robots can do a lot of hard work but they are a little bit stupid so we have to tell them what to do with the data. More specifically we need a way to tell the robot that, for each GSR file, it should get the crystalline structure, compute the distance between the first and the second atom and insert the result in our table in a given column. This kind of tasks are usually executed with callbacks i.e. functions that are passed in input and automatically executed by the framework at runtime.

Let's look at the documentation of robot.get_dataframe:

In [22]:
abilab.print_doc(robot.get_dataframe)
Out[22]:

    def get_dataframe(self, with_geo=True, abspath=False, funcs=None, **kwargs):
        """
        Return a |pandas-DataFrame| with the most important GS results.
        and the filenames as index.

        Args:
            with_geo: True if structure info should be added to the dataframe
            abspath: True if paths in index should be absolute. Default: Relative to getcwd().

        kwargs:
            attrs:
                List of additional attributes of the |GsrFile| to add to the DataFrame.
            funcs: Function or list of functions to execute to add more data to the DataFrame.
                Each function receives a |GsrFile| object and returns a tuple (key, value)
                where key is a string with the name of column and value is the value to be inserted.
        """

It seems complicated but the actual implementation of the callback is just three lines of code:

In [23]:
def hh_dist(gsr):
    """
    This callback receives a GSR file and computes the H-H distance.
    The robot will call this function to compute the H-H distance, 
    and return a (key, value) tuple that will be inserted in the pandas DataFrame.
    """
    cart_coords = gsr.structure.cart_coords
    d = np.linalg.norm(cart_coords[1] - cart_coords[0])
    return "hh_dist", d

with abilab.GsrRobot.from_dir("flow_h2") as robot:
    table = robot.get_dataframe(funcs=hh_dist)
    table = table.sort_values(by="hh_dist")  

As expected, now the table contains a new column with hh_dist in Angstrom:

In [24]:
"hh_dist" in table
Out[24]:
True

Let's print the two columns with the H-H distance and the total energy:

In [25]:
table[["hh_dist", "energy"]]
Out[25]:
hh_dist energy
w0/t0/outdata/out_GSR.nc 0.529177 -28.213374
w0/t1/outdata/out_GSR.nc 0.556959 -28.697681
w0/t2/outdata/out_GSR.nc 0.584741 -29.079272
w0/t3/outdata/out_GSR.nc 0.612523 -29.377087
w0/t4/outdata/out_GSR.nc 0.640304 -29.606392
w0/t5/outdata/out_GSR.nc 0.668086 -29.779514
w0/t6/outdata/out_GSR.nc 0.695868 -29.906409
w0/t7/outdata/out_GSR.nc 0.723650 -29.995127
w0/t8/outdata/out_GSR.nc 0.751432 -30.052173
w0/t9/outdata/out_GSR.nc 0.779213 -30.082807
w0/t10/outdata/out_GSR.nc 0.806995 -30.091282
w0/t11/outdata/out_GSR.nc 0.834777 -30.081045
w0/t12/outdata/out_GSR.nc 0.862559 -30.054897
w0/t13/outdata/out_GSR.nc 0.890341 -30.015128
w0/t14/outdata/out_GSR.nc 0.918122 -29.963621
w0/t15/outdata/out_GSR.nc 0.945904 -29.901949
w0/t16/outdata/out_GSR.nc 0.973686 -29.831440
w0/t17/outdata/out_GSR.nc 1.001468 -29.753241
w0/t18/outdata/out_GSR.nc 1.029250 -29.668361
w0/t19/outdata/out_GSR.nc 1.057031 -29.577707
w0/t20/outdata/out_GSR.nc 1.084813 -29.482112

Note that the energy in our DataFrame is given in eV to facilitate the integration with pymatgen that uses eV for energies and Angstrom for lengths. Let's add another column to our table with energies in Hartree:

In [26]:
table["energy_Ha"] = table["energy"] * abilab.units.eV_to_Ha

and use the plot method of pandas DataFrames to plot energy_Ha vs hh_dist

In [27]:
table.plot(x="hh_dist", y="energy_Ha", style="-o");

At this point, it should be clear that to plot the maximum of the forces as a function of the H-H distance we just need:

In [28]:
table.plot(x="hh_dist", y="max_force", style="-o");

Want to plot the two quantities on the same figure?

In [29]:
table.plot(x="hh_dist", y=["energy_Ha", "max_force"], subplots=True);

Your boss understands the data only if it is formatted inside a $\LaTeX$ tabular environment?

In [30]:
print(table[["hh_dist", "energy"]].to_latex())
\begin{tabular}{lrr}
\toprule
{} &   hh\_dist &     energy \\
\midrule
w0/t0/outdata/out\_GSR.nc  &  0.529177 & -28.213374 \\
w0/t1/outdata/out\_GSR.nc  &  0.556959 & -28.697681 \\
w0/t2/outdata/out\_GSR.nc  &  0.584741 & -29.079272 \\
w0/t3/outdata/out\_GSR.nc  &  0.612523 & -29.377087 \\
w0/t4/outdata/out\_GSR.nc  &  0.640304 & -29.606392 \\
w0/t5/outdata/out\_GSR.nc  &  0.668086 & -29.779514 \\
w0/t6/outdata/out\_GSR.nc  &  0.695868 & -29.906409 \\
w0/t7/outdata/out\_GSR.nc  &  0.723650 & -29.995127 \\
w0/t8/outdata/out\_GSR.nc  &  0.751432 & -30.052173 \\
w0/t9/outdata/out\_GSR.nc  &  0.779213 & -30.082807 \\
w0/t10/outdata/out\_GSR.nc &  0.806995 & -30.091282 \\
w0/t11/outdata/out\_GSR.nc &  0.834777 & -30.081045 \\
w0/t12/outdata/out\_GSR.nc &  0.862559 & -30.054897 \\
w0/t13/outdata/out\_GSR.nc &  0.890341 & -30.015128 \\
w0/t14/outdata/out\_GSR.nc &  0.918122 & -29.963621 \\
w0/t15/outdata/out\_GSR.nc &  0.945904 & -29.901949 \\
w0/t16/outdata/out\_GSR.nc &  0.973686 & -29.831440 \\
w0/t17/outdata/out\_GSR.nc &  1.001468 & -29.753241 \\
w0/t18/outdata/out\_GSR.nc &  1.029250 & -29.668361 \\
w0/t19/outdata/out\_GSR.nc &  1.057031 & -29.577707 \\
w0/t20/outdata/out\_GSR.nc &  1.084813 & -29.482112 \\
\bottomrule
\end{tabular}

Need to send data to Windows users?

In [31]:
#table.to_excel("'output.xlsx")

Want to copy the dataframe to the system clipboard so that one can easily past the data into an other applications e.g. Excel?

In [32]:
#table.to_clipboard()

Analysis of the charge density

[back to top]

The DEN.nc file stores the density in real space on the FFT mesh. A DEN.nc file has a structure, and an ebands object with the electronic eigenvalues/occupations and a Density object with $n(r)$ (numpy array .datar) and $n(G)$ (.datag).

Let's open the file with abiopen and print it:

In [33]:
with abilab.abiopen("flow_h2/w0/t10/outdata/out_DEN.nc") as denfile:
    print(denfile)
    density = denfile.density
================================= File Info =================================
Name: out_DEN.nc
Directory: /Users/gmatteo/git_repos/abitutorials/abitutorials/base1/flow_h2/w0/t10/outdata
Size: 217.43 kb
Access Time: Sun Aug 12 00:14:57 2018
Modification Time: Tue Oct 10 21:27:37 2017
Change Time: Tue Oct 10 21:27:37 2017

================================= Structure =================================
Full Formula (H2)
Reduced Formula: H2
abc   :   5.291772   5.291772   5.291772
angles:  90.000000  90.000000  90.000000
Sites (2)
  #  SP           a    b    c
---  ----  --------  ---  ---
  0  H     -0.07625    0    0
  1  H      0.07625    0    0

Abinit Spacegroup: spgid: 123, num_spatial_symmetries: 16, has_timerev: True, symmorphic: False

============================== Electronic Bands ==============================
Number of electrons: 2.0, Fermi level: -9.658 (eV)
nsppol: 1, nkpt: 1, mband: 1, nspinor: 1, nspden: 1
smearing scheme: none, tsmear_eV: 0.272, occopt: 1
Bandwidth: 0.000 (eV)
Valence maximum located at:
    spin=0, kpt=[+0.000, +0.000, +0.000], weight: 1.000, band=0, eig=-9.658, occ=2.000
XC functional: LDA_XC_TETER93
================================== Density ==================================
Density: nspinor: 1, nsppol: 1, nspden: 1
Mesh3D: nx=30, ny=30, nz=30 
Integrated electronic and magnetization densities in atomic spheres:
      symbol      ntot  rsph_ang           frac_coords
iatom                                                 
0          H  0.134448      0.31  [-0.07625, 0.0, 0.0]
1          H  0.134448      0.31   [0.07625, 0.0, 0.0]
Total magnetization from unit cell integration: 0.0

The simplest thing we can do now is to print $n(r)$ along a line passing through two points specified either in terms of two vectors or two integers defining the site index in our structure. Let's plot the density along the H-H bond by passing the index of the two atoms:

In [34]:
density.plot_line(0, 1);

Great! If we have a netcdf file and AbiPy, we don't need to use cut3d to extract the data from the file and we can do simple plots with matplotlib. Unfortunately, $n(r)$ is a 3D object and the notebook is not the most suitable tool to visualize this kind of dataset. Fortunately there are several graphical applications to visualize 3D fields in crystalline environments and AbiPy provides tools to export the data from netcdf to the text format supported by the external graphical tool.

For example, one can use:

In [35]:
#density.visualize("vesta")

to visualize density isosurfaces of our system:

Conclusions

[back to top]

To summarize, we learned how to define python functions that can be used to generate many input files easily. We briefly discussed how to use these inputs to build a basic AbiPy flow without dependencies. More importantly, we showed that AbiPy provides several tools that can be used to inspect and analyze the results without having to pass necessarly through the creation and execution of the Flow. Last but not least, we discussed how to use robots to collect results from the output files and store them in pandas DataFrames

AbiPy users are strongly recommended to familiarize themself with this kind of interface before moving to more advanced features such as the flow execution that requires a good understanding of the python language. As a matter of fact, we decided to write AbiPy in python not for efficiency reasons (actually python is usually slower that Fortran/C) but because there are tons of libraries for scientific applications (numpy, scipy, pandas, matplotlib, jupyter, etc). If you learn to use these great libraries for your work you can really boost your productivity and save a lot of time.

A logical next lesson would be the the tutorial about the ground-state properties of silicon

Back to the main Index

In [ ]: