These exercises provide a short look at the "numerical" and "computational" challenges of density-functional theory (DFT). For the work here we will use the density-functional toolkit (DFTK), a small Julia code for plane-wave DFT simulations.
Some modification of Julia code is required to do the exercises, but apart from that you can do the analysis of your numerical results in any form you want. Naturally, though, this would be a good opportunity to learn some Julia ;).
Some help in case you get stuck and feel you would like to learn more:
For installation instructions on the packages required to run these exercises and some tips on using Julia on the RWTH Jupyter lab setup, see 1_Installation.ipynb.
By default Julia uses only a single thread, but you can also create a Jupyter kernel for multithreaded Julia. It is highly recommended you do this for the exercises; see 1_Installation.ipynb for details.
Do this every time you start the notebook to initialise the threading inside DFTK:
# Setup DFTK
using DFTK
setup_threading(n_blas=1)
It was discussed in the lecture that plane-wave DFT codes usually employ so-called pseudopotentials (PSPs) to model the interaction of electrons and nuclei. Replacing the true Coulombic interaction by a PSP is an approximation. Multiple types of PSPs exist and from a mathematical perspective little is understood about the errors introduced by PSPs. From a physical point of view, however, using PSPs is reasonable, as it only affects the electron density close to the nuclei, which for most phenomena plays only a minor role. On the upside, PSPs turn out to make plane-wave calculations much more feasible.
This aspect we will investigate numerically in this exercise. Our goal will be to determine the energy of a single neon atom using the so-called LDA approximation up to an absolute error of 0.1. We will do this by an explicit convergence study, i.e. by increasing the basis size and improving the numerics until we are sure we have found the energy to this tolerance.
Our computational setup is simple: We will put a single neon atom into a cubic box of length $a$. As DFTK uses periodic boundary conditions, this atom interacts spuriously with the neighbouring cells, introducing an error to our calculation. As $a \to \infty$ this error disappears, thus in principle convergence with increasing $a$ should be one parameter to study. For simplicity (as large values of $a$ lead to very costly calculations) we will ignore this aspect here.
The convergence parameter we will not ignore, however, is the $E_\text{cut}$ parameter discussed in the lecture. This parameter determines the basis set size, thus the accuracy of the calculation. Again calculations get better for $E_\text{cut} \to \infty$.
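To get a feeling for what $E_\text{cut}$ controls: the basis contains all plane waves $e^{iG\cdot r}$ whose kinetic energy $\frac12 |G|^2$ lies below $E_\text{cut}$, so the basis size grows roughly like $E_\text{cut}^{3/2}$. The following self-contained counting sketch illustrates this for a cubic box (a simplification: DFTK's actual basis construction additionally involves the $k$-points):

```julia
# Count plane waves e^{iG·r} with G = 2π/a · n, n ∈ ℤ³, whose kinetic
# energy |G|²/2 lies below Ecut. This is only a counting sketch for a
# cubic box of side a, not DFTK's exact basis construction.
function count_planewaves(Ecut, a)
    nmax = ceil(Int, a * sqrt(2Ecut) / (2π))  # largest |n| that can qualify
    count = 0
    for i in -nmax:nmax, j in -nmax:nmax, k in -nmax:nmax
        abs2(2π / a) * (i^2 + j^2 + k^2) / 2 ≤ Ecut && (count += 1)
    end
    count
end

for Ecut in (5, 10, 20, 40)
    println("Ecut = $Ecut: $(count_planewaves(Ecut, 10.0)) plane waves")
end
```

Doubling $E_\text{cut}$ thus increases the basis size (and the cost of each SCF step) by roughly a factor of $2^{3/2} \approx 2.8$.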
In the language of DFTK the calculation we want to perform is:
using LinearAlgebra
# Numerical parameters
#
Ecut = 20 # Plane-wave basis cutoff
# Use a HGH pseudopotential:
Ne = ElementPsp(:Ne, psp=load_psp("hgh/lda/ne-q8"))
# Use the true Coulomb interaction:
# Ne = ElementCoulomb(:Ne)
#
# End numerical parameters
# Setup Cubic box and place neon atom in the middle
a = 10 # Box size
lattice = a * Matrix(I, 3, 3)
atoms = [Ne]
positions = [[0.5, 0.5, 0.5]]
# Use LDA model, discretise and run SCF algorithm
model = model_LDA(lattice, atoms, positions)
basis = PlaneWaveBasis(model; Ecut, kgrid=[1, 1, 1])
scfres = self_consistent_field(basis)
Etot = scfres.energies.total
# Print total energy
println("\nTotal DFT energy is $Etot")
Warning: DFT calculations can be both time- and memory-consuming. When doing these convergence studies, therefore, start with small values for Ecut (like the ones shown here) and increase slowly, especially if you are running these on a laptop with 8GB of main memory or less. You are not expected to re-compute the provided reference results.
(a) First stick with the pseudopotential (PSP) version of the neon atom (ElementPsp). Converge the energy to an absolute error of 0.1. The easiest way to do this is to run multiple calculations for different values of Ecut and to plot the error against a reference on a semilog scale. A good reference is at Ecut = 600, where the DFT energy is -34.85376.
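One possible way to organise this convergence study is sketched below. This is only a sketch: the Ecut range shown is a starting point, and it assumes the Plots.jl package is installed.

```julia
using DFTK
using LinearAlgebra
using Plots

Eref = -34.85376   # reference energy at Ecut = 600 (given above)

# Same setup as in the calculation shown earlier
Ne = ElementPsp(:Ne, psp=load_psp("hgh/lda/ne-q8"))
a  = 10
lattice   = a * Matrix(I, 3, 3)
positions = [[0.5, 0.5, 0.5]]
model     = model_LDA(lattice, [Ne], positions)

Ecuts  = 20:10:80   # extend this range as your hardware allows
errors = map(Ecuts) do Ecut
    basis = PlaneWaveBasis(model; Ecut, kgrid=[1, 1, 1])
    abs(self_consistent_field(basis).energies.total - Eref)
end

plot(Ecuts, errors; yaxis=:log, mark=:x, label="PSP neon",
     xlabel="Ecut", ylabel="absolute error")
hline!([0.1]; label="target tolerance", linestyle=:dash)
```

The cutoff at which the error curve drops below the dashed line is the answer to part (a).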
Hint: If you want to do the plotting within Julia, take a look at the Plots.jl documentation for some nice examples. Another good plotting package is PyPlot.jl, which has exactly the same interface as the matplotlib Python package.
(b) Repeat the exercise for the all-electron case (ElementCoulomb). Here the reference result at Ecut = 600 is -123.5983. Explore the convergence all the way up to Ecut = 100. What differences in the obtained total energy as well as in the convergence with Ecut do you observe? Suggest an explanation, keeping in mind that part of the idea of the PSP is to avoid the explicit treatment of the core electrons.
As we discussed in 5_Solving_the_SCF_problem.ipynb the self-consistent field procedure required to solve the DFT problem can be written as a fixed-point problem $$ F(\rho) = \rho $$ where $F$ represents the basic SCF step. That is the construction of the Kohn-Sham Hamiltonian $H(\rho)$ given the density $\rho$, followed by its diagonalisation to obtain its eigenpairs $(\varepsilon_{ki}, \psi_{ki})$ and from these a new density $$ \rho(r) = \sum_{k\in\text{BZ}} \sum_i f_{\varepsilon_F}(\varepsilon_{ki}) \, \psi_{ki}(r) \, \psi_{ki}^\ast(r)$$ with the Fermi level $\varepsilon_F$ chosen to conserve the number of electrons.
In this exercise we will take the function $F$ for granted (i.e. delivered by DFTK) and we will investigate multiple algorithms to find the fixed-point density satisfying $F(\rho) = \rho$. Our setting is defined by the following function, which builds and discretises an LDA model for an aluminium supercell; see 5_Solving_the_SCF_problem.ipynb for details.
using DFTK
using LinearAlgebra
function aluminium_setup_ex2(repeat=1; Ecut=13.0, kgrid=[2, 2, 2])
a = 7.65339
lattice = a * Matrix(I, 3, 3)
Al = ElementPsp(:Al, psp=load_psp("hgh/lda/al-q3"))
atoms = [Al, Al, Al, Al]
positions = [[0.0, 0.0, 0.0], [0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]]
# Make supercell
supercell = ase_atoms(lattice, atoms, positions) * (repeat, 1, 1)
lattice = load_lattice(supercell)
positions = load_positions(supercell)
atoms = fill(Al, length(positions))
# Construct an LDA model and discretise
model = model_LDA(lattice, atoms, positions; temperature=1e-3, symmetries=false)
PlaneWaveBasis(model; Ecut, kgrid)
end
(a) Fixed-point iterations. The easiest way to solve $F(\rho) = \rho$ are fixed-point iterations, i.e. $$ \rho_{n+1} = F(\rho_n), $$ starting from a hopefully good initial guess $\rho_0$. DFTK automatically provides a reasonable guess density, such that we only need to take care of the iterations themselves. In the language of DFTK this algorithm is written as:
function fixed_point_iteration(F, ρ₀, maxiter; tol)
# F: The SCF step function
# ρ₀: The initial guess density
# maxiter: The maximal number of iterations to be performed
# tol: The selected convergence tolerance
ρ = ρ₀
Fρ = F(ρ)
for n = 1:maxiter
# If change less than tolerance, break iterations:
if norm(Fρ - ρ) < tol
break
end
ρ = Fρ
Fρ = F(ρ)
end
# Return some stuff DFTK needs ...
(fixpoint=ρ, converged=norm(Fρ-ρ) < tol)
end;
# Use this algorithm inside DFTK's SCF for solving the aluminium problem
# (the other parameters are needed to overwrite some DFTK defaults we don't want to use yet).
self_consistent_field(aluminium_setup_ex2(1); solver=fixed_point_iteration, damping=1.0,
maxiter=40, mixing=SimpleMixing());
As can be observed this algorithm is not very good and in fact even fails to converge, even though we are only looking at a very simple system.
This is a known limitation of this algorithm, which is why it is not used in practice. Instead one introduces a so-called damping parameter $\alpha$, which is given a value between $0$ and $1$. One now iterates as follows: $$ \rho_{n+1} = \rho_{n} + \alpha (F(\rho_n) - \rho_n) $$ In other words the update $F(\rho_n) - \rho_n$ proposed in the $n$-th SCF step is not fully taken, but scaled-down by the damping $\alpha$.
Modify fixed_point_iteration such that it supports this damped fixed-point iteration. In other words, implement the damping inside your algorithm and not by changing the damping parameter of the self_consistent_field function driving the SCF.
Using your algorithm try different values for $\alpha$ between $0$ and $1$ and estimate roughly the $\alpha$ which gives fastest convergence. For which $\alpha$ do you observe no convergence at all?
Remark: The observations you make here are general. One can prove that every SCF converges (locally) if a small enough $\alpha > 0$ is chosen.
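The effect of damping can already be seen on a scalar toy problem (nothing DFT-specific here, and the map is purely illustrative): the hypothetical map $F(x) = 10 - 1.8x$ has the fixed point $x^\ast = 10/2.8$, but the plain iteration $x \leftarrow F(x)$ diverges since $|F'(x)| = 1.8 > 1$. A damped iteration with sufficiently small $\alpha$ restores convergence.

```julia
# Damped fixed-point iteration on a scalar toy map (illustration only).
function damped_fixed_point(F, x0, α; maxiter=100, tol=1e-10)
    x = x0
    for n in 1:maxiter
        Δ = F(x) - x                 # proposed update of this step
        abs(Δ) < tol && return (x=x, converged=true, iterations=n)
        x += α * Δ                   # take only a fraction α of the update
    end
    (x=x, converged=false, iterations=maxiter)
end

F(x) = 10 - 1.8x
println(damped_fixed_point(F, 0.0, 1.0))  # α = 1: diverges
println(damped_fixed_point(F, 0.0, 0.5))  # α = 0.5: converges to 10/2.8
```

For this linear map the update is multiplied by $1 - 2.8\alpha$ each step, so convergence requires $0 < \alpha < 2/2.8 \approx 0.71$; the same kind of threshold is what you should observe for the SCF.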
(b) Anderson acceleration. The fixed_point_iteration function above (with the damping extension) actually already covers the main gist of the DFT algorithms used in practice. One additional idea to make things converge faster is Anderson acceleration, where not only $\rho_n$ and $F(\rho_n)$, but also older iterates are used to propose the next density.
For Anderson one exploits that the update $R(\rho) = F(\rho) - \rho$ is also the residual of the fixed-point problem $F(\rho) = \rho$, i.e. how far away we are from the fixed-point density. A good next density $\rho_{n+1}$ therefore should be found by minimising an approximation for $R(\rho_{n+1})$. Assuming the SCF was linear in the density (which it is not), a good idea is to find a linear combination of residuals $$\min_{\beta_i} \left\| \sum_i \beta_i R(\rho_i) \right\|^2$$ which has the smallest possible norm and to use these coefficients $\beta_i$ to extrapolate the next density $$ \rho_{n+1} = \sum_i \beta_i (\rho_i + \alpha R(\rho_i)) $$ where you notice the "standard" damped fixed-point iteration in the summed terms.
In terms of an algorithm this is
function anderson_iteration(F, ρ₀, maxiter; tol)
# F: The SCF step function
# ρ₀: The initial guess density
# maxiter: The maximal number of iterations to be performed
# tol: The selected convergence tolerance
converged = false
ρ = ρ₀
ρs = []
Rs = []
for n = 1:maxiter
Fρ = F(ρ)
Rρ = Fρ - ρ
converged = norm(Rρ) < tol
converged && break
ρnext = vec(ρ) .+ vec(Rρ)
if !isempty(Rs)
M = hcat(Rs...) .- vec(Rρ)
βs = -(M \ vec(Rρ))
for (iβ, β) in enumerate(βs)
ρnext .+= β .* (ρs[iβ] .- vec(ρ) .+ Rs[iβ] .- vec(Rρ))
end
end
push!(ρs, vec(ρ))
push!(Rs, vec(Rρ))
ρ = reshape(ρnext, size(ρ₀)...)
end
# Return some stuff DFTK needs ...
(fixpoint=ρ, converged=converged)
end
To work with this algorithm we will use DFTK's intrinsic mechanism to choose a damping. The syntax for this is
repeat = 1
self_consistent_field(aluminium_setup_ex2(repeat);
solver=anderson_iteration,
damping=0.8, maxiter=40,
mixing=SimpleMixing());
to choose a damping of $\alpha = 0.8$ and run for at most maxiter iterations.
(i) Based on this Anderson implementation, verify (by making a few experiments) that the algorithm converges for repeat=1 for any $0 < \alpha \leq 2$. You may now use the damping parameter to change the value of $\alpha$ used by the SCF. State the number of iterations and runtimes you observe.
(ii) Pick $\alpha = 0.8$ and make the problem harder by increasing repeat (e.g. 2, 4, 6, 8). Can you make Anderson fail to converge? What do you notice in terms of the number of iterations and runtimes?
(c) Mixing methods. Anderson allows us to push the boundary for SCF methods, but for larger or more challenging systems it is not fully sufficient. The next ingredient for a stable SCF procedure is based on the insight that the convergence properties of an SCF provably depend on the dielectric properties of the material being simulated. Amongst other things this means that insulators (like glass), semiconductors (like silicon) and metals (like aluminium) show rather different SCF behaviours. As a result the ideal SCF procedure should be slightly different for each material.
The standard approach to include material specificity into the SCF is to employ preconditioned damped fixed-point iterations. To explain the idea, let us consider again a framework without Anderson acceleration. Preconditioned iterations are then $$ \rho_{n+1} = \rho_{n} + \alpha P^{-1} (F(\rho_n) - \rho_n), $$ where $P^{-1}$ is a preconditioner that hopefully improves convergence. To re-introduce Anderson around this iteration, just replace the previous definition of $R$ by $R(\rho) = P^{-1} (F(\rho) - \rho)$.
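The benefit of a preconditioner can be illustrated on a toy linear problem (an illustration only, not DFT): solving $Ax = b$ by the damped iteration $x \leftarrow x + \alpha P^{-1}(b - Ax)$, an ill-conditioned $A$ forces a tiny damping and many iterations when $P = I$, while a good $P \approx A$ converges almost immediately.

```julia
using LinearAlgebra

# Damped, preconditioned iteration for A*x = b; the residual b - A*x
# plays the role of F(ρ) - ρ in the SCF.
function preconditioned_iteration(A, b, Pinv, α; maxiter=2000, tol=1e-10)
    x = zero(b)
    for n in 1:maxiter
        r = b - A * x
        norm(r) < tol && return (x=x, iterations=n)
        x += α * (Pinv * r)
    end
    (x=x, iterations=maxiter)
end

A = Diagonal([1.0, 100.0])  # condition number κ = 100
b = [1.0, 1.0]

n_plain  = preconditioned_iteration(A, b, I,      2/101).iterations  # best α with P = I
n_precon = preconditioned_iteration(A, b, inv(A), 1.0  ).iterations  # P = A
println("P = I: $n_plain iterations;  P = A: $n_precon iterations")
```

With $P = A$ the effective condition number drops to 1 and the iteration terminates essentially in one step; the Kerker preconditioner plays the analogous role for the metallic SCF below.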
Finding a good preconditioner $P$ is not always easy and for some cases satisfactory options are not yet known. For our aluminium case, however, we are lucky: the KerkerMixing implemented in DFTK provides a good $P$ for aluminium.
You might wonder about the term mixing. In an interdisciplinary community like DFT, different scientists use different vocabulary and "mixing" is the physicists' term used for preconditioning.
To use KerkerMixing with DFTK run the SCF as follows:
self_consistent_field(basis; damping=0.8, mixing=KerkerMixing());
Try this setup for different values of repeat and check the number of iterations needed. Other mixings DFTK has to offer are DielectricMixing (best for semiconductors), SimpleMixing (which is $P = I$, i.e. no preconditioner at all, best for insulators) and LdosMixing (self-adapting, suitable for metals, insulators and inhomogeneous mixtures alike). Note that LdosMixing is the default in DFTK (i.e. used if the mixing parameter is not supplied to self_consistent_field). Try these mixings (SimpleMixing, DielectricMixing, LdosMixing and KerkerMixing) and summarise your findings.
You should notice that choosing a preconditioner matching the material under study aids fast SCF convergence, but that sometimes a mismatch does not seem to do much harm in our case. For larger values of repeat (beyond what you can probably afford on your laptop) this is no longer true and one needs to be very careful in selecting the right preconditioner. See for example the investigation in this paper.
The goal of this exercise is to explain the differing convergence behaviour of SCF algorithms depending on the choice of the preconditioner $P^{-1}$ and the underlying material.
For this we will find the largest and smallest eigenvalue of $(P^{-1} \varepsilon^\dagger)$ and $\varepsilon^\dagger$, where $\varepsilon^\dagger$ is the dielectric operator (see 6_Analysing_SCF_convergence.ipynb). The ratio of largest to smallest eigenvalue is the condition number $$ \kappa = \frac{\lambda_\text{max}}{\lambda_\text{min}},$$ which can be related to the rate of convergence. The smaller the condition number, the faster the convergence.
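For a rough quantitative picture, a standard result for linear fixed-point iterations (used here only as a heuristic for the SCF) says that with optimal damping the error contracts by a factor $(\kappa - 1)/(\kappa + 1)$ per step, so the iteration count to reach a fixed tolerance grows roughly linearly with $\kappa$:

```julia
# Heuristic from linear fixed-point theory (a sketch, not a DFTK routine):
# with optimal damping the error contracts by (κ - 1)/(κ + 1) per step.
convergence_factor(κ) = (κ - 1) / (κ + 1)
iterations_needed(κ; tol=1e-8) = ceil(Int, log(tol) / log(convergence_factor(κ)))

for κ in (1.2, 10, 100)
    println("κ = $κ: factor ≈ $(round(convergence_factor(κ); digits=3)), " *
            "≈ $(iterations_needed(κ)) iterations to reach 1e-8")
end
```

Keep this heuristic in mind when comparing the condition numbers you compute below with the observed SCF iteration counts.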
(a) Aluminium metal. We start by taking a look at a slightly cruder (thus computationally cheaper) version of our aluminium setup from above:
using DFTK
using LinearAlgebra
function aluminium_setup_ex3(repeat=1; Ecut=7.0, kgrid=[1, 1, 1])
a = 7.65339
lattice = diagm(fill(a, 3))
Al = ElementPsp(:Al, psp=load_psp("hgh/lda/al-q3"))
atoms = [Al, Al, Al, Al]
positions = [[0.0, 0.0, 0.0], [0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]]
# Make supercell
supercell = ase_atoms(lattice, atoms, positions) * (repeat, 1, 1)
lattice = load_lattice(supercell)
positions = load_positions(supercell)
atoms = fill(Al, length(positions))
# Construct the model
model = model_LDA(lattice, atoms, positions, temperature=1e-3, symmetries=false)
PlaneWaveBasis(model; Ecut, kgrid)
end
Additionally we will need the function
function construct_Pinv_epsilon(scfres)
basis = scfres.basis
Pinv_Kerker(δρ) = DFTK.mix_density(KerkerMixing(), basis, δρ)
function epsilon(δρ) # Apply ε† = 1 - χ0 Khxc
δV = apply_kernel(basis, δρ; ρ=scfres.ρ)
χ0δV = apply_χ0(scfres.ham, scfres.ψ, scfres.εF, scfres.eigenvalues, δV)
δρ - χ0δV
end
epsilon, Pinv_Kerker
end
to construct functions representing $\varepsilon^\dagger$ and the Kerker preconditioner $P^{-1}$.
(i) Find the largest eigenvalue of $\varepsilon^\dagger$ for this aluminium case using KrylovKit. For this use the following code snippet:
using KrylovKit
scfres = self_consistent_field(aluminium_setup_ex3(3); tol=1e-12);
epsilon, Pinv_Kerker = construct_Pinv_epsilon(scfres)
λ_large, X_large, info = eigsolve(epsilon, randn(size(scfres.ρ)), 4, :LM;
tol=1e-4, eager=true, verbosity=2)
@assert info.converged ≥ 4
λ_max = maximum(real.(λ_large))
println("Largest eigenvalue: $(λ_max)")
You can assume the smallest eigenvalue is $\lambda_\text{min} = 0.952$. What is the condition number in this case?
(ii) Find the largest eigenvalue for the SCF of the aluminium supercell (repeat=3) in case the Kerker preconditioner is used. Hint: You can construct the operator $P^{-1} \varepsilon^\dagger$ by simply chaining the functions (Pinv_Kerker ∘ epsilon). Assuming that the smallest eigenvalue is about $0.8$, what is the condition number now?
(b) Helium chain (insulator). We will use the following setup:
using DFTK
using LinearAlgebra
function helium_setup(repeat=30; Ecut=7.0, kgrid=[1, 1, 1])
a = 5.0
lattice = a * Matrix(I, 3, 3)
He = ElementPsp(:He, psp=load_psp("hgh/lda/he-q2"))
atoms = [He, ]
positions = [[0.0, 0.0, 0.0], ]
# Make supercell
supercell = ase_atoms(lattice, atoms, positions) * (repeat, 1, 1)
lattice = load_lattice(supercell)
positions = load_positions(supercell)
atoms = fill(He, length(positions))
# Construct the model
model = model_LDA(lattice, atoms, positions, temperature=1e-3, symmetries=false)
PlaneWaveBasis(model; Ecut, kgrid)
end
Repeat the analysis from (a) for a helium chain with repeat=30. To find the smallest and largest eigenvalues of $\varepsilon^\dagger$ and $P^{-1} \varepsilon^\dagger$ use
using KrylovKit
scfres = self_consistent_field(helium_setup(30); tol=1e-12);
epsilon, Pinv_Kerker = construct_Pinv_epsilon(scfres)
operator = epsilon
λ_large, X_large, info = eigsolve(operator, randn(size(scfres.ρ)), 2, :LM;
tol=1e-3, eager=true, verbosity=2)
@assert info.converged ≥ 2
λ_max = maximum(real.(λ_large))
λ_small, X_small, info = eigsolve(operator, randn(size(scfres.ρ)), 2, EigSorter(abs, rev=false);
tol=1e-3, eager=true, verbosity=2)
@assert info.converged ≥ 2
λ_min = minimum(real.(λ_small))
println("Smallest eigenvalue: $(λ_min)")
println("Largest eigenvalue: $(λ_max)")
println("Condition number: $(λ_max / λ_min)")
Then run the two SCFs with and without Kerker preconditioning, that is
scfres = self_consistent_field(helium_setup(30); tol=1e-12, mixing=SimpleMixing());
as well as
scfres = self_consistent_field(helium_setup(30); tol=1e-12, mixing=KerkerMixing());
and explain the observations with respect to convergence, taking your findings on the eigenvalues of $\varepsilon^\dagger$ and $P^{-1} \varepsilon^\dagger$ into account.
Only a few compounds exhibit natural permanent magnetism. One of these is iron, while most (like silicon) are not magnetic. This exercise tries to provide a little insight into how one could understand, using DFT simulations, why this is the case.
The key assumption in this exercise will be that DFT is a realistic model and that the SCF therefore finds a good approximation to the true ground state of a compound. If this ground state turns out to be magnetic, the compound should therefore feature permanent magnetism.
We use this computational setup to simulate silicon:
using DFTK
a = 10.26
lattice = a / 2 * [[0 1 1.];
[1 0 1.];
[1 1 0.]]
Si = ElementPsp(:Si, psp=load_psp("hgh/lda/si-q4"))
atoms = [Si, Si]
positions = [-ones(3)/8, ones(3)/8]
# Guess some initial magnetic moments
# (Need to be != 0 otherwise SCF makes the assumption that the ground state is not magnetic
# to gain some performance ...)
magnetic_moments = [2, 2]
model = model_LDA(lattice, atoms, positions; temperature=0.01, magnetic_moments)
basis = PlaneWaveBasis(model; Ecut=13, kgrid=[2, 2, 2]);
ρ0 = guess_density(basis, magnetic_moments)
scfres = self_consistent_field(basis; ρ=ρ0);
scfres.energies.total
Even though we have forced some magnetism into the initial density guess, this magnetisation (indicated by Magnet in the SCF printout) disappears as the SCF converges. Therefore silicon appears to have a non-magnetic ground state.
(i) Repeat the calculation for iron. In this case the system setup is
a = 5.42352 # iron lattice constant in bohr
lattice = a / 2 * [[-1 1 1];
[ 1 -1 1];
[ 1 1 -1]]
Fe = ElementPsp(:Fe, psp=load_psp("hgh/lda/Fe-q8.hgh"))
atoms = [Fe]
positions = [zeros(3)]
magnetic_moments = [3]
otherwise use the same settings as in the silicon calculation. Based on this calculation, what would you conclude?
(ii) As it turns out, too small basis sets and too coarse Brillouin-zone samplings (too small kgrids) are not able to support magnetic ground states. Repeat both the silicon and the iron calculations for different values of Ecut and kgrid (i.e. [1,1,1], [3,3,3], [4,4,4] etc.), where in both cases larger means better. Play with these parameters to determine the first two digits of the ground-state energy of silicon and iron. Based on these numerical parameters, what do you conclude now?
(iii) To show that a non-magnetic iron structure is actually higher in energy, re-run the iron calculation with the Ecut and kgrid parameters determined in (ii), but this time set magnetic_moments = [0]. This forces the SCF to converge to a non-magnetic ground state, which is why the magnetisation Magnet is no longer printed ... it is 0 by assumption. What is the energy difference between this non-magnetic iron ground state and the ground state you determined in (ii)?