import numpy as np
import scipy.sparse as sp
import matplotlib.pyplot as plt
from SimPEG import Mesh, Utils, Solver
from scipy.constants import mu_0, epsilon_0
%matplotlib inline
With SimPEG's mesh class, we discretize the sensitivity function for a 1D magnetotelluric problem. Rather than generating the full sensitivity matrix, we compute the forward (Jvec) and adjoint (Jtvec) functionals that evaluate matrix-vector products. There are some milestones to accomplish:
Compute the forward and adjoint sensitivity functions: Jvec and Jtvec
Test Jvec and Jtvec: order test and adjoint test
We define physical properties at cell centers and stagger the electric and magnetic fields. Using a finite difference approach to define the operators gives us the discrete system of equations
$$ \underbrace{ \begin{bmatrix} \mathbf{Grad} & \imath \omega \mathbf{M}^{f}_{\mu} \\[0.3em] \mathbf{M}^{cc}_{\boldsymbol{\sigma}} & \mathbf{Div} \\[0.3em] \end{bmatrix} }_{\mathbf{A}} \underbrace{ \begin{bmatrix} \mathbf{e_x} \\[0.3em] \mathbf{h_y} \\[0.3em] \end{bmatrix} }_{\mathbf{u}} = \underbrace{ \begin{bmatrix} - \mathbf{B}\mathbf{e_x}^{BC} \\[0.3em] \boldsymbol{0} \\[0.3em] \end{bmatrix} }_{\mathbf{rhs}} $$with
$\mathbf{e_x}$: Discrete $E_x$, on cell centers $[\text{nC} \times 1]$
$\mathbf{h_y}$: Discrete $H_y$, on cell faces $[(\text{nC}+1) \times 1]$
$ \mathbf{Grad}$: Discrete gradient operator $[(\text{nC}+1) \times \text{nC}]$
$ \mathbf{Div}$: Discrete divergence operator $[\text{nC} \times (\text{nC}+1)]$
$\mathbf{M}^{f}_{\boldsymbol{\mu}} = \mathbf{diag}(\mathbf{Av^{cc2f}} \boldsymbol{\mu})$ $[(\text{nC}+1) \times (\text{nC}+1)]$
$\mathbf{M}^{cc}_{\boldsymbol{\sigma}} = \mathbf{diag}(\boldsymbol{\sigma})$ $[\text{nC} \times \text{nC}]$. Here we are using the quasi-static assumption for brevity.
$\mathbf{B} \mathbf{e_x}^{BC}$ handles the boundary conditions
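These dimensions can be sanity-checked with a minimal numpy/scipy sketch of the staggered-grid operators on a uniform 1D grid. This is independent of SimPEG and simplifies the boundary faces (it ignores the half-cell spacing that a cell-centered Dirichlet condition introduces), so it is an illustration of the shapes, not the exact operators used below:

```python
import numpy as np
import scipy.sparse as sp

nC, h = 5, 1.0  # hypothetical: 5 cells of unit width

# Div maps face values to cell centers: shape (nC, nC+1)
Div = sp.diags([-np.ones(nC), np.ones(nC)], [0, 1], shape=(nC, nC + 1)) / h

# Grad (with simplified homogeneous Dirichlet BCs) maps cell centers to faces:
# shape (nC+1, nC)
Grad = sp.diags([-np.ones(nC), np.ones(nC)], [-1, 0], shape=(nC + 1, nC)) / h

assert Div.shape == (nC, nC + 1)
assert Grad.shape == (nC + 1, nC)
# on this uniform grid the two operators are negative transposes of each other
assert np.allclose(Grad.toarray(), -Div.T.toarray())
```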
The measured data in general can be defined as:
$$ \mathbf{d} = P(\mathbf{u}) $$where $P(\cdot)$ is an evaluation functional which takes a solution vector $\mathbf{u}$ and outputs data at a receiver location.
Here, we use impedance data (one could also consider using apparent resistivity and phase). The impedance is complex, so we treat its real and imaginary components as two separate data points
$$ Z_{xy} = -\frac{E_x}{H_y} = \text{Re}[Z_{xy}] + i ~\text{Im}[Z_{xy}] $$The impedance $Z_{xy}$ can be evaluated from the solution vector $\mathbf{u}$. We will evaluate data at $z=0m$. The solution vector we obtain from the forward simulation is:
$$ \mathbf{u} = \begin{bmatrix} \mathbf{e_x} \\[0.3em] \mathbf{h_y} \\[0.3em] \end{bmatrix} $$At the surface, we specified the boundary condition $E_x(z=0) = 1$. So what we need $P(\cdot)$ to accomplish is $$ Z_{xy}\big|_{z=0} = -\frac{1}{h_y(z=0)} $$
Thus, $P(\cdot)$ can be defined as an interpolation matrix that simply extracts the value of $h_y$ at the surface. We denote this matrix $\mathbf{P}_{0}$. (Thinking in terms of matrices is very helpful when we get to the step of taking derivatives!)
$$\mathbf{d} = Z_{xy} = - \mathbf{P}_{0}\left(\frac{1}{\mathbf{u}}\right) $$From the complex-valued $Z_{xy}$, we compute the real and imaginary parts, and the data can be defined as:
$$ \mathbf{d} = \begin{bmatrix} \text{Re}[Z_{xy}] \\[0.3em] \text{Im}[Z_{xy}] \\[0.3em] \end{bmatrix} $$We will set up an example and go through the steps to compute a datum.
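First, a tiny standalone sketch (with a hypothetical impedance value) of how a complex $Z_{xy}$ becomes two real data, and how the complex value is recovered from them:

```python
import numpy as np

Zxy = 0.196 + 0.202j  # a hypothetical complex impedance value

# treat the real and imaginary components as two separate data
d = np.r_[Zxy.real, Zxy.imag]
assert d.shape == (2,)

# the complex impedance is recoverable from the stacked data vector
assert np.isclose(d[0] + 1j * d[1], Zxy)
```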
def skin_depth(sigma, f):
    """
    Depth at which the fields propagating through a homogeneous medium
    have decayed by a factor of 1/e for a given frequency, f, and conductivity, sigma
    """
    return 500./np.sqrt(sigma * f)
rho_half = 100. # Resistivity of the halfspace in Ohm-m
sigma_half = 1./rho_half # Conductivity is the inverse of resistivity
frequency = np.logspace(-3, 2, 25) # frequencies at which to simulate the MT problem
skin_depth_min = skin_depth(sigma_half, frequency.max())
skin_depth_max = skin_depth(sigma_half, frequency.min())
print("The minimum skin depth is {:1.2f}m".format(skin_depth_min))
print("The maximum skin depth is {:1.2e}m".format(skin_depth_max))
The minimum skin depth is 500.00m The maximum skin depth is 1.58e+05m
cs = skin_depth_min / 4.
core_extent = 5000.
domain_extent = 2 * skin_depth_max
print("The smallest cell size is {:1.2f}m".format(cs))
print("The core region of the mesh extends {:1.2e}m".format(core_extent))
print("The mesh should extend at least {:1.2e}m".format(domain_extent))
The smallest cell size is 125.00m The core region of the mesh extends 5.00e+03m The mesh should extend at least 3.16e+05m
Add padding to extend sufficiently far
npad = 1 # start with 1 cell
padding_fact = 1.3 # the amount by which we will expand each cell of the padding
def padding_extent(npad):
    """
    given a number of padding cells, this computes how far the padding extends
    """
    padding_widths = cs*padding_fact**(np.arange(npad) + 1)
    return padding_widths.sum()
# keep adding padding until we are beyond the desired extent
padding_z = padding_extent(npad)
while padding_z < domain_extent:
    npad += 1
    padding_z = padding_extent(npad)

print("{:1.0f} padding cells extends {:1.2e}m > {:1.2e}m (2 skin depths)".format(
    npad, padding_extent(npad), domain_extent
))
25 padding cells extends 3.82e+05m > 3.16e+05m (2 skin depths)
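Since the padding widths form a geometric series, the loop result can be cross-checked against the closed form $\sum_{k=1}^{n} cs \cdot r^k = cs\, r\, (r^n - 1)/(r - 1)$. A quick standalone check using the values above:

```python
import numpy as np

cs, r = 125.0, 1.3  # smallest cell size and padding factor from above

def padding_extent(npad):
    # widths of the padding cells: cs * r, cs * r**2, ..., cs * r**npad
    return (cs * r ** (np.arange(npad) + 1)).sum()

n = 25
closed_form = cs * r * (r ** n - 1) / (r - 1)
assert np.isclose(padding_extent(n), closed_form)
print("{:1.2e}".format(closed_form))  # 3.82e+05, matching the loop above
```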
ncz = int(np.ceil(core_extent / cs)) # number of cells in the core domain
hz = [(cs, npad, -padding_fact), (cs, ncz)] # define how to construct the cell widths
mesh = Mesh.TensorMesh([hz], x0='N') # construct a 1D Tensor Mesh
print("There are {:1.0f} cells in the core region. The mesh extends {:1.2e}m".format(
    ncz, mesh.hx.sum()
))
sigma = np.ones(mesh.nC) * sigma_half
There are 40 cells in the core region. The mesh extends 3.87e+05m
The forward simulation function dpred takes a conductivity model (nC x 1) and a frequency (1 x 1), and outputs the real and imaginary parts of the impedance $Z_{xy}$ (2 x 1). By solving $\mathbf{A} \mathbf{u}=\mathbf{rhs}$, we compute the solution vector $\mathbf{u}$, then evaluate the real and imaginary parts of the impedance at $z=0$ ($\mathbf{d} = P(\mathbf{u})$).
# Projection Matrix
P0 = sp.csr_matrix(
(np.r_[1.], (np.r_[0], np.r_[mesh.nF+mesh.nC-1])), shape=(1, mesh.nF+mesh.nC)
)
print(
"The projection matrix has shape {} with {} non-zero entry at ({},{})".format(
P0.shape, P0.nnz, P0.nonzero()[0][0], P0.nonzero()[1][0]
)
)
The projection matrix has shape (1, 131) with 1 non-zero entry at (0,130)
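A quick standalone check (with a hypothetical stand-in vector) that a matrix built this way extracts the last entry of whatever it multiplies:

```python
import numpy as np
import scipy.sparse as sp

n = 131  # length of the solution vector u (nC + nC + 1 above)
P0 = sp.csr_matrix((np.r_[1.], (np.r_[0], np.r_[n - 1])), shape=(1, n))

u = np.arange(n, dtype=float)  # a stand-in for the solution vector
assert P0.nnz == 1
assert (P0 @ u)[0] == u[-1]  # picks out the last entry, h_y at the surface
```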
def dpred(sigma, f=100.):
    # angular frequency
    omega = f*2.*np.pi

    # physical properties
    mu = np.ones(mesh.nC)*mu_0  # magnetic permeability values for all cells
    sigmahat = sigma.copy()  # here we are ignoring displacement current

    # Grad
    mesh.setCellGradBC([['dirichlet', 'dirichlet']])  # Setup boundary conditions
    Grad = mesh.cellGrad  # Gradient matrix

    # MfMu
    Mmu = Utils.sdiag(mesh.aveCC2F * mu)

    # Mccsigma
    Msighat = Utils.sdiag(sigmahat)

    # Div
    Div = mesh.faceDiv  # Divergence matrix

    # Boundary conditions
    B = mesh.cellGradBC  # a matrix for boundary conditions
    Exbc = np.r_[0., 1.]  # boundary values for Ex

    # A-matrix
    A = sp.vstack([
        sp.hstack([Grad, 1j*omega*Mmu]),  # Top row of A matrix
        sp.hstack((Msighat, Div))  # Bottom row of A matrix
    ])

    # Right-hand side
    rhs = np.r_[
        -B*Exbc,
        np.zeros(mesh.nC)
    ]

    Ainv = Solver(A)  # Factor A matrix
    u = Ainv*rhs  # Solve u = A^-1 rhs

    # build the projection matrix, a sparse matrix with a single entry that
    # grabs the value of h_y at the surface
    P0 = sp.csr_matrix(
        (np.r_[1.], (np.r_[0], np.r_[len(u)-1])), shape=(1, len(u))
    )
    Zxy = - P0 * (1./(u))
    return np.r_[Zxy.real, Zxy.imag]
f = 100.
data = dpred(sigma, f=f)
print("At f={}Hz, we have two data. \nRe[Z] = {:1.3f}, Im[Z] = {:1.3f}".format(f, data[0], data[1]))
At f=100.0Hz, we have two data. Re[Z] = 0.196, Im[Z] = 0.202
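As a sanity check (not part of the original derivation), these values can be compared against the analytic impedance of a homogeneous half-space, for which $|\text{Re}[Z]| = |\text{Im}[Z]| = \sqrt{\omega \mu_0 / (2\sigma)}$ in the quasi-static limit, and the apparent resistivity $\rho_a = |Z|^2/(\omega \mu_0)$ recovers the true resistivity:

```python
import numpy as np
from scipy.constants import mu_0

f, sigma_half = 100.0, 0.01  # the frequency and half-space used above
omega = 2 * np.pi * f

# analytic half-space value for each of the real and imaginary parts
Z_component = np.sqrt(omega * mu_0 / (2 * sigma_half))
print("{:1.3f}".format(Z_component))  # 0.199

# apparent resistivity recovers the 100 Ohm-m half-space exactly
rho_a = (2 * Z_component**2) / (omega * mu_0)
assert np.isclose(rho_a, 100.0)
```

The analytic value of ~0.199 sits between the simulated Re[Z] = 0.196 and Im[Z] = 0.202; the small discrepancy comes from discretization.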
The sensitivity function shows how much the data change due to changes in the model parameters. Understanding how "sensitive" our data are to the model is important for survey design. It is essential that we be able to compute the sensitivity when using gradient-based optimization techniques, as the sensitivity gives us the direction in which to take a step and update the model as we search for a solution.
The sensitivity function can be defined as
$$ J_{\sigma} = \frac{d P(u)}{d \sigma}$$The size of the sensitivity is $[nD \times n\sigma]$
To obtain the above sensitivity function in discrete space, we first differentiate
$$\mathbf{A}\mathbf{u} = \mathbf{rhs}$$w.r.t ${\boldsymbol{\sigma}}$, which yields
$$ \frac{d \mathbf{A}}{d {\boldsymbol{\sigma}}}\mathbf{u} + \mathbf{A} \frac{d \mathbf{u} }{d {\boldsymbol{\sigma}}}= 0 $$Rearranging and multiplying by $\mathbf{A}^{-1}$ on both sides yields
$$ \frac{d \mathbf{u} }{d {\boldsymbol{\sigma}}}= -\mathbf{A}^{-1}\frac{d \mathbf{A}}{d {\boldsymbol{\sigma}}}\mathbf{u} $$Next, we need to include the evaluation, $\mathbf{d} = P(\mathbf{u})$ which is the operation taken on $\mathbf{u}$ to give us the data. From this, we obtain
$$ \mathbf{J}_{{\boldsymbol{\sigma}}} = \frac{\partial P(\mathbf{u})}{\partial {\mathbf{u}}}\Big(\frac{d \mathbf{u} }{d {\boldsymbol{\sigma}}}\Big) = -\frac{\partial P(\mathbf{u})}{\partial {\mathbf{u}}} \Big(\mathbf{A}^{-1}\frac{d \mathbf{A}}{d {\boldsymbol{\sigma}}}\mathbf{u}\Big) $$From this, there are two derivatives that need to be computed:
For $\frac{d \mathbf{A}}{d \boldsymbol{\sigma}}\mathbf{u}$, keep in mind that we are treating $\mathbf{u}$ as fixed and that
$$ \mathbf{A}\mathbf{u} = \begin{bmatrix} \mathbf{Grad} & \imath \omega \mathbf{M}^{f}_{\mu} \\[0.3em] \mathbf{M}^{cc}_{\boldsymbol{\sigma}} & \mathbf{Div} \\[0.3em] \end{bmatrix} \begin{bmatrix} \mathbf{e_x} \\[0.3em] \mathbf{h_y} \\[0.3em] \end{bmatrix} $$Here, we see that the only dependence on $\boldsymbol{\sigma}$ is in the matrix $\mathbf{M}^{cc}_{\hat{\boldsymbol{\sigma}}} = \mathbf{diag}(\boldsymbol{\sigma})$. So let's focus our attention on that block. We are taking the derivative of a matrix-vector product (which is just a vector) with respect to the vector $\boldsymbol{\sigma}$, so the result should be a matrix. We write out the problem:
$$ \frac{\partial}{\partial \boldsymbol{\sigma}} \mathbf{M}^{cc}_{\hat{\boldsymbol{\sigma}}} \mathbf{e_x}^{fix} = \frac{\partial}{\partial \boldsymbol{\sigma}} \mathbf{diag}(\boldsymbol{\sigma}) \mathbf{e_x}^{fix} $$and since $\text{diag}(\mathbf{x})\mathbf{y} = \mathbf{diag}(\mathbf{y})\mathbf{x}$, we can interchange the roles of $\boldsymbol{\sigma}$ and $\mathbf{e_x}^{fix}$
$$ \frac{\partial}{\partial \boldsymbol{\sigma}} \mathbf{M}^{cc}_{\hat{\boldsymbol{\sigma}}} \mathbf{e_x}^{fix} = \frac{\partial}{\partial \boldsymbol{\sigma}} \mathbf{diag}(\mathbf{e_x}^{fix}) \boldsymbol{\sigma} $$So the derivative is simply: $$ \frac{\partial}{\partial \boldsymbol{\sigma}} \mathbf{M}^{cc}_{\hat{\boldsymbol{\sigma}}} \mathbf{e_x}^{fix} =\text{diag}(\mathbf{e_x}^{fix}) $$
Thus the full derivative is $$ \frac{d \mathbf{A}}{d \boldsymbol{\sigma}}\mathbf{u} = \begin{bmatrix} \mathbf{0} \\[0.3em] \mathbf{diag}(\mathbf{e}_x) \\[0.3em] \end{bmatrix} $$
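The identity $\mathbf{diag}(\mathbf{x})\mathbf{y} = \mathbf{diag}(\mathbf{y})\mathbf{x}$ used above is easy to verify numerically; a throwaway check with random vectors:

```python
import numpy as np

rng = np.random.default_rng(0)
x, y = rng.standard_normal(5), rng.standard_normal(5)

# the key identity behind the derivative of diag(sigma) * e_x w.r.t. sigma
assert np.allclose(np.diag(x) @ y, np.diag(y) @ x)
```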
For the other one we consider when the data is defined as real and imaginary parts of $Z_{xy} = -\mathbf{P}_0\frac{1}{\mathbf{u}}$:
Taking the derivative of $Z_{xy}$ w.r.t. $\mathbf{u}$ yields $$ \frac{\partial Z_{xy}}{\partial \mathbf{u}}= \mathbf{P}_0\,\mathbf{diag}\left(\frac{1}{\mathbf{u}^2}\right) $$
$$ \frac{\partial P(\mathbf{u})}{\partial \mathbf{u}}= \begin{bmatrix} \frac{\partial \text{Re}[Z_{xy}]}{\partial \mathbf{u}} \\[0.3em] \frac{\partial \text{Im}[Z_{xy}]}{\partial \mathbf{u}} \\[0.3em] \end{bmatrix} = \begin{bmatrix} \text{Re}\left[\mathbf{P}_0\,\mathbf{diag}\left(\frac{1}{\mathbf{u}^2}\right)\right] \\[0.3em] \text{Im}\left[\mathbf{P}_0\,\mathbf{diag}\left(\frac{1}{\mathbf{u}^2}\right)\right] \\[0.3em] \end{bmatrix} $$Now we can form the sensitivity matrix $\mathbf{J}_{\sigma}$ by combining the above equations:
$$ \mathbf{J}_{{\boldsymbol{\sigma}}} = -\frac{\partial P(\mathbf{u})}{\partial {\mathbf{u}}} \Big(\mathbf{A}^{-1}\frac{d \mathbf{A}}{d {\boldsymbol{\sigma}}}\mathbf{u}\Big) $$Deriving the sensitivity for apparent resistivity and phase is possible, but requires additional details; hence, we focus our attention on the real and imaginary parts of the impedance.
We now compute the discretized sensitivity matrix $\mathbf{J}_{\sigma}$ shown above.
# angular frequency
omega = f*2*np.pi
# physical properties
sigma = np.ones(mesh.nC)*sigma_half # conductivity values for all cells
mu = np.ones(mesh.nC)*mu_0 # magnetic permeability values for all cells
sigmahat = sigma.copy()
# Grad
mesh.setCellGradBC([['dirichlet', 'dirichlet']]) # Setup boundary conditions
Grad = mesh.cellGrad # Gradient matrix
# MfMu
Mmu = Utils.sdiag(mesh.aveCC2F * mu)
# Mccsigma
Msighat = Utils.sdiag(sigmahat)
# Div
Div = mesh.faceDiv # Divergence matrix
# Right Hand Side
B = mesh.cellGradBC # a matrix for boundary conditions
Exbc = np.r_[0., 1.] # boundary values for Ex
# A-matrix
A = sp.vstack([
sp.hstack([Grad, 1j*omega*Mmu]), # Top row of A matrix
sp.hstack((Msighat, Div)) # Bottom row of A matrix
])
# Right-hand side
rhs = np.r_[
-B*Exbc,
np.zeros(mesh.nC)
]
Ainv = Solver(A) # Factorize A matrix
u = Ainv*rhs # Solve A^-1 rhs = u
Ex = u[:mesh.nC] # Extract Ex from solution vector u
Hy = u[mesh.nC:mesh.nC+mesh.nN] # Extract Hy from solution vector u
P0 = sp.csr_matrix(
(np.r_[1.], (np.r_[0], np.r_[len(u)-1])), shape=(1, len(u))
)
P0 = P0.tocsr()
dAdsig_u = sp.vstack((Utils.spzeros(int(mesh.nC+1), mesh.nC), Utils.sdiag(Ex)))
dudsig = - (Ainv * (dAdsig_u.toarray()))
dZdsig = P0 * (Utils.sdiag(1./(u**2)) * dudsig)
J = np.vstack((dZdsig.real, dZdsig.imag))
fig, ax = plt.subplots(1, 1)
core_inds = mesh.vectorCCx > -core_extent
ax.loglog(-mesh.vectorCCx[core_inds], abs(J[0, core_inds]), label="real")
ax.loglog(-mesh.vectorCCx[core_inds], abs(J[1, core_inds]), label="imag")
ax.grid(True)
ax.legend()
ax.set_xlabel("Depth (m)")
ax.set_ylabel("Sensitivity")
For the 1D MT problem, the sensitivity matrix ($N\times M$) is small, hence generating it is not a big deal. However, for any 3D EM problem, generating the sensitivity matrix requires a huge amount of memory. To minimize that, we only compute sensitivity-vector products. In the forward case we compute:
$$ \mathbf{Jv} = \mathbf{J}_{\boldsymbol{\sigma}} \mathbf{v} $$Similarly, in adjoint case, we compute
$$ \mathbf{Jtv} = \mathbf{J}_{\boldsymbol{\sigma}}^{T} \mathbf{v} $$Computing $\mathbf{Jv}$ and $\mathbf{Jtv}$ is straightforward from the above derivation:
$$ \mathbf{J}_{\boldsymbol{\sigma}} \mathbf{v} = -\frac{\partial P(\mathbf{u})}{\partial {\mathbf{u}}} \left(\mathbf{A}^{-1}\frac{d \mathbf{A}}{d {\boldsymbol{\sigma}}}\mathbf{u}\,\mathbf{v}\right) $$$$ \mathbf{J}_{{\boldsymbol{\sigma}}}^T \mathbf{v} = - \left(\frac{d \mathbf{A}}{d {\boldsymbol{\sigma}}}\mathbf{u} \right)^T \left(\mathbf{A}^{T}\right)^{-1} \frac{\partial P(\mathbf{u})}{\partial {\mathbf{u}}}^T \mathbf{v} $$One function, Jvec, computes the forward sensitivity-vector product, and the other, Jtvec, computes the adjoint sensitivity-vector product.
The Jvec function takes a conductivity model ($\sigma$) and a vector ($\mathbf{v}$), and computes the sensitivity-vector product at a given frequency.
def Jvec(sigma, v, f=100.):
    mu = np.ones(mesh.nC)*mu_0  # magnetic permeability values for all cells
    epsilon = np.ones(mesh.nC)*epsilon_0  # dielectric permittivity values for all cells
    omega = 2*np.pi*f  # Angular frequency (rad/s)
    sigmahat = sigma  # Assume sigmahat = sigma (quasi-static)
    Div = mesh.faceDiv  # Divergence matrix
    mesh.setCellGradBC([['dirichlet', 'dirichlet']])  # Setup boundary conditions
    Grad = mesh.cellGrad  # Gradient matrix
    B = mesh.cellGradBC  # a matrix for boundary conditions
    Exbc = np.r_[0., 1.]  # boundary values for Ex
    Msighat = Utils.sdiag(sigmahat)
    Mmu = Utils.sdiag(mesh.aveCC2F * mu)

    # A-matrix
    A = sp.vstack([
        sp.hstack([Grad, 1j*omega*Mmu]),  # Top row of A matrix
        sp.hstack((Msighat, Div))  # Bottom row of A matrix
    ])

    # Right-hand side
    rhs = np.r_[
        -B*Exbc,
        np.zeros(mesh.nC)
    ]

    Ainv = Solver(A)  # Factorize A matrix
    u = Ainv*rhs  # Solve u = A^-1 rhs
    Ex = u[:mesh.nC]  # Extract Ex from solution vector u
    Hy = u[mesh.nC:mesh.nC+mesh.nN]  # Extract Hy from solution vector u

    P0 = sp.csr_matrix(
        (np.r_[1.], (np.r_[0], np.r_[len(u)-1])), shape=(1, len(u))
    )

    # forward sensitivity-vector product
    dAdsig_u_v = np.r_[np.zeros_like(Hy), Utils.sdiag(Ex)*v]
    dudsig_v = - (Ainv * (dAdsig_u_v))
    dZdsig_v = P0 * (Utils.sdiag(1./(u**2)) * dudsig_v)
    return np.r_[dZdsig_v.real, dZdsig_v.imag]
We have written the Jvec function, but how do we make sure it is working properly?
Let's consider predicted data $ d = \mathcal{F}[\sigma + \triangle \sigma] $. Applying Taylor's expansion yields
$$ \mathcal{F}[\sigma + \triangle \sigma] = \mathcal{F}[\sigma] +\frac{d \mathcal{F}}{d \sigma} \triangle \sigma + \mathcal{O}(\triangle \sigma )^2 $$By rearranging the above equation, we can consider two misfit functions:
$$ f^1 = \| \mathcal{F}[\sigma + \triangle \sigma] -\mathcal{F}[\sigma] \| $$$$ f^2 = \| \mathcal{F}[\sigma + \triangle \sigma] -\mathcal{F}[\sigma] -\frac{d \mathcal{F}}{d \sigma} \triangle \sigma \| $$The first misfit function should show 1st-order accuracy, while the second should show 2nd-order accuracy. Using SimPEG's Tests class, we compute these two misfits and check the accuracy.
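The logic of the order test can be illustrated with a toy scalar function (a hypothetical stand-in for $\mathcal{F}$, independent of the MT problem): the uncorrected difference decays like $h$, while subtracting the Jacobian term gives $h^2$ decay.

```python
import numpy as np

f = lambda x: x ** 2          # toy forward "simulation"
J = lambda x, dx: 2 * x * dx  # its directional derivative

x0, dx = 1.0, 0.5
hs = 10.0 ** -np.arange(1, 4)

err1 = np.array([abs(f(x0 + h * dx) - f(x0)) for h in hs])
err2 = np.array([abs(f(x0 + h * dx) - f(x0) - J(x0, h * dx)) for h in hs])

# slopes on a log-log plot: ~1 for err1, ~2 for err2
order1 = np.log10(err1[:-1] / err1[1:])
order2 = np.log10(err2[:-1] / err2[1:])
assert np.allclose(order1, 1.0, atol=0.05)
assert np.allclose(order2, 2.0, atol=0.05)
```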
from SimPEG import Tests
def derChk(m):
    return [dpred(m), lambda mx: Jvec(m, mx)]
Tests.checkDerivative(derChk, sigma, plotIt=False, num=3, eps=1e-20, dx=sigma*3)
==================== checkDerivative ====================
iter    h         |ft-f0|   |ft-f0-h*J0*dx|  Order
---------------------------------------------------------
 0   1.00e-01    3.454e-02     7.604e-03      nan
 1   1.00e-02    4.121e-03     9.254e-05      1.915
 2   1.00e-03    4.204e-04     9.461e-07      1.990
========================= PASS! =========================
Testing is important.
True
The function below takes a conductivity model ($\sigma$) and a vector ($\mathbf{v}$), and computes the adjoint sensitivity-vector product (Jtvec) at a given frequency.
def misfit(sigma, dobs=None):
    r = dpred(sigma) - dobs
    return 0.5 * np.linalg.norm(r)**2
def Jtvec(sigma, v, f=100.):
    mu = np.ones(mesh.nC)*mu_0  # magnetic permeability values for all cells
    epsilon = np.ones(mesh.nC)*epsilon_0  # dielectric permittivity values for all cells
    omega = 2*np.pi*f  # Angular frequency (rad/s)
    sigmahat = sigma  # Assume sigmahat = sigma
    Div = mesh.faceDiv  # Divergence matrix
    mesh.setCellGradBC([['dirichlet', 'dirichlet']])  # Setup boundary conditions
    Grad = mesh.cellGrad  # Gradient matrix
    B = mesh.cellGradBC  # a matrix for boundary conditions
    Exbc = np.r_[0., 1.]  # boundary values for Ex
    Msighat = Utils.sdiag(sigmahat)
    Mmu = Utils.sdiag(mesh.aveCC2F * mu)

    tempUp = sp.hstack((Grad, 1j*omega*Mmu))  # Top row of A matrix
    tempDw = sp.hstack((Msighat, Div))  # Bottom row of A matrix
    A = sp.vstack((tempUp, tempDw))  # Full A matrix
    rhs = np.r_[-B*Exbc, np.zeros(mesh.nC)]  # Right-hand side

    Ainv = Solver(A)  # Factorize A matrix
    u = Ainv*rhs  # Solve u = A^-1 rhs
    Ex = u[:mesh.nC]  # Extract Ex from solution vector u
    Hy = u[mesh.nC:mesh.nC+mesh.nN]  # Extract Hy from solution vector u

    P0 = sp.csr_matrix(
        (np.r_[1.], (np.r_[0], np.r_[len(u)-1])), shape=(1, len(u))
    )

    # adjoint sensitivity-vector product
    ATinv = Solver(A.T)  # Factorize A^T matrix
    PTvr = (P0.T*np.r_[v[0]]).astype(complex)
    PTvi = P0.T*np.r_[v[1]]*-1j
    dZrduT_v = Utils.sdiag((1./(u**2)))*PTvr
    dZiduT_v = Utils.sdiag((1./(u**2)))*PTvi
    dAdsiguT = sp.hstack((Utils.spzeros(mesh.nC, mesh.nN), Utils.sdiag(Ex)))
    dZrdsigT_v = - (dAdsiguT*(ATinv*dZrduT_v)).real
    dZidsigT_v = - (dAdsiguT*(ATinv*dZiduT_v)).real
    return dZrdsigT_v + dZidsigT_v
Similarly, the Jtvec function has to be tested. This time, we consider a data misfit function:
$$ \phi_d = \frac{1}{2}\|\mathbf{r}\|^2 $$where the residual is $\mathbf{r} = \mathcal{F}[\sigma] - \mathbf{d}^{obs}$.
By taking the derivative w.r.t. $\sigma$, we obtain
$$ \frac{d \phi_d}{d \sigma} = \mathbf{J}_{\sigma}^T \mathbf{r} $$for which Jtvec is required. Then applying a Taylor expansion to $\phi_d$ yields
$$ \phi_d[\sigma + \triangle \sigma] = \phi_d[\sigma] +\frac{d \phi_d}{d \sigma} \triangle \sigma + \mathcal{O}(\triangle \sigma )^2 $$And similarly, we can consider two misfit functions:
$$ \phi_d^1 = \| \phi_d[\sigma + \triangle \sigma] -\phi_d[\sigma] \| $$$$ \phi_d^2 = \| \phi_d[\sigma + \triangle \sigma] -\phi_d[\sigma] -\frac{d \phi_d}{d \sigma} \triangle \sigma \| $$The first data misfit function should show 1st-order accuracy, while the second should show 2nd-order accuracy. Using SimPEG's Tests class, we compute these two misfits and check the accuracy.
sigma0 = sigma*3
dobs_ri = dpred(sigma)
r = dpred(sigma0) - dobs_ri
Tests.checkDerivative(
lambda m: [misfit(m, dobs=dobs_ri), Jtvec(m, r)],
sigma0,
plotIt=False,
num=5
)
==================== checkDerivative ====================
iter    h         |ft-f0|   |ft-f0-h*J0*dx|  Order
---------------------------------------------------------
 0   1.00e-01    4.341e-02     6.094e-02      nan
 1   1.00e-02    1.733e-03     1.981e-05      3.488
 2   1.00e-03    1.754e-04     8.632e-08      2.361
 3   1.00e-04    1.753e-05     1.094e-09      1.897
 4   1.00e-05    1.753e-06     1.116e-11      1.991
========================= PASS! =========================
Testing is important.
True
Both the Jvec and Jtvec functions have passed the order test. These tests are necessary, but not sufficient.
To test that Jvec and Jtvec are consistent, we perform the adjoint test. Consider two random vectors $\mathbf{w}$ and $\mathbf{v}$; then we can evaluate
$$ \mathbf{w}^T \mathbf{J}_{\sigma} \mathbf{v}, $$which is a scalar value. The adjoint of the above product is
$$ \mathbf{v}^T \mathbf{J}_{\sigma}^T \mathbf{w}, $$which is also a scalar. They should have the same value: $\mathbf{w}^T \mathbf{J}_{\sigma} \mathbf{v}=\mathbf{v}^T \mathbf{J}_{\sigma}^T \mathbf{w}$. We evaluate $\mathbf{w}^T \mathbf{J}_{\sigma} \mathbf{v}$ and $\mathbf{v}^T \mathbf{J}_{\sigma}^T \mathbf{w}$ using Jvec and Jtvec, respectively, and check that they output the same value.
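For any matrix, the identity holds exactly, which a dense toy example (a hypothetical stand-in for $\mathbf{J}_{\sigma}$, unrelated to the MT code) makes clear:

```python
import numpy as np

rng = np.random.default_rng(42)
J = rng.standard_normal((2, 65))  # dense stand-in for the 2 x nC sensitivity
v = rng.standard_normal(65)
w = rng.standard_normal(2)

# w^T (J v) and v^T (J^T w) agree to machine precision
assert np.isclose(w @ (J @ v), v @ (J.T @ w))
```

The adjoint test asks the same question of the matrix-free Jvec/Jtvec pair, where the two sides are computed by entirely different code paths.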
v = np.random.rand(mesh.nC)
w = np.random.rand(dobs_ri.shape[0])
wtJv = w.dot(Jvec(sigma0, v))
vtJtw = v.dot(Jtvec(sigma0, w))
passed = np.abs(wtJv - vtJtw) < 1e-10
print('Adjoint Test', np.abs(wtJv - vtJtw), passed)
Adjoint Test 1.1102230246251565e-16 True