Inversion of Time Domain EM data: Bookpurnong Australia

This example is based on the inversion published of SkyTEM data over Bookpurnong in Heagy et al. (2017)

In [1]:
# core python packages
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import ipywidgets
from scipy.constants import mu_0

# SimPEG and related packages
from pymatsolver import Pardiso as Solver
from SimPEG import (
    Mesh, Maps, Utils, DataMisfit, Regularization, Optimization, 
    Inversion, InvProblem, Directives
)
from SimPEG.EM import TDEM
In [2]:
# larger font size for plots
from matplotlib import rcParams
rcParams['font.size']=14

Look at the contents of the data directory we are given

In [3]:
data_directory = os.path.sep.join(["..", "data", "bookpurnong"])
os.listdir(data_directory)
Out[3]:
['MurrayRiver.txt',
 'booky_resolve.hdf5',
 '8044_Bookpurnong.HDR',
 'SK655CS_Bookpurnong_ZX_HM_TxInc_newDTM.txt',
 'booky_skytem.hdf5',
 'README.txt',
 'Bookpurnong_SkyTEM.HDR',
 'Bookpurnong_Resolve_Exported.XYZ']

look at the README for a description of the files

In [4]:
# the os.path.sep.join combines the "bookpurnong" and "README.txt" with the 
# correct path seperator (e.g. on mac or linux, this will produce 
# "bookpurnong/README.txt")

with open(os.path.sep.join([data_directory, "README.txt"]), 'r') as file:
    print(file.read())
Bookpurnong Data Sets
=====================

The RESOLVE and SkyTEM data collected over Bookpurnong have been made available with permission from CSIRO. Please acknowledge CSIRO if using these data in a presentation, publication, etc.

Two data sets are included in this distribution, RESOLVE data collected in 2008, and SkyTEM (High Moment) data collected in 2006.

For an example of how to load and plot the data, please see: http://docs.simpeg.xyz


Contents
--------

- 8044_Bookpurnong.HDR : RESOLVE header file for the 2008 Bookpurnong survey
- Bookpurnong_Resolve_Exported.XYZ : RESOLVE data collected in 2008
- Bookpurnong_SkyTEM.HDR : SkyTEM header file for the 2006 Bookpurnong survey
- SK655CS_Bookpurnong_ZX_HM_TxInc_newDTM.txt : SkyTEM high moment data collected in 2006




Load the SkyTEM data

Here, we use the pandas library to read in the data. Pandas is good for working with tabular data, particularly when those data might not all be of the same type. For example, in the SkyTEM data set, there are dates, numeric values, times, ... These don't nicely fit into a numpy array, but do fit nicely into a pandas data frame.

In [5]:
# Load SkyTEM 2005
skytem = pd.read_table(
    os.path.sep.join([data_directory, "SK655CS_Bookpurnong_ZX_HM_TxInc_newDTM.txt"])
)

# list(skytem)  # list the data header
In [6]:
# pull out the flight lines
lines = np.unique(skytem.LINE)
In [7]:
# Load path of Murray River
river_path = np.loadtxt(os.path.sep.join([data_directory, "MurrayRiver.txt"]))
In [8]:
def plot_line(line_number, ax=None):
    """
    A plotting function that will plot all sounding locations for a given `line_number`
    """
    if ax is None:
        fig, ax = plt.subplots(1, 1)
    inds = skytem.LINE == line_number
    ax.plot(skytem.E[inds], skytem.N[inds], '.', ms=0.6)
    return ax
In [9]:
# plot all of the flight lines
fig, ax = plt.subplots(1, 1)
for l in lines:
    plot_line(l, ax=ax)

# ax.plot(river_path[:, 0], river_path[:, 1], 'k', lw=0.5)

Find and omit the tie lines and lines along river.

Here, I use a small widget to plot line by line to see which lines are the ones that are along the river or are tie lines.

In [10]:
ipywidgets.interactive(
    lambda ind: plot_line(lines[ind]), 
    ind=ipywidgets.IntSlider(min=0, max=len(lines)-1, value=0),
)

# ax.plot(river_path[:, 0], river_path[:, 1], 'k', lw=0.5)
In [11]:
tie_line_inds = (skytem.LINE >= lines[29])

fig, ax = plt.subplots(1, 1)
for l in lines[:29]:
    plot_line(l, ax=ax)

Pare down the data set for inversion

  • invert only the z-oriented data
  • ignore tie-lines and lines along the river
In [12]:
data_inds = []

for i, head in enumerate(list(skytem)):
    if head.startswith("Z"):
        data_inds.append(i)
In [13]:
easting = skytem.E[~tie_line_inds].values
northing = skytem.N[~tie_line_inds].values
elevation = skytem.LASALT[~tie_line_inds].values
data = skytem.iloc[np.where(~tie_line_inds)[0], data_inds].values

time channels where the data are sampled

  • the below values are copied from the skytem header file for the high moment data
In [14]:
time_channels = np.vstack([
    [7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30],
    [470e-7, 598e-7, 726e-7, 886e-7, 1118e-7, 1454e-7, 1852e-7, 2344e-7, 2952e-7, 3706e-7, 4644e-7, 5814e-7, 7278e-7, 9112e-7, 11170e-7, 14292e-7, 17912e-7, 22460e-7, 28174e-7, 35356e-7, 44388e-7, 55750e-7, 7.00e-03, 8.80e-03]
]).T
In [15]:
def plot_data(time_ind):
    """
    Given a time index, plot_data will plot the skytem data across 
    the survey area at that time
    """
    fig, ax = plt.subplots(1,1, figsize = (8,8))
    
    # grid the data
#     nskip = 
    out = Utils.plot2Ddata(np.vstack([easting, northing]).T, data[:, time_ind], ncontour=100, ax=ax)
    plt.colorbar(out[0], ax=ax, label="db/dt")
    
    # add the river path 
    ax.plot(river_path[:, 0], river_path[:, 1], 'k', lw=0.5)
    
    # labels
    ax.set_xlabel('easting (m)')
    ax.set_ylabel('northing (m)')
    
    # get title    
    ax.set_title(f"time channel {time_channels[time_ind, 0]}, t = {time_channels[time_ind, 1]}s")
    return ax
In [16]:
ipywidgets.interact(
    plot_data, 
    time_ind = ipywidgets.IntSlider(min=0, max=time_channels.shape[0]-1, value=0)
)
Out[16]:
<function __main__.plot_data(time_ind)>

invert a single sounding

Here, we will invert a single sounding location for a layered earth and use a cylindrically symmetric mesh for the forward modelling.

In [17]:
xloc, yloc = 462100.0, 6196500.0
rxind = np.argmin((easting-xloc)**2+(northing-yloc)**2)
In [18]:
# plot the location
ax = plot_data(20)
ax.plot(easting[rxind], northing[rxind], 'ro')
Out[18]:
[<matplotlib.lines.Line2D at 0x1135c5e10>]
In [19]:
fig, ax = plt.subplots(1, 1)

ax.loglog(time_channels[:, 1], data[rxind, :], 'o')
ax.set_xlabel("time (s)")
ax.set_ylabel("db$_z$ / dt (V / Am$^4$)")
ax.grid("k", alpha=0.5)

set up the forward simulation

Set up a mesh

  • here, we use a cylindrically symmetric mesh to perform the 1D inversion
  • we make sure that the forward simulation mesh extends beyond the diffusion distance of the latest time channel
$$ z_{max} = \sqrt{\frac{2 t}{\mu\sigma}} \approx 1260 \sqrt{\frac{ t}{\sigma}} $$
  • for more details on setting up a cylindrically symmetric mesh, see the docs
In [20]:
diffusion_distance = 1260 * np.sqrt(1e-2 / 1e-1)
print(diffusion_distance)
398.4469851812158
In [21]:
# cell size, number of cells in the x-direction, 
# number of cells in the z-direction and number of padding cells
cs, ncx, ncz, npad = 1., 10., 10., 20
hx = [(cs, ncx), (cs, npad, 1.3)]
npad = 12

log_spaced_z_cells = np.logspace(np.log10(1.), np.log10(12.), 19)
z_padding = log_spaced_z_cells[-1] * 1.3 ** np.arange(npad)
hz = np.r_[z_padding[::-1], log_spaced_z_cells[::-1], log_spaced_z_cells, z_padding]
mesh = Mesh.CylMesh([hx, 1, hz], '00C')
active = mesh.vectorCCz < 0.
In [22]:
mesh.plotGrid()
Out[22]:
<matplotlib.axes._subplots.AxesSubplot at 0x12bac7908>

set up mappings and a starting model

Mappings are used in SimPEG to translate the set of parameters that we invert for to electrical conductivity values on the forward simulation mesh.

Here, the inversion model is 1D log-conductivity below the surface. So we use an InjectActiveCells map to include the air cells, the SurjectVertical1D map to take the 1D model and put it on the cylindrically symmetric mesh, and the ExpMap to take the exponential of the log-conductivity values.

In [23]:
sig_half = 1e-1
sig_air = 1e-8
active_inds = mesh.vectorCCz < 0.
active_map = Maps.InjectActiveCells(mesh, active_inds, np.log(sig_air), nC=mesh.nCz)
mapping = Maps.ExpMap(mesh) * Maps.SurjectVertical1D(mesh) * active_map
In [24]:
m0 = np.ones(active_inds.sum())*np.log(sig_half)
In [25]:
## plot the initial model

plt.colorbar(mesh.plotImage(np.log10(mapping*m0))[0])
Out[25]:
<matplotlib.colorbar.Colorbar at 0x109efec18>

source waveform

  • the below values were copied from the skytem header file for the high moment data
  • we will need a finer discretization for the simulation, so we approximate it by the VTEM source function, which is very similar.
In [26]:
waveform = np.vstack([
    np.r_[-10, -9.29, -8.41, -7.26, -5.28, -3.62, -2.33, -0.62, 0.00, 0.0266, 0.0276, 0.0286, 10.000]*1e-3,
    np.r_[0, 20, 40, 60, 80, 90, 95, 99, 100, 1.53, 0.566, 0.000, 0.000]/100.
]).T
In [27]:
t0 = -1*waveform[0, 0]
src_waveform = TDEM.Src.VTEMWaveform(a=3., peakTime=t0, offTime=t0+29e-6)
In [28]:
fig, ax = plt.subplots(1, 1)

ax.plot(waveform[:, 0]+t0, waveform[:, 1], 'o')
ax.plot(waveform[:, 0]+t0, [src_waveform.eval(t) for t in waveform[:, 0]+t0])
ax.set_xlabel("time (s)")
ax.set_ylabel("Normalized current")
ax.grid("k", alpha=0.5)
In [29]:
# loop parameters
area = 313.98
radius = np.sqrt(area/np.pi)
In [30]:
# Bird height from the surface
system_height = elevation[rxind]

# The data are secondary field data
dbdtz = TDEM.Rx.Point_dbdt(
    np.array([[radius, 0., system_height]]),
    orientation='z',
    times=time_channels[:-3, 1] + t0
)
In [31]:
# source
src_list = [
    TDEM.Src.CircularLoop(
        [dbdtz], loc=np.r_[0., 0., system_height], radius=radius,
        orientation='z',
        waveform=src_waveform, 
        current=1./area  # the data are normalized by loop area
    )
]
In [32]:
# solve the problem at these times
timeSteps = [
    (src_waveform.peakTime/5, 5), ((src_waveform.offTime-src_waveform.peakTime)/5, 5),
    (1e-5, 5), (5e-5, 5), (1e-4, 10), (5e-4, 15)
]
problem = TDEM.Problem3D_e(
    mesh, timeSteps=timeSteps, sigmaMap=mapping, Solver=Solver
)
survey = TDEM.Survey(src_list)
problem.pair(survey)
In [33]:
fig, ax = plt.subplots(1, 1)

ax.plot(waveform[:, 0]+t0, waveform[:, 1], 'o')
ax.plot(waveform[:, 0]+t0, [src_waveform.eval(t) for t in waveform[:, 0]+t0])
ax.plot(problem.timeMesh.vectorNx, np.zeros(problem.timeMesh.nN), 'k|', ms=10)
ax.set_xlabel("time (s)")
ax.set_ylabel("Normalized current")
ax.grid("k", alpha=0.5)

create data vector and run a forward simulation

  • assign uncertainties to the data
In [34]:
std = 0.05
floor = 1e-12
dobs = data[rxind, :-3]  # ignore the last three time-channels
uncert = abs(dobs) * std + floor

run a forward simulation.

This lets us sanity-check our setup. We don't expect the data to match, but should be similar in order-of-magnitude.

In [35]:
%time
dpred_0 = survey.dpred(m0)
CPU times: user 10 µs, sys: 2 µs, total: 12 µs
Wall time: 11.9 µs
In [36]:
fig, ax = plt.subplots(1, 1)

ax.loglog(time_channels[:-3, 1], dobs, "C0s", label="dobs")
plt.errorbar(time_channels[:-3, 1], dobs, yerr=uncert, color="C0")
ax.loglog(time_channels[:-3, 1], -dpred_0, "C1", label="dpred")

ax.grid('k', alpha=0.5)
ax.legend()
Out[36]:
<matplotlib.legend.Legend at 0x12c00fbe0>

Data uncertainties

In [37]:
survey.dobs = -dobs
dmisfit = DataMisfit.l2_DataMisfit(survey)
dmisfit.W = 1./uncert
SimPEG.DataMisfit.l2_DataMisfit assigning default std of 5%
SimPEG.DataMisfit.l2_DataMisfit assigning default eps of 1e-5 * ||dobs||

Regularization

In [38]:
reg_mesh = Mesh.TensorMesh([mesh.hz[mapping.maps[-1].indActive]])
reg = Regularization.Tikhonov(reg_mesh, mapping=Maps.IdentityMap(reg_mesh))
reg.alpha_s = 1e-3
reg.alpha_x = 1.
reg.mref = m0.copy()

State the inverse problem

For reference, see the docs on inversion components

In [39]:
opt = Optimization.InexactGaussNewton(maxIter=10)
opt.LSshorten = 0.5
opt.remember('xc')

invProb = InvProblem.BaseInvProblem(dmisfit, reg, opt)
In [40]:
target = Directives.TargetMisfit()  # stop when we hit target misfit
invProb.beta = 5.
# betaest = Directives.BetaEstimate_ByEig(beta0_ratio=1e0)
inv = Inversion.BaseInversion(invProb, directiveList=[target])
# run the inversion
mrec = inv.run(m0)
    SimPEG.InvProblem is setting bfgsH0 to the inverse of the eval2Deriv.
    ***Done using same Solver and solverOpts as the problem***
model has any nan: 0
============================ Inexact Gauss Newton ============================
  #     beta     phi_d     phi_m       f      |proj(x-g)-x|  LS    Comment   
-----------------------------------------------------------------------------
x0 has any nan: 0
   0  5.00e+00  1.65e+03  0.00e+00  1.65e+03    6.30e+02      0              
   1  5.00e+00  1.03e+02  5.69e-01  1.06e+02    8.85e+01      0              
------------------------- STOP! -------------------------
1 : |fc-fOld| = 0.0000e+00 <= tolF*(1+|f0|) = 1.6541e+02
0 : |xc-x_last| = 2.8865e+00 <= tolX*(1+|x0|) = 1.3820e+00
0 : |proj(x-g)-x|    = 8.8485e+01 <= tolG          = 1.0000e-01
0 : |proj(x-g)-x|    = 8.8485e+01 <= 1e3*eps       = 1.0000e-02
0 : maxIter   =      10    <= iter          =      2
------------------------- DONE! -------------------------

plot the data

In [41]:
# extract the data so
dpred = invProb.dpred


fig, ax = plt.subplots(1, 1)

ax.loglog(time_channels[:-3, 1], dobs, "C0s", label="dobs")
ax.loglog(time_channels[:-3, 1], -dpred, "C1", label="dpred")

ax.grid('k', alpha=0.5)
ax.legend()
Out[41]:
<matplotlib.legend.Legend at 0x1133c0d68>

Plot the recovered model

In [42]:
fig, ax = plt.subplots(1, 1, figsize=(4, 6))

mplot = np.repeat(np.exp(mrec), 2, axis=0)
z = np.repeat(mesh.vectorCCz[active][1:], 2, axis=0)
ax.semilogx(mplot, np.r_[z[0], z, z[-1]])
ax.set_ylabel("z (m)")
ax.set_xlabel("conductivity (S/m)")
ax.set_ylim([-50, 1])
ax.grid('k', alpha=0.5)
In [ ]: