# Seismic acquisition fiddling¶

This notebook is to accompany the blog post published on January 8, 2015: It goes in the bin at Agile Geoscience.

The idea is to replicate what we've done so far but with 3 enhancements:

• With a Survey object to hold the various features of a survey.
• With more GeoPandas stuff, and less fussing with (x,y)'s directly.
• Making bins and assigning midpoints to them.

In [1]:
import numpy as np
import matplotlib.pyplot as plt
from shapely.geometry import Point, LineString
import geopandas as gpd
import pandas as pd
from fiona.crs import from_epsg

%matplotlib inline


## Survey object¶

In [2]:
class Survey:
"""
A seismic survey.
"""

def __init__(self, params):

# Assign the variables from the parameter dict,
# using dict.items() for Python 3 compatibility.
for k, v in params.items():
setattr(self, k, v)

# These are just a convenience; we could use the
# tuples directly, or make objects with attrs.
self.xmi = self.corner[0]
self.ymi = self.corner[1]

self.x = self.size[0]
self.y = self.size[1]

self.SL = self.line_spacing[0]
self.RL = self.line_spacing[1]

self.si = self.point_spacing[0]
self.ri = self.point_spacing[1]

self.shiftx = -self.si/2.
self.shifty = -self.ri/2.

@property
def lines(self):
"""
Returns number of (src, rcvr) lines.
"""
slines = int(self.x/self.SL) + 1
rlines = int(self.y/self.RL) + 1
return slines, rlines

@property
def points_per_line(self):
"""
Returns number of (src, rcvr) points per line.
"""
spoints = int(self.y/self.si) + 2
rpoints = int(self.x/self.ri) + 2
return spoints, rpoints

@property
def src(self):
s = [Point(self.xmi+line*self.SL, self.ymi+s*self.si)
for line in range(self.lines[0])
for s in range(self.points_per_line[0])
]
S = gpd.GeoSeries(s)
S.crs = from_epsg(26911)
return S

@property
def rcvr(self):
r = [Point(self.xmi + r*self.ri + self.shiftx, self.ymi + line*self.RL - self.shifty)
for line in range(self.lines[1])
for r in range(self.points_per_line[1])
]
R = gpd.GeoSeries(r)
R.crs = from_epsg(self.epsg)
return R

@property
def layout(self):
"""
Provide a GeoDataFrame of all points,
labelled as columns and in hierarchical index.
"""
# Feels like there might be a better way to do this...
sgdf = gpd.GeoDataFrame({'geometry': self.src, 'station': 'src'})
rgdf = gpd.GeoDataFrame({'geometry': self.rcvr, 'station': 'rcvr'})

# Concatenate with a hierarchical index
layout.crs = from_epsg(self.epsg)

return layout


Perhaps s and r should be objects too. I think you might want to have survey.receivers.x for the list of x locations, for example.

## Instantiate and plot¶

In [3]:
params = {'corner': (5750000,4710000),
'size': (3000,1800),
'line_spacing': (600,600),
'point_spacing': (100,100),
'epsg': 26911 # http://spatialreference.org/ref/epsg/26911/
}

survey = Survey(params)

In [4]:
s = survey.src
r = survey.rcvr
r[:10]

Out[4]:
0    POINT (5749950 4710050)
1    POINT (5750050 4710050)
2    POINT (5750150 4710050)
3    POINT (5750250 4710050)
4    POINT (5750350 4710050)
5    POINT (5750450 4710050)
6    POINT (5750550 4710050)
7    POINT (5750650 4710050)
8    POINT (5750750 4710050)
9    POINT (5750850 4710050)
dtype: object
In [5]:
layout = survey.layout
layout[:10]

Out[5]:
geometry station
sources 0 POINT (5750000 4710000) src
1 POINT (5750000 4710100) src
2 POINT (5750000 4710200) src
3 POINT (5750000 4710300) src
4 POINT (5750000 4710400) src
5 POINT (5750000 4710500) src
6 POINT (5750000 4710600) src
7 POINT (5750000 4710700) src
8 POINT (5750000 4710800) src
9 POINT (5750000 4710900) src

With a hierarchical index you can do cool things, e.g. show the last five sources:

In [6]:
layout.ix['sources'][-5:]

Out[6]:
geometry station
115 POINT (5753000 4711500) src
116 POINT (5753000 4711600) src
117 POINT (5753000 4711700) src
118 POINT (5753000 4711800) src
119 POINT (5753000 4711900) src
In [7]:
layout.crs

Out[7]:
{'init': 'epsg:26911', 'no_defs': True}
In [8]:
ax = layout.plot()


Export GeoDataFrames to GIS shapefile.

In [9]:
# gdf.to_file('src_and_rcvr.shp')


## Midpoint calculations¶

We need midpoints. There is a midpoint between every source-receiver pair.

Hopefully it's not too inelegant to get to the midpoints now that we're using this layout object thing.

In [10]:
midpoint_list = [LineString([r, s]).interpolate(0.5, normalized=True)
for s in layout.ix['sources'].geometry
]


As well as knowing the (x,y) of the midpoints, we'd also like to record the distance from each s to each live r (each r in the live patch). This is easy enough to compute:

Point(x1, y1).distance(Point(x2, y2))



Then we can make a list of all the offsets when we count the midpoints into the bins.

In [11]:
offsets = [r.distance(s)
for s in layout.ix['sources'].geometry
]

In [12]:
azimuths = [(180.0/np.pi) * np.arctan((r.x - s.x)/(r.y - s.y))
for s in layout.ix['sources'].geometry
]

In [13]:
offsetx = np.array(offsets)*np.cos(np.array(azimuths)*np.pi/180.)
offsety = np.array(offsets)*np.sin(np.array(azimuths)*np.pi/180.)


Make a Geoseries of the midpoints, offsets and azimths:

In [14]:
midpoints = gpd.GeoDataFrame({
'geometry' : midpoint_list,
'offset' : offsets,
'azimuth': azimuths,
'offsetx' : offsetx,
'offsety' : offsety
})
midpoints[:5]

Out[14]:
azimuth geometry offset offsetx offsety
0 -45.000000 POINT (5749975 4710025) 70.710678 50 -50
1 45.000000 POINT (5749975 4710075) 70.710678 50 50
2 18.434949 POINT (5749975 4710125) 158.113883 150 50
3 11.309932 POINT (5749975 4710175) 254.950976 250 50
4 8.130102 POINT (5749975 4710225) 353.553391 350 50
In [15]:
ax = midpoints.plot()


Save to a shapefile if desired.

In [16]:
#midpt.to_file('CMPs.shp')


## Spider plot¶

In [17]:
midpoints[:5].offsetx # Easy!

Out[17]:
0     50
1     50
2    150
3    250
4    350
Name: offsetx, dtype: float64
In [18]:
midpoints.ix[3].geometry.x # Less easy :(

Out[18]:
5749975.0

We need lists (or arrays) to pass into the matplotlib quiver plot. This takes four main parameters: x, y, u, and v, where x, y will be our coordinates, and u, v will be the offset vector for that midpoint.

We can get at the GeoDataFrame's attributes easily, but I can't see how to get at the coordinates in the geometry GeoSeries (seems like a user error â€”Â it feels like it should be really easy) so I am resorting to this:

In [19]:
x = [m.geometry.x for i, m in midpoints.iterrows()]
y = [m.geometry.y for i, m in midpoints.iterrows()]

In [20]:
fig = plt.figure(figsize=(12,8))
plt.quiver(x, y, midpoints.offsetx, midpoints.offsety, units='xy', width=0.5, scale=1/0.025, pivot='mid', headlength=0)
plt.axis('equal')
plt.show()


## Bins¶

The bins are a new geometry, related to but separate from the survey itself, and the midpoints. We will model them as a GeoDataFrame of polygons. The steps are:

1. Compute the bin centre locations with our usual list comprehension trick.
2. Buffer the centres with a square.
3. Gather the buffered polygons into a GeoDataFrame.
In [21]:
# Factor to shift the bins relative to source and receiver points
jig = survey.si / 4.
bin_centres = gpd.GeoSeries([Point(survey.xmi + 0.5*r*survey.ri - jig, survey.ymi + 0.5*s*survey.si + jig)
for r in range(2*(survey.points_per_line[1]-1))
for s in range(2*(survey.points_per_line[0]-1))
])

# Buffers are diamond shaped so we have to scale and rotate them.
scale_factor = np.sin(np.pi/4.)/2.
bin_polys = bin_centres.buffer(scale_factor*survey.ri, 1).rotate(-45)
bins = gpd.GeoDataFrame(geometry=bin_polys)

bins[:3]

Out[21]:
geometry
0 POLYGON ((5750000 4709999.999999999, 5749950 4...
1 POLYGON ((5750000 4710049.999999999, 5749950 4...
2 POLYGON ((5750000 4710100, 5749950 4710100, 57...

Suspect there's a super easy way to get all midpoints in a bin poly, without stepping over all bins.

WARNING This step is very slow for more than a few thousand midpoints.

In [22]:
# Make a copy because I'm going to drop points as I
# assign them to polys, to speed up subsequent search.
midpts = midpoints.copy()

offsets, azimuths = [], [] # To hold complete list.

# Loop over bin polygons with index i.
for i, bin_i in bins.iterrows():

o, a = [], [] # To hold list for this bin only.

# Now loop over all midpoints with index j.
for j, midpt_j in midpts.iterrows():
if bin_i.geometry.contains(midpt_j.geometry):
# Then it's a hit! Add it to the lists,
# and drop it so we have less hunting.
o.append(midpt_j.offset)
a.append(midpt_j.azimuth)
midpts = midpts.drop([j])

# Add the bin_i lists to the master list
# and go around the outer loop again.
offsets.append(o)
azimuths.append(a)

# Add everything to the dataframe.
bins['offsets'] = gpd.GeoSeries(offsets)
bins['azimuths'] = gpd.GeoSeries(azimuths)

In [23]:
bins[:10]

Out[23]:
geometry offsets azimuths
0 POLYGON ((5750000 4709999.999999999, 5749950 4... [70.7106781187] [-45.0]
1 POLYGON ((5750000 4710049.999999999, 5749950 4... [70.7106781187] [45.0]
2 POLYGON ((5750000 4710100, 5749950 4710100, 57... [158.113883008] [18.4349488229]
3 POLYGON ((5750000 4710150, 5749950 4710150, 57... [254.95097568] [11.309932474]
4 POLYGON ((5750000 4710200, 5749949.999999999 4... [353.553390593] [8.13010235416]
5 POLYGON ((5750000 4710250, 5749950 4710250, 57... [452.769256907] [6.34019174591]
6 POLYGON ((5750000 4710300, 5749950 4710300, 57... [552.268050859, 651.92024052] [5.19442890773, -4.398705355]
7 POLYGON ((5750000 4710349.999999999, 5749950 4... [651.92024052, 552.268050859] [4.398705355, -5.19442890773]
8 POLYGON ((5750000 4710400, 5749950 4710400, 57... [751.664818919, 452.769256907] [3.81407483429, -6.34019174591]
9 POLYGON ((5750000 4710450, 5749950 4710450, 57... [851.469318296, 353.553390593] [3.36646066343, -8.13010235416]

We can compute the fold from the length of the list of offsets in each bin. We use a mini-function, called a lambda, to do this. This piece of code applies a lambda to each row in the GeoDataFrame. Essentially it says:

set each row in the 'fold' column in my bins GeoDataFrame to the length of the offsets list for that row. 
In [24]:
bins['fold'] = bins.apply(lambda row: len(row.offsets), axis=1)


Now we can use the GeoDataFrame's built-in plot() method to plot these:

In [25]:
ax = bins.plot(column="fold")


We can use a similar trick to compute the minimum offset, but with an added test for there being valid data in the bin:

In [26]:
bins['min_offset'] = bins.apply(lambda row: min(row.offsets) if row.fold > 0 else None, axis=1)

In [27]:
ax = bins.plot(column="min_offset")

In [ ]: