#!/usr/bin/env python
# coding: utf-8
#
#
#
#
#
#
2016 - IPGP - ObsPy Tutorial
#
#
Introduction to File Formats and read/write in ObsPy
#
#
# This is oftentimes not taught, but fairly important to understand, at least at a basic level. This also teaches you how to work with these in ObsPy.
# In[ ]:
#%matplotlib inline
from __future__ import print_function
import matplotlib.pylab as plt
plt.switch_backend("nbagg")
plt.style.use('ggplot')
plt.rcParams['figure.figsize'] = 12, 8
# ## SEED Identifiers
#
# According to the [SEED standard](www.fdsn.org/seed_manual/SEEDManual_V2.4.pdf), which is fairly well adopted, the following nomenclature is used to identify seismic receivers:
#
# * **Network code**: Identifies the network/owner of the data. Assigned by the FDSN and thus unique.
# * **Station code**: The station within a network. *NOT UNIQUE IN PRACTICE!* Always use together with a network code!
# * **Location ID**: Identifies different data streams within one station. Commonly used to logically separate multiple instruments at a single station.
# * **Channel codes**: Three character code: 1) Band and approximate sampling rate, 2) The type of instrument, 3) The orientation
#
# This results in full ids of the form **NET.STA.LOC.CHAN**, e.g. **IV.PRMA..HHE**.
#
#
# ---
#
#
# In seismology we generally distinguish between three separate types of data:
#
# 1. **Waveform Data** - The actual waveforms as time series.
# 2. **Station Data** - Information about the stations' operators, geographical locations, and the instrument's responses.
# 3. **Event Data** - Information about earthquakes.
#
# Some formats have elements of two or more of these.
# ## Waveform Data
#
# ![stream](images/Stream_Trace.svg)
#
# There are a myriad of waveform data formats but in Europe and the USA two formats dominate: **MiniSEED** and **SAC**
#
#
# ### MiniSEED
#
# * This is what you get from datacenters and also what they store, thus the original data
# * Can store integers and single/double precision floats
# * Integer data (e.g. counts from a digitizer) are heavily compressed: a factor of 3-5 depending on the data
# * Can deal with gaps and overlaps
# * Multiple components per file
# * Contains only the really necessary parameters and some information for the data providers
# In[ ]:
import obspy
# ObsPy automatically detects the file format.
st = obspy.read("data/example.mseed")
print(st)
# Fileformat specific information is stored here.
print(st[0].stats.mseed)
# In[ ]:
st.plot()
# In[ ]:
# This is a quick interlude to teach you basics about how to work
# with Stream/Trace objects.
# Most operations work in-place, e.g. they modify the existing
# objects. We'll create a copy here.
st2 = st.copy()
# To use only part of a Stream, use the select() function.
print(st2.select(component="Z"))
# Stream objects behave like a list of Trace objects.
tr = st2[0]
tr.plot()
# Some basic processing. Please note that these modify the
# existing object.
tr.detrend("linear")
tr.taper(type="hann", max_percentage=0.05)
tr.filter("lowpass", freq=0.5)
tr.plot()
# In[ ]:
# You can write it again by simply specifing the format.
st.write("temp.mseed", format="mseed")
# ### SAC
#
# * Custom format of the `sac` code.
# * Simple header and single precision floating point data.
# * Only a single component per file and no concept of gaps/overlaps.
# * Used a lot due to `sac` being very popular and the additional basic information that can be stored in the header.
# In[ ]:
st = obspy.read("data/example.sac")
print(st)
st[0].stats.sac.__dict__
# In[ ]:
st.plot()
# In[ ]:
# You can once again write it with the write() method.
st.write("temp.sac", format="sac")
# ## Station Data
#
# ![inv](images/Inventory.svg)
#
# Station data contains information about the organziation that collections the data, geographical information, as well as the instrument response. It mainly comes in three formats:
#
# * `(dataless) SEED`: Very complete but pretty complex and binary. Still used a lot, e.g. for the Arclink protocol
# * `RESP`: A strict subset of SEED. ASCII based. Contains **ONLY** the response.
# * `StationXML`: Essentially like SEED but cleaner and based on XML. Most modern format and what the datacenters nowadays serve. **Use this if you can.**
#
#
# ObsPy can work with all of them but today we will focus on StationXML.
# They are XML files:
# In[ ]:
get_ipython().system('head data/all_stations.xml')
# In[ ]:
import obspy
# Use the read_inventory function to open them.
inv = obspy.read_inventory("data/all_stations.xml")
print(inv)
# You can see that they can contain an arbirary number of networks, stations, and channels.
# In[ ]:
# ObsPy is also able to plot a map of them.
inv.plot(projection="local");
# In[ ]:
# As well as a plot the instrument response.
inv.select(network="IV", station="SALO", channel="BH?").plot_response(0.001);
# In[ ]:
# Coordinates of single channels can also be extraced. This function
# also takes a datetime arguments to extract information at different
# points in time.
inv.get_coordinates("IV.SALO..BHZ")
# In[ ]:
# And it can naturally be written again, also in modified state.
inv.select(channel="BHZ").write("temp.xml", format="stationxml")
# ## Event Data
#
# ![events](./images/Event.svg)
#
# Event data is essentially served in either very simple formats like NDK or the CMTSOLUTION format used by many waveform solvers:
# In[ ]:
get_ipython().system('cat data/GCMT_2014_04_01__Mw_8_1')
# Datacenters on the hand offer **QuakeML** files, which are surprisingly complex in structure but can store complex relations.
# In[ ]:
# Read QuakeML files with the read_events() function.
cat = obspy.read_events("data/GCMT_2014_04_01__Mw_8_1.xml")
print(cat)
# In[ ]:
print(cat[0])
# In[ ]:
cat.plot(projection="ortho");
# In[ ]:
# Once again they can be written with the write() function.
cat.write("temp_quake.xml", format="quakeml")
# To show off some more things, I added a file containing all events from 2014 in the GCMT catalog.
# In[ ]:
import obspy
cat = obspy.read_events("data/2014.ndk")
print(cat)
# In[ ]:
cat.plot();
# In[ ]:
cat.filter("depth > 100000", "magnitude > 7")
# ## Acknowledgements
#
# Background picture at the very top is from Matthias Meschede.