pySeabird is a package to parse and load CTD data files. It should be an easy task, but the format has changed over time. Working with data from multiple ships or cruises requires first understanding each file and normalizing it into a common format before the analysis can start. That can still be done with a few general regular expression rules, but I would rather use strict rules. If I'm loading hundreds or thousands of profiles, I want to be sure that no mistake slips through. I would rather skip a doubtful file and issue a warning than believe it was loaded correctly and let it become part of my analysis.
With that in mind, I wrote this package with the ability to load multiple rules, so new rules can be added without changing the main engine.
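To make the idea of strict, pluggable rules concrete, here is a minimal sketch. It is not pySeabird's implementation: the rule names, the two header line formats, and the `parse_latitude` helper are all hypothetical and chosen only for illustration. Each rule is an anchored regular expression, a line is accepted only if one rule matches it completely, and anything else raises an error instead of being guessed at.

```python
import re

# Hypothetical rules, for illustration only. pySeabird keeps its real rules
# in external files such as cnv.yaml, and their structure is different.
RULES = {
    "rule_a": re.compile(
        r"^\* NMEA Latitude = (?P<deg>\d+) (?P<min>[\d.]+) (?P<hemi>[NS])$"
    ),
    "rule_b": re.compile(
        r"^\*\* Latitude: (?P<deg>\d+) (?P<min>[\d.]+) (?P<hemi>[NS])$"
    ),
}


def parse_latitude(line, rules=RULES):
    """Return (rule_name, latitude in decimal degrees) or raise ValueError."""
    for name, pattern in rules.items():
        match = pattern.match(line)
        if match:
            value = int(match.group("deg")) + float(match.group("min")) / 60.0
            if match.group("hemi") == "S":
                value = -value
            return name, value
    # Strict behaviour: refuse the line rather than guess its meaning.
    raise ValueError("no rule matches line: %r" % line)
```

Supporting a new file variant then means adding one more entry to `RULES`; the function that applies the rules stays untouched.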
For more information, check the documentation.
%matplotlib inline
from seabird.cnv import fCNV
from gsw import z_from_p
Let's first download an example file with some CTD data.
!wget https://raw.githubusercontent.com/castelao/seabird/master/tests/test_data/dPIRX003.cnv.OK
--2015-02-20 18:49:22-- https://raw.githubusercontent.com/castelao/seabird/master/tests/test_data/dPIRX003.cnv.OK
Resolving raw.githubusercontent.com... 199.27.79.133
Connecting to raw.githubusercontent.com|199.27.79.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 47291 (46K) [text/plain]
Saving to: 'dPIRX003.cnv.OK'
dPIRX003.cnv.OK 100%[=====================>] 46.18K --.-KB/s in 0.08s
2015-02-20 18:49:22 (560 KB/s) - 'dPIRX003.cnv.OK' saved [47291/47291]
profile = fCNV('dPIRX003.cnv.OK')
Using rules from: cnv.yaml
The profile dPIRX003.cnv.OK was loaded with the default rule cnv.yaml
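When many files are loaded at once, the "skip and warn" policy described earlier can be sketched as below. The `load_profiles` helper is hypothetical and not part of pySeabird. It takes the loader as an argument and assumes the loader raises an exception for a file it cannot parse, which should be checked against the loader actually used.

```python
import warnings


def load_profiles(filenames, loader):
    """Load each file with `loader`; warn about and skip any that fail."""
    loaded, skipped = {}, []
    for filename in filenames:
        try:
            loaded[filename] = loader(filename)
        except Exception as err:  # broad on purpose: any doubt means skip
            warnings.warn("Skipping %s: %s" % (filename, err))
            skipped.append(filename)
    return loaded, skipped
```

With the package imported as above, a call could look like `profiles, skipped = load_profiles(['dPIRX003.cnv.OK'], fCNV)`. Keeping the list of skipped files makes it easy to inspect them later instead of silently losing them.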
print("Header: %s" % list(profile.attributes.keys()))
print("Data: %s" % list(profile.keys()))
Header: ['file_type', 'seasave', 'start_time', 'nquan', 'longitude', 'datetime', 'bad_flag', 'nvalues', 'latitude', 'filename', 'md5']
Data: ['timeS', 'pressure', 'temperature', 'temperature2', 'conductivity', 'conductivity2', 'potemperature', 'potemperature2', 'salinity', 'salinity2', 'flag']
We have latitude in the header and pressure in the data, so we can convert pressure to height z (negative below the sea surface) with z_from_p.
z = z_from_p(profile['pressure'], profile.attributes['latitude'])
from matplotlib import pyplot as plt
plt.plot(profile['temperature'], z,'b')
plt.plot(profile['temperature2'], z,'g')
plt.xlabel('temperature')
plt.ylabel('depth')
plt.title(profile.attributes['filename'])
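As a rough cross-check on the pressure-to-height conversion used above, the hydrostatic balance gives a quick estimate. This is only a sketch: it assumes a constant seawater density and a constant gravity, both chosen here for illustration, whereas z_from_p accounts for latitude and the compressibility of seawater. The `approx_height` helper is hypothetical.

```python
RHO = 1025.0         # assumed constant seawater density, kg/m^3
G = 9.81             # assumed constant gravity, m/s^2
PA_PER_DBAR = 1.0e4  # 1 dbar = 10^4 Pa


def approx_height(pressure_dbar):
    """Rough height in metres (negative downward) from sea pressure in dbar."""
    # Hydrostatic balance: dp = rho * g * dz, integrated with constant rho, g.
    return -pressure_dbar * PA_PER_DBAR / (RHO * G)
```

Under these assumptions one dbar corresponds to slightly less than one metre of water, so heights from z_from_p that differ wildly from the negated pressure values suggest a problem with units or with the input data.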