#!/usr/bin/env python
# coding: utf-8

# # Quality Controlling Saildrone T/S

# ## Objective:
#
# This notebook shows how to use CoTeDe to evaluate temperature and salinity measured along-track by a Saildrone.
#
# The nature of this dataset is similar to that of a thermosalinograph (TSG) on vessels of opportunity. As the vessel sails, it pumps water from near the surface, which is measured by a CTD. Thus, it is a time series at a nearly constant depth, and each measurement is associated with a time, latitude, and longitude.

# ## Data:
#
# For this tutorial, let's use the Saildrone Antarctic Circumnavigation mission (https://www.saildrone.com/antarctica). I don't want to bypass their data distribution, so I'll let you download it yourself. Please place it in the same directory (folder) as this notebook.
#
# Let's use the 24-hour resolution, just for demonstration purposes, since this is the public version. We would probably get better results by quality controlling the high-resolution measurements and only then, if convenient for our scientific questions, sub-sampling to a lower resolution.
#
# The data is available at https://data.saildrone.com/data/sets/antarctica-circumnavigation-2019/access

# Let's import xarray, which we'll use to load the data from the netCDF file. We could use netCDF4 or scipy, but it is probably more intuitive with xarray.
#
# Let's also import ProfileQC from CoTeDe. Yes, I know, Saildrone does not measure profiles, but don't worry about the name of this class; it works on the same principle. Maybe one day I'll create another class to deal with along-track measurements.

# In[1]:


import xarray as xr
from cotede.qc import ProfileQC


# ### First, learn about the data
#
# Load the data

# In[2]:


ds = xr.open_dataset('saildrone-antarctica.nc')


# Let's learn about this dataset, starting from the attributes.

# In[3]:


ds.attrs['Conventions']


# Great, it follows the CF and ACDD conventions, so we don't need to wander around; we know what to expect and where to find the information that we will need. For instance, does it conform to one of the CF Discrete Sampling Geometries? If so, which one?

# In[4]:


ds.attrs['featureType']


# OK, this is a trajectory, so we expect that each measurement will have a time and position.
#
# What are the available variables? We are interested in the temperature and the salinity of the seawater.

# In[5]:


list(ds.keys())


# It looks like we are interested in TEMP_CTD_MEAN and SAL_MEAN. Let's confirm that. We can learn a lot by inspecting the attributes.

# In[6]:


print(ds["SAL_MEAN"])
print("====")
print(ds["TEMP_CTD_MEAN"])


# Yes, the attributes of both variables include the standard_name and long_name. We found what we need. Let's simplify our dataset by extracting only what we need - temperature and salinity - and call it "tsg".

# In[7]:


tsg = ds[['TEMP_CTD_MEAN', 'SAL_MEAN']]
tsg


# Notice that there is a trajectory dimension. Since this is a single trajectory, CF does not require keeping this dimension, but it is good practice. If we later want to merge this dataset with another trajectory, let's say another Saildrone from another year, both would merge seamlessly into a dataset with two trajectories.
#
# To simplify, let's remove the trajectory dimension by choosing only the first (and only) trajectory.

# In[8]:


tsg = tsg.isel(trajectory=0)


# Now, if we look at the temperature, it will have only the dimension obs.

# In[9]:


tsg['TEMP_CTD_MEAN']


# In[10]:


tsg['SAL_MEAN'].attrs


# In[11]:


tsg['TEMP_CTD_MEAN'][:10]


# ## Actual QC
#
# So far, we have been learning about this dataset and subsetting it.
# If you were already familiar with this dataset, you could have skipped all that and started here.
#
# Now, let's QC this data, the easiest part (if using CoTeDe).
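
# Before running the tests, it may help to see what a gradient test measures.
# The cell below is a minimal pure-Python sketch, not CoTeDe's implementation.
# It assumes the usual GTSPP/Argo-style definition, where each measurement is
# compared with the mean of its two neighbours, and it assumes the flag values
# 1 (good), 4 (bad), and 0 (not evaluated). The function gradient_flags and the
# sample values are hypothetical and exist only for this illustration.

# In[ ]:


def gradient_flags(values, threshold):
    """Flag each value by how far it departs from the mean of its neighbours."""
    # 0 means "not evaluated": the end points have only one neighbour
    flags = [0] * len(values)
    for i in range(1, len(values) - 1):
        feature = abs(values[i] - (values[i - 1] + values[i + 1]) / 2.0)
        flags[i] = 4 if feature > threshold else 1
    return flags


# Hypothetical along-track temperatures [degrees C] with a single spike
sample_temperature = [1.2, 1.3, 9.9, 1.4, 1.5]
print(gradient_flags(sample_temperature, threshold=5))


# With a threshold of 5, only the spike itself is flagged as bad. Its neighbours
# depart from their local mean by less than 5, so they pass. A smaller threshold
# would be stricter.
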

# In[12]:


pqc = ProfileQC(tsg, {'sea_water_temperature': {'gradient': {'threshold': 5}}})


# Great! You just finished quality controlling the temperature of the whole Saildrone Antarctic mission. Using only the gradient test is probably not the best approach, but it is good enough for this example.
#
# What are the flags available?

# In[13]:


pqc.flags.keys()


# Yes, it seems right. We asked to inspect all variables of the type sea water temperature.
#
# What was the result, i.e. which flags were assigned?

# In[14]:


pqc.flags['TEMP_CTD_MEAN']['gradient']


# Let's improve this. Let's evaluate temperature and salinity at the same time, and add another test, the rate of change.

# In[15]:


cfg = {
    'sea_water_temperature': {
        'gradient': {'threshold': 5},
        'rate_of_change': {'threshold': 5}},
    'SAL_MEAN': {
        'rate_of_change': {'threshold': 2}}
}
pqc = ProfileQC(tsg, cfg)


# In[16]:


pqc.flags


# Nice, you can choose which tests to apply to each variable, and that includes which parameters to use in each test.
#
# You can also choose between defining a test for the type of measurement (sea_water_temperature) or for a specific variable (SAL_MEAN). That is convenient when you have a platform equipped with several sensors, like Saildrone.

# Finally, let's check what we got!

# In[17]:


import matplotlib.pyplot as plt

plt.figure(figsize=(14, 4))
# Keep only the measurements flagged as good (1) or probably good (2)
idx = pqc.flags['TEMP_CTD_MEAN']['overall'] <= 2
plt.plot(pqc['time'][idx], pqc['TEMP_CTD_MEAN'][idx], '.')
# Raw string, so that the backslash in the LaTeX degree symbol is not treated as an escape
plt.title(r'Temperature [$^\circ$C]')


# In[18]:


plt.figure(figsize=(14, 4))
# Same selection for salinity: good (1) or probably good (2)
idx = pqc.flags['SAL_MEAN']['overall'] <= 2
plt.plot(pqc['time'][idx], pqc['SAL_MEAN'][idx], '.')
plt.title('Salinity')


# Yes, I agree, we need to activate more checks if we want to do a better job here. Don't worry, there are plenty of built-in tests in CoTeDe.
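
# To help interpret the flags used above, the cell below is a minimal
# pure-Python sketch of a rate of change test and of an 'overall' flag. It is
# not CoTeDe's code. It assumes that the rate of change is the difference
# between consecutive measurements, that the overall flag is the highest flag
# given by any test, and that the flag values are 1 (good), 4 (bad), and
# 0 (not evaluated). The functions, the second test, and the sample values are
# hypothetical and exist only for this illustration.

# In[ ]:


def rate_of_change_flags(values, threshold):
    """Flag each value by the size of the jump from the previous measurement."""
    flags = [0]  # the first value has no predecessor, so it is not evaluated
    for previous, current in zip(values, values[1:]):
        flags.append(4 if abs(current - previous) > threshold else 1)
    return flags


def overall_flags(*flags_per_test):
    """Combine tests by keeping, for each measurement, the highest (worst) flag."""
    return [max(flags) for flags in zip(*flags_per_test)]


# Hypothetical along-track salinities with a single spike
sample_salinity = [33.9, 34.0, 37.5, 34.1, 34.0]
roc = rate_of_change_flags(sample_salinity, threshold=2)

# Hypothetical second test that approved every measurement
other_test = [1, 1, 1, 1, 1]
overall = overall_flags(roc, other_test)

# Same selection as in the plots above: keep flags 1 (good) and 2 (probably good)
kept = [value for value, flag in zip(sample_salinity, overall) if flag <= 2]
print(roc, overall, kept)


# Note that, in this sketch, the jump into the spike and the jump back out of
# it both exceed the threshold, so the measurement right after the spike is
# rejected too. That is one reason to combine several tests instead of relying
# on a single one.
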