We'd like to caluclate a 90th percentile climatology of a dataset, with the added complexity that rather than just grouping all the Jan 1st of each year together we want to add a window of 15 days either side, so that the Jan 1st result is the 90th percentile of all the Dec 16th to Jan 16th of all years.
# Imports
import xarray
# Start a dask client
import climtas.nci
climtas.nci.Client()
Client-982468aa-0f71-11ec-89ad-fa163eaaaec0
Connection method: Cluster object | Cluster type: distributed.LocalCluster |
Dashboard: /node/ood-vn8/17103/proxy/8787/status |
3fc6f855
Dashboard: /node/ood-vn8/17103/proxy/8787/status | Workers: 4 |
Total threads: 4 | Total memory: 11.23 GiB |
Status: running | Using processes: True |
Scheduler-1357819f-28e0-4388-8de3-a2cf2fa7def9
Comm: tcp://127.0.0.1:44551 | Workers: 4 |
Dashboard: /node/ood-vn8/17103/proxy/8787/status | Total threads: 4 |
Started: Just now | Total memory: 11.23 GiB |
Comm: tcp://10.0.128.136:40323 | Total threads: 1 |
Dashboard: /node/ood-vn8/17103/proxy/33999/status | Memory: 2.81 GiB |
Nanny: tcp://127.0.0.1:45285 | |
Local directory: /local/w35/saw562/tmp/dask-worker-space/worker-gguw326_ |
Comm: tcp://10.0.128.136:35241 | Total threads: 1 |
Dashboard: /node/ood-vn8/17103/proxy/45751/status | Memory: 2.81 GiB |
Nanny: tcp://127.0.0.1:43429 | |
Local directory: /local/w35/saw562/tmp/dask-worker-space/worker-19ps26cb |
Comm: tcp://10.0.128.136:36907 | Total threads: 1 |
Dashboard: /node/ood-vn8/17103/proxy/46587/status | Memory: 2.81 GiB |
Nanny: tcp://127.0.0.1:45357 | |
Local directory: /local/w35/saw562/tmp/dask-worker-space/worker-gftvk908 |
Comm: tcp://10.0.128.136:41941 | Total threads: 1 |
Dashboard: /node/ood-vn8/17103/proxy/45603/status | Memory: 2.81 GiB |
Nanny: tcp://127.0.0.1:46317 | |
Local directory: /local/w35/saw562/tmp/dask-worker-space/worker-c0pfsb5n |
For the demo I'll use ACCESS CMIP6 surface temperature. I've chosen a chunking in time as that's usually how the data is laid out in the file - you have the whole grid for Jan 1st, then the whole grid for Jan 2nd and so on. It's faster to read data if it's close together in the file, so we don't want to hop all over getting all the times for a single grid point.
The chunk size is aiming to get the chunk bytes in the array table to very roughly around 100 mb - 30 mb is fine for this (I chose 300 to start with because that's roughly a year)
ds = xarray.open_mfdataset('/g/data/fs38/publications/CMIP6/CMIP/CSIRO/ACCESS-ESM1-5/amip/r1i1p1f1/day/tas/gn/latest/*.nc', chunks={'time': 300})
ds.tas
<xarray.DataArray 'tas' (time: 13149, lat: 145, lon: 192)> dask.array<open_dataset-de3bca054c2ed43774883de8240854a6tas, shape=(13149, 145, 192), dtype=float32, chunksize=(300, 145, 192), chunktype=numpy.ndarray> Coordinates: * time (time) datetime64[ns] 1979-01-01T12:00:00 ... 2014-12-31T12:00:00 * lat (lat) float64 -90.0 -88.75 -87.5 -86.25 ... 86.25 87.5 88.75 90.0 * lon (lon) float64 0.0 1.875 3.75 5.625 7.5 ... 352.5 354.4 356.2 358.1 height float64 ... Attributes: standard_name: air_temperature long_name: Near-Surface Air Temperature comment: near-surface (usually, 2 meter) air temperature units: K cell_methods: area: time: mean cell_measures: area: areacella history: 2019-11-14T05:08:45Z altered by CMOR: Treated scalar dime...
|
array(['1979-01-01T12:00:00.000000000', '1979-01-02T12:00:00.000000000', '1979-01-03T12:00:00.000000000', ..., '2014-12-29T12:00:00.000000000', '2014-12-30T12:00:00.000000000', '2014-12-31T12:00:00.000000000'], dtype='datetime64[ns]')
array([-90. , -88.75, -87.5 , -86.25, -85. , -83.75, -82.5 , -81.25, -80. , -78.75, -77.5 , -76.25, -75. , -73.75, -72.5 , -71.25, -70. , -68.75, -67.5 , -66.25, -65. , -63.75, -62.5 , -61.25, -60. , -58.75, -57.5 , -56.25, -55. , -53.75, -52.5 , -51.25, -50. , -48.75, -47.5 , -46.25, -45. , -43.75, -42.5 , -41.25, -40. , -38.75, -37.5 , -36.25, -35. , -33.75, -32.5 , -31.25, -30. , -28.75, -27.5 , -26.25, -25. , -23.75, -22.5 , -21.25, -20. , -18.75, -17.5 , -16.25, -15. , -13.75, -12.5 , -11.25, -10. , -8.75, -7.5 , -6.25, -5. , -3.75, -2.5 , -1.25, 0. , 1.25, 2.5 , 3.75, 5. , 6.25, 7.5 , 8.75, 10. , 11.25, 12.5 , 13.75, 15. , 16.25, 17.5 , 18.75, 20. , 21.25, 22.5 , 23.75, 25. , 26.25, 27.5 , 28.75, 30. , 31.25, 32.5 , 33.75, 35. , 36.25, 37.5 , 38.75, 40. , 41.25, 42.5 , 43.75, 45. , 46.25, 47.5 , 48.75, 50. , 51.25, 52.5 , 53.75, 55. , 56.25, 57.5 , 58.75, 60. , 61.25, 62.5 , 63.75, 65. , 66.25, 67.5 , 68.75, 70. , 71.25, 72.5 , 73.75, 75. , 76.25, 77.5 , 78.75, 80. , 81.25, 82.5 , 83.75, 85. , 86.25, 87.5 , 88.75, 90. ])
array([ 0. , 1.875, 3.75 , 5.625, 7.5 , 9.375, 11.25 , 13.125, 15. , 16.875, 18.75 , 20.625, 22.5 , 24.375, 26.25 , 28.125, 30. , 31.875, 33.75 , 35.625, 37.5 , 39.375, 41.25 , 43.125, 45. , 46.875, 48.75 , 50.625, 52.5 , 54.375, 56.25 , 58.125, 60. , 61.875, 63.75 , 65.625, 67.5 , 69.375, 71.25 , 73.125, 75. , 76.875, 78.75 , 80.625, 82.5 , 84.375, 86.25 , 88.125, 90. , 91.875, 93.75 , 95.625, 97.5 , 99.375, 101.25 , 103.125, 105. , 106.875, 108.75 , 110.625, 112.5 , 114.375, 116.25 , 118.125, 120. , 121.875, 123.75 , 125.625, 127.5 , 129.375, 131.25 , 133.125, 135. , 136.875, 138.75 , 140.625, 142.5 , 144.375, 146.25 , 148.125, 150. , 151.875, 153.75 , 155.625, 157.5 , 159.375, 161.25 , 163.125, 165. , 166.875, 168.75 , 170.625, 172.5 , 174.375, 176.25 , 178.125, 180. , 181.875, 183.75 , 185.625, 187.5 , 189.375, 191.25 , 193.125, 195. , 196.875, 198.75 , 200.625, 202.5 , 204.375, 206.25 , 208.125, 210. , 211.875, 213.75 , 215.625, 217.5 , 219.375, 221.25 , 223.125, 225. , 226.875, 228.75 , 230.625, 232.5 , 234.375, 236.25 , 238.125, 240. , 241.875, 243.75 , 245.625, 247.5 , 249.375, 251.25 , 253.125, 255. , 256.875, 258.75 , 260.625, 262.5 , 264.375, 266.25 , 268.125, 270. , 271.875, 273.75 , 275.625, 277.5 , 279.375, 281.25 , 283.125, 285. , 286.875, 288.75 , 290.625, 292.5 , 294.375, 296.25 , 298.125, 300. , 301.875, 303.75 , 305.625, 307.5 , 309.375, 311.25 , 313.125, 315. , 316.875, 318.75 , 320.625, 322.5 , 324.375, 326.25 , 328.125, 330. , 331.875, 333.75 , 335.625, 337.5 , 339.375, 341.25 , 343.125, 345. , 346.875, 348.75 , 350.625, 352.5 , 354.375, 356.25 , 358.125])
array(2.)
We'll start out with the windowing, gathering together 15 days either side of each day. We can do this with '.rolling()'. We give it the total size of the window, here 31 for the day plus 15 days either side.
ds.tas.rolling(time=31, center=True)
DataArrayRolling [time->31(center)]
if we liked we could get stats for each window just by calling '.mean()' etc. here. We're wanting to do something a bit more complex however with a climatology, so we'll use the construct()
function. This rearranges the array to add a new axis that contains all of the samples for the window
ds.tas.rolling(time=31, center=True).construct(time='window')
<xarray.DataArray 'tas' (time: 13149, lat: 145, lon: 192, window: 31)> dask.array<sliding_window_view, shape=(13149, 145, 192, 31), dtype=float32, chunksize=(315, 145, 192, 31), chunktype=numpy.ndarray> Coordinates: * time (time) datetime64[ns] 1979-01-01T12:00:00 ... 2014-12-31T12:00:00 * lat (lat) float64 -90.0 -88.75 -87.5 -86.25 ... 86.25 87.5 88.75 90.0 * lon (lon) float64 0.0 1.875 3.75 5.625 7.5 ... 352.5 354.4 356.2 358.1 height float64 ... Dimensions without coordinates: window Attributes: standard_name: air_temperature long_name: Near-Surface Air Temperature comment: near-surface (usually, 2 meter) air temperature units: K cell_methods: area: time: mean cell_measures: area: areacella history: 2019-11-14T05:08:45Z altered by CMOR: Treated scalar dime...
|
array(['1979-01-01T12:00:00.000000000', '1979-01-02T12:00:00.000000000', '1979-01-03T12:00:00.000000000', ..., '2014-12-29T12:00:00.000000000', '2014-12-30T12:00:00.000000000', '2014-12-31T12:00:00.000000000'], dtype='datetime64[ns]')
array([-90. , -88.75, -87.5 , -86.25, -85. , -83.75, -82.5 , -81.25, -80. , -78.75, -77.5 , -76.25, -75. , -73.75, -72.5 , -71.25, -70. , -68.75, -67.5 , -66.25, -65. , -63.75, -62.5 , -61.25, -60. , -58.75, -57.5 , -56.25, -55. , -53.75, -52.5 , -51.25, -50. , -48.75, -47.5 , -46.25, -45. , -43.75, -42.5 , -41.25, -40. , -38.75, -37.5 , -36.25, -35. , -33.75, -32.5 , -31.25, -30. , -28.75, -27.5 , -26.25, -25. , -23.75, -22.5 , -21.25, -20. , -18.75, -17.5 , -16.25, -15. , -13.75, -12.5 , -11.25, -10. , -8.75, -7.5 , -6.25, -5. , -3.75, -2.5 , -1.25, 0. , 1.25, 2.5 , 3.75, 5. , 6.25, 7.5 , 8.75, 10. , 11.25, 12.5 , 13.75, 15. , 16.25, 17.5 , 18.75, 20. , 21.25, 22.5 , 23.75, 25. , 26.25, 27.5 , 28.75, 30. , 31.25, 32.5 , 33.75, 35. , 36.25, 37.5 , 38.75, 40. , 41.25, 42.5 , 43.75, 45. , 46.25, 47.5 , 48.75, 50. , 51.25, 52.5 , 53.75, 55. , 56.25, 57.5 , 58.75, 60. , 61.25, 62.5 , 63.75, 65. , 66.25, 67.5 , 68.75, 70. , 71.25, 72.5 , 73.75, 75. , 76.25, 77.5 , 78.75, 80. , 81.25, 82.5 , 83.75, 85. , 86.25, 87.5 , 88.75, 90. ])
array([ 0. , 1.875, 3.75 , 5.625, 7.5 , 9.375, 11.25 , 13.125, 15. , 16.875, 18.75 , 20.625, 22.5 , 24.375, 26.25 , 28.125, 30. , 31.875, 33.75 , 35.625, 37.5 , 39.375, 41.25 , 43.125, 45. , 46.875, 48.75 , 50.625, 52.5 , 54.375, 56.25 , 58.125, 60. , 61.875, 63.75 , 65.625, 67.5 , 69.375, 71.25 , 73.125, 75. , 76.875, 78.75 , 80.625, 82.5 , 84.375, 86.25 , 88.125, 90. , 91.875, 93.75 , 95.625, 97.5 , 99.375, 101.25 , 103.125, 105. , 106.875, 108.75 , 110.625, 112.5 , 114.375, 116.25 , 118.125, 120. , 121.875, 123.75 , 125.625, 127.5 , 129.375, 131.25 , 133.125, 135. , 136.875, 138.75 , 140.625, 142.5 , 144.375, 146.25 , 148.125, 150. , 151.875, 153.75 , 155.625, 157.5 , 159.375, 161.25 , 163.125, 165. , 166.875, 168.75 , 170.625, 172.5 , 174.375, 176.25 , 178.125, 180. , 181.875, 183.75 , 185.625, 187.5 , 189.375, 191.25 , 193.125, 195. , 196.875, 198.75 , 200.625, 202.5 , 204.375, 206.25 , 208.125, 210. , 211.875, 213.75 , 215.625, 217.5 , 219.375, 221.25 , 223.125, 225. , 226.875, 228.75 , 230.625, 232.5 , 234.375, 236.25 , 238.125, 240. , 241.875, 243.75 , 245.625, 247.5 , 249.375, 251.25 , 253.125, 255. , 256.875, 258.75 , 260.625, 262.5 , 264.375, 266.25 , 268.125, 270. , 271.875, 273.75 , 275.625, 277.5 , 279.375, 281.25 , 283.125, 285. , 286.875, 288.75 , 290.625, 292.5 , 294.375, 296.25 , 298.125, 300. , 301.875, 303.75 , 305.625, 307.5 , 309.375, 311.25 , 313.125, 315. , 316.875, 318.75 , 320.625, 322.5 , 324.375, 326.25 , 328.125, 330. , 331.875, 333.75 , 335.625, 337.5 , 339.375, 341.25 , 343.125, 345. , 346.875, 348.75 , 350.625, 352.5 , 354.375, 356.25 , 358.125])
array(2.)
See how at the top there is now a 'window' dimension of length 31? That's all of the samples in the window for that (time, lat, lon) location. Calling ds.tas.rolling(time=31, center=True).mean()
is equivalent to ds.tas.rolling(time=31, center=True).construct(time='window').mean('window')
Let's save this as a variable
tas_window = ds.tas.rolling(time=31, center=True).construct(time='window')
To calculate the climatology in xarray we use .groupby('time.dayofyear')
. This groups the days by number of days since the start of the year, so for leap years Feb 29th will be grouped together with Mar 1st of non leap-years.
tas_window.groupby('time.dayofyear')
DataArrayGroupBy, grouped over 'dayofyear' 366 groups with labels 1, 2, 3, 4, 5, ..., 363, 364, 365, 366.
If we look at a single day of year here, you can see that we've got two axes we need to gather over. 'time' is now the 36 years in the dataset, and 'window' is the nearby days.
for doy, data in tas_window.groupby('time.dayofyear'):
display(data)
break
<xarray.DataArray 'tas' (time: 36, lat: 145, lon: 192, window: 31)> dask.array<getitem, shape=(36, 145, 192, 31), dtype=float32, chunksize=(1, 145, 192, 31), chunktype=numpy.ndarray> Coordinates: * time (time) datetime64[ns] 1979-01-01T12:00:00 ... 2014-01-01T12:00:00 * lat (lat) float64 -90.0 -88.75 -87.5 -86.25 ... 86.25 87.5 88.75 90.0 * lon (lon) float64 0.0 1.875 3.75 5.625 7.5 ... 352.5 354.4 356.2 358.1 height float64 ... Dimensions without coordinates: window Attributes: standard_name: air_temperature long_name: Near-Surface Air Temperature comment: near-surface (usually, 2 meter) air temperature units: K cell_methods: area: time: mean cell_measures: area: areacella history: 2019-11-14T05:08:45Z altered by CMOR: Treated scalar dime...
|
array(['1979-01-01T12:00:00.000000000', '1980-01-01T12:00:00.000000000', '1981-01-01T12:00:00.000000000', '1982-01-01T12:00:00.000000000', '1983-01-01T12:00:00.000000000', '1984-01-01T12:00:00.000000000', '1985-01-01T12:00:00.000000000', '1986-01-01T12:00:00.000000000', '1987-01-01T12:00:00.000000000', '1988-01-01T12:00:00.000000000', '1989-01-01T12:00:00.000000000', '1990-01-01T12:00:00.000000000', '1991-01-01T12:00:00.000000000', '1992-01-01T12:00:00.000000000', '1993-01-01T12:00:00.000000000', '1994-01-01T12:00:00.000000000', '1995-01-01T12:00:00.000000000', '1996-01-01T12:00:00.000000000', '1997-01-01T12:00:00.000000000', '1998-01-01T12:00:00.000000000', '1999-01-01T12:00:00.000000000', '2000-01-01T12:00:00.000000000', '2001-01-01T12:00:00.000000000', '2002-01-01T12:00:00.000000000', '2003-01-01T12:00:00.000000000', '2004-01-01T12:00:00.000000000', '2005-01-01T12:00:00.000000000', '2006-01-01T12:00:00.000000000', '2007-01-01T12:00:00.000000000', '2008-01-01T12:00:00.000000000', '2009-01-01T12:00:00.000000000', '2010-01-01T12:00:00.000000000', '2011-01-01T12:00:00.000000000', '2012-01-01T12:00:00.000000000', '2013-01-01T12:00:00.000000000', '2014-01-01T12:00:00.000000000'], dtype='datetime64[ns]')
array([-90. , -88.75, -87.5 , -86.25, -85. , -83.75, -82.5 , -81.25, -80. , -78.75, -77.5 , -76.25, -75. , -73.75, -72.5 , -71.25, -70. , -68.75, -67.5 , -66.25, -65. , -63.75, -62.5 , -61.25, -60. , -58.75, -57.5 , -56.25, -55. , -53.75, -52.5 , -51.25, -50. , -48.75, -47.5 , -46.25, -45. , -43.75, -42.5 , -41.25, -40. , -38.75, -37.5 , -36.25, -35. , -33.75, -32.5 , -31.25, -30. , -28.75, -27.5 , -26.25, -25. , -23.75, -22.5 , -21.25, -20. , -18.75, -17.5 , -16.25, -15. , -13.75, -12.5 , -11.25, -10. , -8.75, -7.5 , -6.25, -5. , -3.75, -2.5 , -1.25, 0. , 1.25, 2.5 , 3.75, 5. , 6.25, 7.5 , 8.75, 10. , 11.25, 12.5 , 13.75, 15. , 16.25, 17.5 , 18.75, 20. , 21.25, 22.5 , 23.75, 25. , 26.25, 27.5 , 28.75, 30. , 31.25, 32.5 , 33.75, 35. , 36.25, 37.5 , 38.75, 40. , 41.25, 42.5 , 43.75, 45. , 46.25, 47.5 , 48.75, 50. , 51.25, 52.5 , 53.75, 55. , 56.25, 57.5 , 58.75, 60. , 61.25, 62.5 , 63.75, 65. , 66.25, 67.5 , 68.75, 70. , 71.25, 72.5 , 73.75, 75. , 76.25, 77.5 , 78.75, 80. , 81.25, 82.5 , 83.75, 85. , 86.25, 87.5 , 88.75, 90. ])
array([ 0. , 1.875, 3.75 , 5.625, 7.5 , 9.375, 11.25 , 13.125, 15. , 16.875, 18.75 , 20.625, 22.5 , 24.375, 26.25 , 28.125, 30. , 31.875, 33.75 , 35.625, 37.5 , 39.375, 41.25 , 43.125, 45. , 46.875, 48.75 , 50.625, 52.5 , 54.375, 56.25 , 58.125, 60. , 61.875, 63.75 , 65.625, 67.5 , 69.375, 71.25 , 73.125, 75. , 76.875, 78.75 , 80.625, 82.5 , 84.375, 86.25 , 88.125, 90. , 91.875, 93.75 , 95.625, 97.5 , 99.375, 101.25 , 103.125, 105. , 106.875, 108.75 , 110.625, 112.5 , 114.375, 116.25 , 118.125, 120. , 121.875, 123.75 , 125.625, 127.5 , 129.375, 131.25 , 133.125, 135. , 136.875, 138.75 , 140.625, 142.5 , 144.375, 146.25 , 148.125, 150. , 151.875, 153.75 , 155.625, 157.5 , 159.375, 161.25 , 163.125, 165. , 166.875, 168.75 , 170.625, 172.5 , 174.375, 176.25 , 178.125, 180. , 181.875, 183.75 , 185.625, 187.5 , 189.375, 191.25 , 193.125, 195. , 196.875, 198.75 , 200.625, 202.5 , 204.375, 206.25 , 208.125, 210. , 211.875, 213.75 , 215.625, 217.5 , 219.375, 221.25 , 223.125, 225. , 226.875, 228.75 , 230.625, 232.5 , 234.375, 236.25 , 238.125, 240. , 241.875, 243.75 , 245.625, 247.5 , 249.375, 251.25 , 253.125, 255. , 256.875, 258.75 , 260.625, 262.5 , 264.375, 266.25 , 268.125, 270. , 271.875, 273.75 , 275.625, 277.5 , 279.375, 281.25 , 283.125, 285. , 286.875, 288.75 , 290.625, 292.5 , 294.375, 296.25 , 298.125, 300. , 301.875, 303.75 , 305.625, 307.5 , 309.375, 311.25 , 313.125, 315. , 316.875, 318.75 , 320.625, 322.5 , 324.375, 326.25 , 328.125, 330. , 331.875, 333.75 , 335.625, 337.5 , 339.375, 341.25 , 343.125, 345. , 346.875, 348.75 , 350.625, 352.5 , 354.375, 356.25 , 358.125])
array(2.)
If we just wanted the mean we could do
tas_window.groupby('time.dayofyear').mean(['time', 'window'])
<xarray.DataArray 'tas' (dayofyear: 366, lat: 145, lon: 192)> dask.array<stack, shape=(366, 145, 192), dtype=float32, chunksize=(1, 145, 192), chunktype=numpy.ndarray> Coordinates: * lat (lat) float64 -90.0 -88.75 -87.5 -86.25 ... 86.25 87.5 88.75 90.0 * lon (lon) float64 0.0 1.875 3.75 5.625 ... 352.5 354.4 356.2 358.1 height float64 ... * dayofyear (dayofyear) int64 1 2 3 4 5 6 7 8 ... 360 361 362 363 364 365 366
|
array([-90. , -88.75, -87.5 , -86.25, -85. , -83.75, -82.5 , -81.25, -80. , -78.75, -77.5 , -76.25, -75. , -73.75, -72.5 , -71.25, -70. , -68.75, -67.5 , -66.25, -65. , -63.75, -62.5 , -61.25, -60. , -58.75, -57.5 , -56.25, -55. , -53.75, -52.5 , -51.25, -50. , -48.75, -47.5 , -46.25, -45. , -43.75, -42.5 , -41.25, -40. , -38.75, -37.5 , -36.25, -35. , -33.75, -32.5 , -31.25, -30. , -28.75, -27.5 , -26.25, -25. , -23.75, -22.5 , -21.25, -20. , -18.75, -17.5 , -16.25, -15. , -13.75, -12.5 , -11.25, -10. , -8.75, -7.5 , -6.25, -5. , -3.75, -2.5 , -1.25, 0. , 1.25, 2.5 , 3.75, 5. , 6.25, 7.5 , 8.75, 10. , 11.25, 12.5 , 13.75, 15. , 16.25, 17.5 , 18.75, 20. , 21.25, 22.5 , 23.75, 25. , 26.25, 27.5 , 28.75, 30. , 31.25, 32.5 , 33.75, 35. , 36.25, 37.5 , 38.75, 40. , 41.25, 42.5 , 43.75, 45. , 46.25, 47.5 , 48.75, 50. , 51.25, 52.5 , 53.75, 55. , 56.25, 57.5 , 58.75, 60. , 61.25, 62.5 , 63.75, 65. , 66.25, 67.5 , 68.75, 70. , 71.25, 72.5 , 73.75, 75. , 76.25, 77.5 , 78.75, 80. , 81.25, 82.5 , 83.75, 85. , 86.25, 87.5 , 88.75, 90. ])
array([ 0. , 1.875, 3.75 , 5.625, 7.5 , 9.375, 11.25 , 13.125, 15. , 16.875, 18.75 , 20.625, 22.5 , 24.375, 26.25 , 28.125, 30. , 31.875, 33.75 , 35.625, 37.5 , 39.375, 41.25 , 43.125, 45. , 46.875, 48.75 , 50.625, 52.5 , 54.375, 56.25 , 58.125, 60. , 61.875, 63.75 , 65.625, 67.5 , 69.375, 71.25 , 73.125, 75. , 76.875, 78.75 , 80.625, 82.5 , 84.375, 86.25 , 88.125, 90. , 91.875, 93.75 , 95.625, 97.5 , 99.375, 101.25 , 103.125, 105. , 106.875, 108.75 , 110.625, 112.5 , 114.375, 116.25 , 118.125, 120. , 121.875, 123.75 , 125.625, 127.5 , 129.375, 131.25 , 133.125, 135. , 136.875, 138.75 , 140.625, 142.5 , 144.375, 146.25 , 148.125, 150. , 151.875, 153.75 , 155.625, 157.5 , 159.375, 161.25 , 163.125, 165. , 166.875, 168.75 , 170.625, 172.5 , 174.375, 176.25 , 178.125, 180. , 181.875, 183.75 , 185.625, 187.5 , 189.375, 191.25 , 193.125, 195. , 196.875, 198.75 , 200.625, 202.5 , 204.375, 206.25 , 208.125, 210. , 211.875, 213.75 , 215.625, 217.5 , 219.375, 221.25 , 223.125, 225. , 226.875, 228.75 , 230.625, 232.5 , 234.375, 236.25 , 238.125, 240. , 241.875, 243.75 , 245.625, 247.5 , 249.375, 251.25 , 253.125, 255. , 256.875, 258.75 , 260.625, 262.5 , 264.375, 266.25 , 268.125, 270. , 271.875, 273.75 , 275.625, 277.5 , 279.375, 281.25 , 283.125, 285. , 286.875, 288.75 , 290.625, 292.5 , 294.375, 296.25 , 298.125, 300. , 301.875, 303.75 , 305.625, 307.5 , 309.375, 311.25 , 313.125, 315. , 316.875, 318.75 , 320.625, 322.5 , 324.375, 326.25 , 328.125, 330. , 331.875, 333.75 , 335.625, 337.5 , 339.375, 341.25 , 343.125, 345. , 346.875, 348.75 , 350.625, 352.5 , 354.375, 356.25 , 358.125])
array(2.)
array([ 1, 2, 3, ..., 364, 365, 366])
If we try the same with quantile (xarray's version of percentile) however we get an error
try:
tas_window.groupby('time.dayofyear').quantile(0.9, ['time', 'window'])
except Exception as e:
print('ERROR:', e)
ERROR: dimension time on 0th function argument to apply_ufunc with dask='parallelized' consists of multiple chunks, but is also a core dimension. To fix, either rechunk into a single dask array chunk along this dimension, i.e., ``.chunk(dict(time=-1))``, or pass ``allow_rechunk=True`` in ``dask_gufunc_kwargs`` but beware that this may significantly increase memory usage.
Percentiles are hard to calculate on large distributed datasets - you need to have the entire dataset in memory, sort the data, then find the value at whatever percent along the sorted data. We need to help out Dask by manually loading the data we need for each calculation, so we'll go back to a loop and call '.load()' at each day of year, storing the result in a list.
%%time
result = []
for doy, data in tas_window.groupby('time.dayofyear'):
percentile = data.load().quantile(0.9, ['time', 'window'])
result.append(percentile)
CPU times: user 52min 1s, sys: 5min 31s, total: 57min 32s Wall time: 1h 18min 1s
Once that's finished the values in the list can be combined with xarray.concat
percentile_climatology = xarray.concat(result, dim='dayofyear')
percentile_climatology
<xarray.DataArray 'tas' (dayofyear: 366, lat: 145, lon: 192)> array([[[250.67286682, 250.67286682, 250.67286682, ..., 250.67286682, 250.67286682, 250.67286682], [251.57595825, 251.54759216, 251.49716187, ..., 251.66184998, 251.64952087, 251.61529541], [252.19511414, 252.02455139, 251.96104431, ..., 252.57232666, 252.41696167, 252.32958984], ..., [255.79222107, 255.88395691, 256.19177246, ..., 255.33990479, 255.59410095, 255.73428345], [254.41465759, 254.30558777, 254.33512878, ..., 254.36961365, 254.33735657, 254.32498169], [252.45547485, 252.45547485, 252.45547485, ..., 252.45547485, 252.45547485, 252.45547485]], [[250.6728241 , 250.6728241 , 250.6728241 , ..., 250.6728241 , 250.6728241 , 250.6728241 ], [251.59952393, 251.54962463, 251.53194733, ..., 251.68025208, 251.64960327, 251.61642151], [252.20888824, 252.06634064, 251.96010895, ..., 252.58323059, 252.44898682, 252.3343277 ], ... [255.82539368, 255.98866272, 256.29229736, ..., 255.67448425, 255.68865967, 255.80763245], [254.47499084, 254.69308472, 254.74490356, ..., 254.39610291, 254.50894165, 254.38670349], [252.46173096, 252.46173096, 252.46173096, ..., 252.46173096, 252.46173096, 252.46173096]], [[251.06691589, 251.06691589, 251.06691589, ..., 251.06691589, 251.06691589, 251.06691589], [251.23405151, 251.20076904, 251.19150391, ..., 251.38855896, 251.34030457, 251.29734802], [251.88744507, 251.86525269, 251.8250824 , ..., 252.6361908 , 252.57929688, 252.17391968], ..., [254.95679626, 255.5537323 , 255.38959045, ..., 254.92901306, 254.91663208, 254.88747559], [254.23662109, 254.21036072, 254.17580261, ..., 253.96891174, 254.10254822, 254.2250061 ], [252.87672119, 252.87672119, 252.87672119, ..., 252.87672119, 252.87672119, 252.87672119]]]) Coordinates: * lat (lat) float64 -90.0 -88.75 -87.5 -86.25 ... 86.25 87.5 88.75 90.0 * lon (lon) float64 0.0 1.875 3.75 5.625 7.5 ... 352.5 354.4 356.2 358.1 quantile float64 0.9 Dimensions without coordinates: dayofyear
array([[[250.67286682, 250.67286682, 250.67286682, ..., 250.67286682, 250.67286682, 250.67286682], [251.57595825, 251.54759216, 251.49716187, ..., 251.66184998, 251.64952087, 251.61529541], [252.19511414, 252.02455139, 251.96104431, ..., 252.57232666, 252.41696167, 252.32958984], ..., [255.79222107, 255.88395691, 256.19177246, ..., 255.33990479, 255.59410095, 255.73428345], [254.41465759, 254.30558777, 254.33512878, ..., 254.36961365, 254.33735657, 254.32498169], [252.45547485, 252.45547485, 252.45547485, ..., 252.45547485, 252.45547485, 252.45547485]], [[250.6728241 , 250.6728241 , 250.6728241 , ..., 250.6728241 , 250.6728241 , 250.6728241 ], [251.59952393, 251.54962463, 251.53194733, ..., 251.68025208, 251.64960327, 251.61642151], [252.20888824, 252.06634064, 251.96010895, ..., 252.58323059, 252.44898682, 252.3343277 ], ... [255.82539368, 255.98866272, 256.29229736, ..., 255.67448425, 255.68865967, 255.80763245], [254.47499084, 254.69308472, 254.74490356, ..., 254.39610291, 254.50894165, 254.38670349], [252.46173096, 252.46173096, 252.46173096, ..., 252.46173096, 252.46173096, 252.46173096]], [[251.06691589, 251.06691589, 251.06691589, ..., 251.06691589, 251.06691589, 251.06691589], [251.23405151, 251.20076904, 251.19150391, ..., 251.38855896, 251.34030457, 251.29734802], [251.88744507, 251.86525269, 251.8250824 , ..., 252.6361908 , 252.57929688, 252.17391968], ..., [254.95679626, 255.5537323 , 255.38959045, ..., 254.92901306, 254.91663208, 254.88747559], [254.23662109, 254.21036072, 254.17580261, ..., 253.96891174, 254.10254822, 254.2250061 ], [252.87672119, 252.87672119, 252.87672119, ..., 252.87672119, 252.87672119, 252.87672119]]])
array([-90. , -88.75, -87.5 , -86.25, -85. , -83.75, -82.5 , -81.25, -80. , -78.75, -77.5 , -76.25, -75. , -73.75, -72.5 , -71.25, -70. , -68.75, -67.5 , -66.25, -65. , -63.75, -62.5 , -61.25, -60. , -58.75, -57.5 , -56.25, -55. , -53.75, -52.5 , -51.25, -50. , -48.75, -47.5 , -46.25, -45. , -43.75, -42.5 , -41.25, -40. , -38.75, -37.5 , -36.25, -35. , -33.75, -32.5 , -31.25, -30. , -28.75, -27.5 , -26.25, -25. , -23.75, -22.5 , -21.25, -20. , -18.75, -17.5 , -16.25, -15. , -13.75, -12.5 , -11.25, -10. , -8.75, -7.5 , -6.25, -5. , -3.75, -2.5 , -1.25, 0. , 1.25, 2.5 , 3.75, 5. , 6.25, 7.5 , 8.75, 10. , 11.25, 12.5 , 13.75, 15. , 16.25, 17.5 , 18.75, 20. , 21.25, 22.5 , 23.75, 25. , 26.25, 27.5 , 28.75, 30. , 31.25, 32.5 , 33.75, 35. , 36.25, 37.5 , 38.75, 40. , 41.25, 42.5 , 43.75, 45. , 46.25, 47.5 , 48.75, 50. , 51.25, 52.5 , 53.75, 55. , 56.25, 57.5 , 58.75, 60. , 61.25, 62.5 , 63.75, 65. , 66.25, 67.5 , 68.75, 70. , 71.25, 72.5 , 73.75, 75. , 76.25, 77.5 , 78.75, 80. , 81.25, 82.5 , 83.75, 85. , 86.25, 87.5 , 88.75, 90. ])
array([ 0. , 1.875, 3.75 , 5.625, 7.5 , 9.375, 11.25 , 13.125, 15. , 16.875, 18.75 , 20.625, 22.5 , 24.375, 26.25 , 28.125, 30. , 31.875, 33.75 , 35.625, 37.5 , 39.375, 41.25 , 43.125, 45. , 46.875, 48.75 , 50.625, 52.5 , 54.375, 56.25 , 58.125, 60. , 61.875, 63.75 , 65.625, 67.5 , 69.375, 71.25 , 73.125, 75. , 76.875, 78.75 , 80.625, 82.5 , 84.375, 86.25 , 88.125, 90. , 91.875, 93.75 , 95.625, 97.5 , 99.375, 101.25 , 103.125, 105. , 106.875, 108.75 , 110.625, 112.5 , 114.375, 116.25 , 118.125, 120. , 121.875, 123.75 , 125.625, 127.5 , 129.375, 131.25 , 133.125, 135. , 136.875, 138.75 , 140.625, 142.5 , 144.375, 146.25 , 148.125, 150. , 151.875, 153.75 , 155.625, 157.5 , 159.375, 161.25 , 163.125, 165. , 166.875, 168.75 , 170.625, 172.5 , 174.375, 176.25 , 178.125, 180. , 181.875, 183.75 , 185.625, 187.5 , 189.375, 191.25 , 193.125, 195. , 196.875, 198.75 , 200.625, 202.5 , 204.375, 206.25 , 208.125, 210. , 211.875, 213.75 , 215.625, 217.5 , 219.375, 221.25 , 223.125, 225. , 226.875, 228.75 , 230.625, 232.5 , 234.375, 236.25 , 238.125, 240. , 241.875, 243.75 , 245.625, 247.5 , 249.375, 251.25 , 253.125, 255. , 256.875, 258.75 , 260.625, 262.5 , 264.375, 266.25 , 268.125, 270. , 271.875, 273.75 , 275.625, 277.5 , 279.375, 281.25 , 283.125, 285. , 286.875, 288.75 , 290.625, 292.5 , 294.375, 296.25 , 298.125, 300. , 301.875, 303.75 , 305.625, 307.5 , 309.375, 311.25 , 313.125, 315. , 316.875, 318.75 , 320.625, 322.5 , 324.375, 326.25 , 328.125, 330. , 331.875, 333.75 , 335.625, 337.5 , 339.375, 341.25 , 343.125, 345. , 346.875, 348.75 , 350.625, 352.5 , 354.375, 356.25 , 358.125])
array(0.9)