#!/usr/bin/env python
# coding: utf-8

# # Times and Dates

# Time is an essential component of nearly all geoscience data. Timescales span many orders of magnitude: microseconds for lightning, hours for a supercell thunderstorm, days for a global weather model, and millennia and beyond for the earth's climate. To properly analyze geoscience data, you must have a firm understanding of how to handle time in Python. In this notebook, we will examine the Python Standard Library for handling dates and times. We will also briefly make use of the [pytz](https://pypi.python.org/pypi/pytz) module to handle some thorny time zone issues in Python.
#
# ## `Time` Versus `Datetime` Modules and Some Core Concepts
#
# Python comes with [time](https://docs.python.org/3/library/time.html) and [datetime](https://docs.python.org/3/library/datetime.html) modules. Unfortunately, Python can be initially disorienting because of the heavily overlapping terminology concerning dates and times:
#
# - `datetime` **module** has a `datetime` **class**
# - `datetime` **module** has a `time` **class**
# - `datetime` **module** has a `date` **class**
# - `time` **module** has a `time` function which returns (almost always) [Unix time](#What-is-Unix-Time?)
# - `datetime` **class** has a `date` method which returns a `date` object
# - `datetime` **class** has a `time` method which returns a `time` object
#
# This confusion can be partially alleviated by aliasing our imported modules:

# In[1]:

import datetime as dt

# we can now reference the datetime module (aliased to 'dt') and datetime
# object unambiguously
pisecond = dt.datetime(2016, 3, 14, 15, 9, 26)
print(pisecond)

# In[2]:

import time as tm

now = tm.time()
print(now)

# ### `time` module

# The `time` module is well-suited for measuring [Unix time](#What-is-Unix-Time?).
# For example, when you are calculating how long it takes a Python function to run (so-called "benchmarking"), you can employ the `time()` function from the `time` module to obtain Unix time before and after the function completes and take the difference of those two times.
#

# In[3]:

import time as tm

start = tm.time()
tm.sleep(1)  # The sleep function will stop the program for n seconds
end = tm.time()
diff = end - start
print("The benchmark took {} seconds".format(diff))

# (For more accurate benchmarking, see the [timeit](https://docs.python.org/3/library/timeit.html) module.)

# ### `datetime` module
#
# The `datetime` module handles time with the Gregorian calendar (the calendar we are all familiar with) and is independent of Unix time. The `datetime` module has an [object-oriented](#The-Thirty-Second-Introduction-to-Object-Oriented-Programming) approach with the `date`, `time`, `datetime`, `timedelta`, and `tzinfo` classes.
#
# - `date` class represents the day, month and year
# - `time` class represents the time of day
# - `datetime` class is a combination of the `date` and `time` classes
# - `timedelta` class represents a time duration
# - `tzinfo` (abstract) class represents time zones
#
# The `datetime` module is effective for:
#
# - performing date and time arithmetic and calculating time duration
# - reading and writing date and time strings in a particular format
# - handling time zones (with the help of third-party libraries)
#
# The `time` and `datetime` modules overlap in functionality, but in your geoscientific work, you will probably be using the `datetime` module more than the `time` module.
#
# ### What is Unix Time?
#
# Unix time is an example of system time, which is the computer's notion of passing time. It is measured in seconds from the start of the epoch, which is January 1, 1970 00:00 [UTC](#What-is-UTC?).
# In Python, it is represented "under the hood" as a [floating point number](https://en.wikipedia.org/wiki/Floating_point), which is how computers represent real (ℝ) numbers.
#
# ### The Thirty Second Introduction to Object-Oriented Programming
#
# We have been talking about object-oriented (OO) programming by mentioning terms like "class", "object", and "method", but never really explaining what they mean. A class is a collection of related variables (similar to a [struct](https://en.wikipedia.org/wiki/Struct_(C_programming_language)) in the C programming language or even a tuple in Python) coupled with functions, or "methods" in OO parlance, that can act on those variables. An object is a concrete example of a class.
#
# For example, if you have a `Coord` class that represents an earth location with latitude and longitude, you may have a method that returns the distance between two locations, `distancekm()` in this example.

# In[4]:

import math


class Coord:
    """Earth location identified by (latitude, longitude) coordinates.

    distancekm -- distance between two points in kilometers
    """

    def __init__(self, latitude=0.0, longitude=0.0):
        self.lat = latitude
        self.lon = longitude

    def distancekm(self, p):
        """Distance between two points in kilometers."""
        DEGREES_TO_RADIANS = math.pi / 180.0
        EARTH_RADIUS = 6373  # in KMs
        # Convert latitude to colatitude (angle from the north pole)
        phi1 = (90.0 - self.lat) * DEGREES_TO_RADIANS
        phi2 = (90.0 - p.lat) * DEGREES_TO_RADIANS
        theta1 = self.lon * DEGREES_TO_RADIANS
        theta2 = p.lon * DEGREES_TO_RADIANS
        # Spherical law of cosines
        cos = (math.sin(phi1) * math.sin(phi2) * math.cos(theta1 - theta2) +
               math.cos(phi1) * math.cos(phi2))
        arc = math.acos(cos)
        return arc * EARTH_RADIUS

# To create a concrete example of a **class**, also known as an **object**, initialize the object with data:

# In[5]:

timbuktu = Coord(16.77, -3.00)

# Here, `timbuktu` is an **object** of the **class** `Coord` initialized with a latitude of `16.77` and a longitude of `-3.00` (west longitudes are negative).

# Next, we create two `Coord` objects: `ny` and `paris`.
# We will invoke the `distancekm()` method on the `ny` object and pass the `paris` object as an argument to determine the distance between New York and Paris in kilometers.
#

# In[6]:

ny = Coord(40.71, -74.01)  # New York lies west of Greenwich: negative longitude
paris = Coord(48.86, 2.35)
distance = ny.distancekm(paris)
print("The distance from New York to Paris is {:.1f} kilometers.".format(
    distance))

# The old joke about OO programming is that they simply moved the struct that the function takes as an argument and put it first because it is special. So instead of having `distancekm(ny, paris)`, you have `ny.distancekm(paris)`. We have not talked about inheritance or polymorphism but that is OO in a nutshell.

# ## Reading and Writing Dates and Times
#
# ### Parsing Lightning Data Timestamps with the `datetime.strptime` Method
#
# Suppose you want to analyze [US NLDN lightning data](https://ghrc.nsstc.nasa.gov/uso/ds_docs/vaiconus/vaiconus_dataset.html). Here is a sample row of data:
#
#     06/27/07 16:18:21.898 18.739 -88.184 0.0 kA 0 1.0 0.4 2.5 8 1.2 13 G
#
# Part of the task involves parsing the `06/27/07 16:18:21.898` time string into a `datetime` object. (The data are [fully described here](https://ghrc.nsstc.nasa.gov/uso/ds_docs/vaiconus/vaiconus_dataset.html#a6).) In order to parse this string or others that follow the same format, you will employ the [datetime.strptime()](https://docs.python.org/3/library/datetime.html#datetime.datetime.strptime) method from the `datetime` module. This method takes two arguments: the first is the date time string you wish to parse, the second is the format which describes exactly how the date and time are arranged. [The full range of format options is described in the Python documentation](https://docs.python.org/3/library/datetime.html#strftime-and-strptime-behavior). In reality, the format will take some degree of experimentation to get right. This is a situation where Python shines as you can quickly try out different solutions in the IPython interpreter.
# Beyond the official documentation, Google and [Stack Overflow](https://stackoverflow.com/) are your friends in this process. Eventually, after some trial and error, you will find that the `'%m/%d/%y %H:%M:%S.%f'` format will properly parse the date and time.

# In[7]:

import datetime as dt

strike_time = dt.datetime.strptime('06/27/07 16:18:21.898',
                                   '%m/%d/%y %H:%M:%S.%f')
# print strike_time to see if we have properly parsed our time
print(strike_time)

# ### Retrieving METAR from the MesoWest API with Help from the `datetime.strftime` Method

# Let's say you are interested in obtaining [METAR](https://en.wikipedia.org/wiki/METAR) data from the Aleutian Islands with the [MesoWest API](http://mesowest.org/api). In order to retrieve these data, you will have to assemble a URL that abides by the [MesoWest API reference](http://synopticlabs.org/api/mesonet/reference/), and specifically create [date time strings that the API understands](http://synopticlabs.org/api/mesonet/reference/) (e.g., `201606010000` for the year, month, day, hour and minute). For example, typing the following URL in a web browser will return a human-readable nested data structure called a [JSON object](https://en.wikipedia.org/wiki/JSON) which will contain the data along with additional "metadata" to help you interpret the data (e.g., units). Here, we are asking for air temperature information from the METAR station at [Eareckson Air Force Base](https://en.wikipedia.org/wiki/Eareckson_Air_Station) (ICAO identifier "PASY") in the Aleutians from June 1, 2016, 00:00 UTC to June 1, 2016, 06:00 UTC.
#
# [http://api.mesowest.net/v2/stations/timeseries?stid=pasy&start=201606010000&end=201606010600&vars=air_temp&token=demotoken](http://api.mesowest.net/v2/stations/timeseries?stid=pasy&start=201606010000&end=201606010600&vars=air_temp&token=demotoken)
#
#     {
#       "SUMMARY": {
#         "FUNCTION_USED": "time_data_parser",
#         "NUMBER_OF_OBJECTS": 1,
#         "TOTAL_DATA_TIME": "5.50103187561 ms",
#         "DATA_PARSING_TIME": "0.313997268677 ms",
#         "METADATA_RESPONSE_TIME": "97.2690582275 ms",
#         "RESPONSE_MESSAGE": "OK",
#         "RESPONSE_CODE": 1,
#         "DATA_QUERY_TIME": "5.18608093262 ms"
#       },
#       "STATION": [
#         {
#           "ID": "12638",
#           "TIMEZONE": "America/Adak",
#           "LATITUDE": "52.71667",
#           "OBSERVATIONS": {
#             "air_temp_set_1": [
#               8.3,
#               8.0,
#               8.3,
#               8.0,
#               7.8,
#               7.8,
#               7.0,
#               7.2,
#               7.2
#             ],
#             "date_time": [
#               "2016-06-01T00:56:00Z",
#               "2016-06-01T01:26:00Z",
#               "2016-06-01T01:56:00Z",
#               "2016-06-01T02:40:00Z",
#               "2016-06-01T02:56:00Z",
#               "2016-06-01T03:56:00Z",
#               "2016-06-01T04:45:00Z",
#               "2016-06-01T04:56:00Z",
#               "2016-06-01T05:56:00Z"
#             ]
#           },
#           "STATE": "AK",
#           "LONGITUDE": "174.11667",
#           "SENSOR_VARIABLES": {
#             "air_temp": {
#               "air_temp_set_1": {
#                 "end": "",
#                 "start": ""
#               }
#             },
#             "date_time": {
#               "date_time": {}
#             }
#           },
#           "STID": "PASY",
#           "NAME": "Shemya, Eareckson AFB",
#           "ELEVATION": "98",
#           "PERIOD_OF_RECORD": {
#             "end": "",
#             "start": ""
#           },
#           "MNET_ID": "1",
#           "STATUS": "ACTIVE"
#         }
#       ],
#       "UNITS": {
#         "air_temp": "Celsius"
#       }
#     }
#     // GET http://api.mesowest.net/v2/stations/timeseries?stid=pasy&start=201606010000&end=201606010600&vars=air_temp&token=demotoken
#     // HTTP/1.1 200 OK
#     // Content-Type: application/json
#     // Date: Mon, 27 Jun 2016 18:17:08 GMT
#     // Server: nginx/1.4.6 (Ubuntu)
#     // Vary: Accept-Encoding
#     // Content-Length: 944
#     // Connection: keep-alive
#     // Request duration: 0.271790s

# Continuing with this example, let's create a function that takes a station identifier, start and end time, a meteorological field and returns the JSON object as a Python dictionary data
# structure. We will draw upon our knowledge from the [Basic Input and Output notebook](https://github.com/Unidata/online-python-training/blob/master/notebooks/Basic%20Input%20and%20Output.ipynb) to properly construct the URL. In addition, we will employ the [urllib.request](https://docs.python.org/3/library/urllib.request.html) module for opening and reading URLs.
#
# But first, we must figure out how to properly format our date with the [datetime.strftime()](https://docs.python.org/2/library/datetime.html#datetime.date.strftime) method. This method takes the same kind of format string that we employed for `strptime()`. After some trial and error from the IPython interpreter, we arrive at `'%Y%m%d%H%M'`:

# In[8]:

import datetime as dt

print(dt.datetime(2016, 6, 1, 0, 0).strftime('%Y%m%d%H%M'))

# Armed with this knowledge of how to format the date and time according to the MesoWest API reference, we can write our `metar()` function:

# In[9]:

import urllib.request
import json  # json module to help us with the HTTP response


def metar(icao, starttime, endtime, var):
    """
    Retrieves METAR with the icao identifier, the starttime and endtime
    datetime objects and the var atmospheric field (e.g., "air_temp").
    Returns a dictionary data structure that mirrors the JSON object
    returned from the MesoWest API.
    """
    fmt = '%Y%m%d%H%M'
    st = starttime.strftime(fmt)
    et = endtime.strftime(fmt)
    url = "http://api.mesowest.net/v2/stations/timeseries?"\
          "stid={}&start={}&end={}&vars={}&token=demotoken"
    reply = urllib.request.urlopen(url.format(icao, st, et, var))
    return json.loads(reply.read().decode('utf8'))

# We can now try out our new `metar` function to fetch some air temperature data.

# In[10]:

import datetime as dt

pasy = metar("pasy", dt.datetime(2016, 6, 1, 0, 0),
             dt.datetime(2016, 6, 1, 6, 0), "air_temp")
print(pasy)

# The data are returned in a nested data structure composed of dictionaries and lists. We can pull that data structure apart to fetch our data.
# Also, observe that the times are returned in UTC according to the [ISO 8601 international time standard](https://en.wikipedia.org/wiki/ISO_8601).

# In[11]:

print(pasy['STATION'][0]['OBSERVATIONS'])

# We could continue with this exercise by parsing the returned date time strings into `datetime` objects, but we will leave that exercise to the reader.

# ## Calculating Coastal Tides with the `timedelta` Class
#
# Let's suppose we are looking at coastal tide and current data, perhaps in a [tropical cyclone storm surge scenario](http://www.nhc.noaa.gov/surge/). The [lunar day](http://oceanservice.noaa.gov/education/kits/tides/media/supp_tide05.html) is 24 hours, 50 minutes with two low tides and two high tides in that time duration. If we know the time of the current high tide, we can easily calculate the occurrence of the next low and high tides with the `timedelta` class. (In reality, the *exact time* of tides is influenced by local coastal effects, in addition to the laws of celestial mechanics, but we will ignore that fact for this exercise.)
#
# A `timedelta` object is initialized by supplying a time duration, usually with [keyword arguments](https://docs.python.org/3/glossary.html#term-argument) to clearly express the length of time. Significantly, you can use the `timedelta` class with arithmetic operators (i.e., `+`, `-`, `*`, `/`) to obtain new dates and times as the next code sample illustrates. This convenient language feature is known as [operator overloading](https://en.wikipedia.org/wiki/Operator_overloading) and again illustrates Python's batteries-included philosophy of making life easier for the programmer. (In another language such as Java, you would have to call a method, significantly obfuscating the code.) Another great feature is that the difference of two times will yield a `timedelta` object. Let's examine all these features in the following code block.
# In[12]:

import datetime as dt

high_tide = dt.datetime(2016, 6, 1, 4, 38, 0)
lunar_day = dt.timedelta(hours=24, minutes=50)
tide_duration = lunar_day / 4  # time between a high tide and the next low tide
next_low_tide = high_tide + tide_duration
next_high_tide = high_tide + (2 * tide_duration)
tide_length = next_high_tide - high_tide
print("The time between high and low tide is {}.".format(tide_duration))
print("The current high tide is {}.".format(high_tide))
print("The next low tide is {}.".format(next_low_tide))
print("The next high tide is {}.".format(next_high_tide))
print("The tide length is {}.".format(tide_length))
print("The type of the 'tide_length' variable is {}.".format(type(
    tide_length)))

# In the last `print` statement, we use the [type()](https://docs.python.org/3/library/functions.html#type) built-in Python function simply to illustrate that the difference between two times yields a `timedelta` object.

# ## Dealing with Time Zones
#
# Time zones can be a source of confusion and frustration in geoscientific data and in computer programming in general. Core date and time libraries in various programming languages inevitably have design flaws (Python is no different), leading to third-party libraries that attempt to fix the core library limitations. To avoid these issues, it is best to handle data in UTC, or at the very least operate in a consistent time zone, but that is not always possible. Users will expect their tornado alerts in local time.
#
# ### What is UTC?
#
# [UTC](https://en.wikipedia.org/wiki/Coordinated_Universal_Time) is an abbreviation of Coordinated Universal Time and is, in practice, equivalent to Greenwich Mean Time (GMT). (Greenwich, at 0 degrees longitude, is a district of London, England.) In geoscientific data, times are often in UTC, though you should always verify this assumption is actually true!
#
# ### Time Zone Naive Versus Time Zone Aware `datetime` Objects
#
# When you create `datetime` objects in Python, they are so-called "naive" which means they are time zone unaware.
# In many situations, you can happily go forward without this detail getting in the way of your work. As the [Python documentation states](https://docs.python.org/3/library/datetime.html): "Naive objects are easy to understand and to work with, at the cost of ignoring some aspects of reality". However, if you wish to convey time zone information, you will have to make your `datetime` objects time zone aware. In order to handle named time zones in this notebook, we will use the third-party [pytz](https://pypi.python.org/pypi/pytz) module whose classes build upon, or "inherit" from in OO terminology, the `tzinfo` class. Unfortunately, the `datetime` module on its own does not provide these time zone definitions. Here, we create time zone naive and time zone aware `datetime` objects:

# In[13]:

import datetime as dt
import pytz

naive = dt.datetime.now()
aware = dt.datetime.now(pytz.timezone('US/Mountain'))
print("I am time zone naive {}.".format(naive))
print("I am time zone aware {}.".format(aware))

# The `pytz.timezone()` function takes a time zone string and returns a `tzinfo` object which can be used to initialize the time zone. The `-06:00` at the end of the time zone aware output denotes we are operating in a time zone six hours behind UTC (during daylight saving time; otherwise Mountain Time is `-07:00`).
#
# ### Print Time with a Different Time Zone
#
# If you have data that are in UTC, and wish to convert them to another time zone, Mountain Time Zone for example, you will again make use of the `pytz` module. First, we will create a UTC time with the [utcnow()](https://docs.python.org/3/library/datetime.html#datetime.datetime.utcnow) method, which inexplicably returns a time zone naive object, so you must still specify the UTC time zone with the [replace()](https://docs.python.org/3/library/datetime.html#datetime.datetime.replace) method. We then create a "US/Mountain" `tzinfo` object as before, but this time we will use the [astimezone()](https://docs.python.org/3/library/datetime.html#datetime.datetime.astimezone) method to adjust the time to the specified time zone.
# In[14]:

import datetime as dt
import pytz

utc = dt.datetime.utcnow().replace(tzinfo=pytz.utc)
# Note: '%-I' (hour without zero padding) is platform-specific and is not
# supported on Windows
print("The UTC time is {}.".format(utc.strftime('%B %d, %Y, %-I:%M%p')))
mountaintz = pytz.timezone("US/Mountain")
mountain = utc.astimezone(mountaintz)
print("The 'US/Mountain' time is {}.".format(mountain.strftime(
    '%B %d, %Y, %-I:%M%p')))

# We also draw upon our earlier knowledge of the `strftime()` method to format a human-friendly date and time string.
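
# As a closing sketch that ties several of the sections above together, here is one possible way to parse the ISO 8601 UTC strings from the MesoWest response into time zone aware `datetime` objects, shift them to another offset, and take their difference. The two sample strings are copied from the `date_time` list shown earlier. The helper name `parse_utc` and the fixed `-06:00` offset are assumptions made purely for illustration. To stay self-contained, the sketch uses the standard library's `dt.timezone` class, which only models fixed offsets and, unlike `pytz`, knows nothing about daylight saving time.

```python
import datetime as dt

# Sample ISO 8601 strings copied from the "date_time" list in the
# MesoWest response shown earlier in this notebook.
date_time = ["2016-06-01T00:56:00Z", "2016-06-01T01:26:00Z"]


def parse_utc(stamp):
    """Hypothetical helper: parse 'YYYY-MM-DDTHH:MM:SSZ' into an aware datetime."""
    naive = dt.datetime.strptime(stamp, '%Y-%m-%dT%H:%M:%SZ')
    # The trailing 'Z' means UTC, so attach the UTC time zone explicitly.
    return naive.replace(tzinfo=dt.timezone.utc)


obs_times = [parse_utc(s) for s in date_time]

# Assumed fixed offset of six hours behind UTC (no daylight saving rules).
fixed_offset = dt.timezone(dt.timedelta(hours=-6))
local_times = [t.astimezone(fixed_offset) for t in obs_times]

# The difference of two aware datetimes is a timedelta.
gap = obs_times[1] - obs_times[0]

for utc_time, local_time in zip(obs_times, local_times):
    print(utc_time.isoformat(), "->",
          local_time.strftime('%Y-%m-%d %H:%M %z'))
print("Time between observations: {}".format(gap))
```

# For real local times that must follow daylight saving rules, a named time zone from `pytz` (as in the cells above) would be the more appropriate choice than a fixed offset.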