The for statement

Suppose we are examining wind observations from a site being considered for wind energy production. The wind data consist of a time series of tuples, each containing the time of the observation and the wind gust speed in meters per second. Wind turbines can handle winds up to 40-72 meters per second before measures have to be taken to limit their rotation. The goal here is to find the times at which the observed gusts exceeded 35 meters per second.

Consider the following wind gust data. Values of -1 denote missing observations. UTC times are represented as strings.

In [1]:
data = [('2016-12-27T19:02:00Z', 19.03), ('2016-12-27T19:22:00Z', -1),
        ('2016-12-27T19:42:00Z', -1), ('2016-12-27T20:02:00Z', -1),
        ('2016-12-27T20:22:00Z', -1), ('2016-12-27T21:02:00Z', 19.03),
        ('2016-12-27T21:22:00Z', -1), ('2016-12-27T22:02:00Z', 28.29),
        ('2016-12-27T22:22:00Z', -1), ('2016-12-27T23:02:00Z', 34.98),
        ('2016-12-27T23:22:00Z', 35.5), ('2016-12-28T00:01:00Z', -1),
        ('2016-12-28T00:21:00Z', 33.44), ('2016-12-28T01:02:00Z', -1),
        ('2016-12-28T01:22:00Z', 36.01), ('2016-12-28T02:01:00Z', 37.55),
        ('2016-12-28T02:22:00Z', 44.76), ('2016-12-28T03:02:00Z', 38.58),
        ('2016-12-28T03:22:00Z', 36.53), ('2016-12-28T04:02:00Z', 26.75),
        ('2016-12-28T04:22:00Z', 23.15), ('2016-12-28T05:02:00Z', 24.18),
        ('2016-12-28T05:22:00Z', 22.12), ('2016-12-28T06:02:00Z', 27.78),
        ('2016-12-28T06:22:00Z', 27.27), ('2016-12-28T07:02:00Z', 28.29)]

We will find observations greater than 35 meters per second and append them to a gusts list. There are a few ways to achieve this objective with Python. We could use a while statement or the range function to step through the list indices and retrieve the value at each index. This style will be familiar to programmers coming from Fortran or the C family of languages, with their ubiquitous index-based for loops (e.g., for(int i = 0; i < n; i++)). But often, keeping track of the list index is completely unnecessary: we are simply proceeding through a list, grabbing values of interest. In Python, when processing a list, reach first for the for statement:

for iterating_var in sequence:
    statement(s)  # usually involving the iterating_var variable
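
For comparison, the index-based style mentioned above might look like the following sketch. It uses a short excerpt of the wind gust data so that it runs on its own; the names sample, gusts_by_index, and gusts_while are illustrative and do not appear elsewhere in this notebook.

```python
# A short excerpt of the wind gust data, repeated here so the example is self-contained.
sample = [('2016-12-27T23:02:00Z', 34.98), ('2016-12-27T23:22:00Z', 35.5),
          ('2016-12-28T00:01:00Z', -1), ('2016-12-28T01:22:00Z', 36.01)]

# Index-based version with range: we look up each tuple by its index i.
gusts_by_index = []
for i in range(len(sample)):
    if sample[i][1] > 35:
        gusts_by_index.append(sample[i])

# Equivalent while loop: the counter is initialized and incremented by hand.
gusts_while = []
i = 0
while i < len(sample):
    if sample[i][1] > 35:
        gusts_while.append(sample[i])
    i += 1
```

Both loops collect the same tuples, but each has to manage an index that the problem itself never needs.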

Let's make use of the Python for statement for our wind gusts:

In [2]:
gusts = []
for time, obs in data:
    # Missing observations (-1) never exceed the threshold, so they are skipped.
    if obs > 35:
        gusts.append((time, obs))
gusts
[('2016-12-27T23:22:00Z', 35.5),
 ('2016-12-28T01:22:00Z', 36.01),
 ('2016-12-28T02:01:00Z', 37.55),
 ('2016-12-28T02:22:00Z', 44.76),
 ('2016-12-28T03:02:00Z', 38.58),
 ('2016-12-28T03:22:00Z', 36.53)]

Take a closer look at the for time, obs in data statement. We could have written a statement such as for observation in data to retrieve each observation tuple in the series and subsequently pull apart the time and obs using tuple operations. Python, however, allows us to unpack the tuple right in the for statement, yielding a concise way to obtain the time and the observation. After that, we test the wind gust speed and, if it exceeds the threshold, append the (time, obs) tuple to the gusts list.
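
To make the comparison concrete, here is a minimal sketch of the longer form, in which each tuple is retrieved whole and unpacked in a separate step. The sample list is a short excerpt of the data above, and the names sample and gusts_unpacked are illustrative only.

```python
# A short excerpt of the wind gust data, repeated here so the example is self-contained.
sample = [('2016-12-27T23:02:00Z', 34.98), ('2016-12-27T23:22:00Z', 35.5),
          ('2016-12-28T00:01:00Z', -1), ('2016-12-28T01:22:00Z', 36.01)]

gusts_unpacked = []
for observation in sample:
    # Pull the tuple apart explicitly; indexing with observation[0] and
    # observation[1] would work as well.
    time, obs = observation
    if obs > 35:
        gusts_unpacked.append(observation)
```

The result is the same as unpacking in the for statement itself; the in-line form simply saves a line and a variable name.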

Another, even more concise approach is a list comprehension, which we cover in the Functions notebook and in an example in the Basic Input and Output notebook. Indeed, Guido van Rossum, the creator of Python, would probably have preferred:

In [3]:
[(time, obs) for time, obs in data if obs > 35]
[('2016-12-27T23:22:00Z', 35.5),
 ('2016-12-28T01:22:00Z', 36.01),
 ('2016-12-28T02:01:00Z', 37.55),
 ('2016-12-28T02:22:00Z', 44.76),
 ('2016-12-28T03:02:00Z', 38.58),
 ('2016-12-28T03:22:00Z', 36.53)]
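
Since the stated goal is to find the times of the strong gusts, a comprehension can also keep only the time from each tuple. The following is a small sketch on an excerpt of the data; the names sample, THRESHOLD, and gust_times are assumptions introduced for illustration.

```python
# A short excerpt of the wind gust data, repeated here so the example is self-contained.
sample = [('2016-12-27T23:02:00Z', 34.98), ('2016-12-27T23:22:00Z', 35.5),
          ('2016-12-28T00:01:00Z', -1), ('2016-12-28T01:22:00Z', 36.01)]

THRESHOLD = 35  # meters per second

# Unpack each tuple, test the gust speed, and keep only the time string.
gust_times = [time for time, obs in sample if obs > THRESHOLD]
```

Naming the threshold once makes it easy to rerun the filter with a different cutoff.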

In sum, you will use loops extensively, and you will benefit from using Python for loops to process sequences.