Have 15 minutes? Work through some of these little lessons. Finish them up? Write some of your own. You don't have to share them with anyone (although you could): Creating your own problems to solve is easily one of the most effective ways to gain confidence in approaching data science through computation.
Some of this will be new, some of it familiar. Work on the parts that are most interesting to you, and trust that the others will become interesting in time.
OK: Let's review a bit as we move into more new stuff. Let's start by counting from 0 to 9.
# Here's one way:
a = 0
print(a)
a = 1
print(a)
a = 2
print(a)
a = 3
print(a)
0 1 2 3
# Here's a second way:
a = 0
print(a)
a = a + 1
print(a)
a = a + 1
print(a)
a = a + 1
print(a)
0 1 2 3
# Not much better. Here's a third way:
for n in range(0,10):
    print(n)
0 1 2 3 4 5 6 7 8 9
The n in the statement for n in range(0,10): works as our counter: It is going to increase ('increment') by 1 every time this line is repeated. (It turns out that this basic act -- adding 1 to a variable -- is one of the most important features of most programming languages.)
In our statement, the range() function defines a range of numbers. In this case, a range with a lower limit of 0 and an upper limit of 10.
So one other thing we ought to make clear then: Look back at the output from the for loop: Do you notice that we didn't count from one to ten? Instead, we counted from 0 to 9. That seems... weird, right?
Right: It started at 0, like you'd expect, but then it ended at 9. Why is that? Didn't we say "10" in our range? Well, yes. Sigh. But that's the way things work: range(0,10) means "from 0 and up to but not including 10."
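If you'd like to check the "up to but not including" rule for yourself, one option is to collect the range into a list and poke at it. This is only a small sketch; the variable name numbers is mine, not part of the lesson.

```python
# list() collects every value the range produces so we can inspect them.
numbers = list(range(0, 10))

print(numbers)         # the values run from 0 through 9
print(len(numbers))    # ten values in total, even though 10 itself is missing
print(10 in numbers)   # False: the upper limit is never reached
```

Ten numbers come out either way; they are just shifted down by one from the "1 to 10" you might have expected.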
# Self-evidently, perhaps:
for n in range(2,8):
    print(n)
# Again: We use 8 as the upper-most bound,
# but we never actually get to that number.
2 3 4 5 6 7
Fine. But hold on a moment, cowpoke. The upper- and lower-bound part makes sense. But there's always room for another argument: Let's suppose we add another number to that range of yours? What do you suppose this will do?
for n in range(1,10,3):
    print(n)
It turns out that the range() function allows UP TO 3 arguments: range(from, up to, step)
From is the starting number, up to tells us the upper bound (which we'll never reach), and step describes the number by which we will increment along the way. By default, step is assumed to be 1. That's why we often don't include it in our code: We are lazy, and the computer fills it in for us. In fact, the same thing goes for the first argument, from: When we leave it out, the computer assumes from is going to be a zero.
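Here is a short sketch of the three-argument form in action. The names by_threes and with_ten are made up for this illustration; the only thing that changes between them is the "up to" value.

```python
# Start at 1, stop before 10, step by 3.
by_threes = list(range(1, 10, 3))
print(by_threes)   # [1, 4, 7]

# Nudging "up to" from 10 to 11 is what finally lets 10 appear.
with_ten = list(range(1, 11, 3))
print(with_ten)    # [1, 4, 7, 10]
```

The step decides which numbers we land on; the upper bound only decides when we stop.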
REMEMBER: You begin to count with the number "1", but computers almost always start counting with the number "0". Why? Because they are sneaky and they want to finish counting before we do.
Let's take a look at some examples of how these default values work:
This loop, for instance:
for n in range(0,5,1):
    print(n)
Is exactly the same as this:
for n in range(0,5):
    print(n)
Is exactly the same as this:
for n in range(5):
    print(n)
for n in range(5):
    print(n)
0 1 2 3 4
Fine. From, Up to, and Step are always there, even when we don't write them out. But I can include them and use less conventional values to do cool things.
Let's say I've built some kind of super-villain compound and I am going to launch the missiles (... or I'm counting down the final seconds of the end of the year... or I'm holding my breath in an underwater-breath-holding-Olympic-contest, whatever, I don't know, use your imagination):
My point: If a step of 1 went up, shouldn't a step of -1 go... down?
# Gateau sec: We set the lower bound as the upper,
# set the upper bound as the lower,
# and set the increment to -1 so that it decrements:
for n in range(5,0,-1):
    print(n)
5 4 3 2 1
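Notice that the "never reach the bound" rule still applies when counting down: 0 was the stopping point, so it never printed. The sketch below pokes at that a little; the names countdown, to_zero, and mismatched are mine, chosen just for this example.

```python
# Counting down from 5: the stopping point (0) is still left out.
countdown = list(range(5, 0, -1))
print(countdown)    # [5, 4, 3, 2, 1]

# To actually reach 0, stop one step past it.
to_zero = list(range(5, -1, -1))
print(to_zero)      # [5, 4, 3, 2, 1, 0]

# A negative step with low-to-high bounds gives an empty range, not an error.
mismatched = list(range(0, 5, -1))
print(mismatched)   # []
```

That last case is worth remembering: if a countdown loop silently does nothing, check that the bounds and the sign of the step agree.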
# And remember: Variables are every bit as meaningful as digits.
# Below, I've used variables in place of digits. Variables are
# generally more flexible: They provide relative values, where
# digits only provide absolute ones.
low = 2
high = 11
step = 4
for n in range(low,high,step):
    print(n)
# See how it stops before it hits the high bound, no matter how close it was?
2 6 10
# What's more, bear this in mind: The n that we're getting out of
# that counter function will resolve to a numeric value:
# We can plug it right back into some other expression.
# In this case, I've done just that, and I've built
# something called a "nested loop".
# Think of nested-loops as being like Russian nesting dolls
# -- one doll inside another doll. Here, one loop
# runs inside another loop.
# cleaning staff schedule:
# example 'nested loop' code
groundfloor = 1
penthouse = 4
room_low = 10
room_high = 15
for floor in range(groundfloor, penthouse):
    for suite in range(room_low, room_high):
        assignment = str(floor) + str(suite)
        print("Please make ready room:", assignment)
print("Please also clean Floor", penthouse)
Please make ready room: 110
Please make ready room: 111
Please make ready room: 112
Please make ready room: 113
Please make ready room: 114
Please make ready room: 210
Please make ready room: 211
Please make ready room: 212
Please make ready room: 213
Please make ready room: 214
Please make ready room: 310
Please make ready room: 311
Please make ready room: 312
Please make ready room: 313
Please make ready room: 314
Please also clean Floor 4
See how well that worked?
Well, it worked well, with one possible exception: If our Penthouse Suite is on level 4, then our cleaning bots will never get there: By assigning a range(1,4), we guaranteed that they never go up past 3. So I just added a final print statement to clean up our loose ends.
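Another way to tie up that loose end is to push the upper bound one past the top floor, so the loop itself reaches the penthouse. This is a sketch, not the lesson's approach, and it rests on an assumption the lesson never states: that floor 4 has rooms 10 through 14 just like the floors below it. The assignments list is also my addition, so the results can be inspected after the loop.

```python
groundfloor = 1
penthouse = 4
room_low = 10
room_high = 15

assignments = []
# penthouse + 1 makes the top floor part of the range.
# ASSUMPTION: floor 4 has the same room numbers as the other floors.
for floor in range(groundfloor, penthouse + 1):
    for suite in range(room_low, room_high):
        assignments.append(str(floor) + str(suite))

for assignment in assignments:
    print("Please make ready room:", assignment)
```

With four floors of five rooms each, this produces 20 assignments, running from 110 up to 414.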
It's worth pointing out that the never-reach-the-topmost-number thing seems weird -- but on the other side of the number line, there is more weirdness: We never count from 1, but always start at 0. Both of those standards cause a lot of confusion -- but in the end, they tend to balance one another out. So don't overthink this: It often takes care of itself.
# In any event, the building metaphor is still a good way to
# think about 'nested loops'. If you're going to go through
# every room in a building, you're probably not going to do
# it all willy-nilly and randomly: You'll probably do one
# floor at a time, and while you're on that floor, you
# visit each room in turn.
# And THAT is a nested loop:
# Go to a floor. Go through each room. Go to the next
# floor. Go through each room. Etc.
Note that as before, we can run things backwards this time, high to low, or even add steps to skip through some rooms or some floors. Let's start from the top this time, and just assign our cleaning-bots the even-numbered rooms.
# cleaning staff schedule 2:
# even rooms, top to bottom
basement = 0
penthouse = 3
room_low = 10
room_high = 16
# now add a step of -1
# Outermost Loop (Floor by floor loop)
for floor in range(penthouse, basement, -1):
    # Innermost Loop (Room by room loop)
    # add a step of positive 2 (for even numbers)
    for suite in range(room_low, room_high, 2):
        assignment = str(floor) + str(suite)
        print("Please prepare room:", assignment)
Please prepare room: 310
Please prepare room: 312
Please prepare room: 314
Please prepare room: 210
Please prepare room: 212
Please prepare room: 214
Please prepare room: 110
Please prepare room: 112
Please prepare room: 114
low = 2
high = 15
step = 3
# Outermost loop
for n in range(low,high,step):
    print('') #<-- print('nothing')
    # Inner loop
    for c in range(0, n):
        print('*', end='')
**
*****
********
***********
**************
OK, let me take a moment to explain how we used those print() statements, above. (Although you shouldn't hesitate to start by making small changes to the code above and seeing what happens for yourself: That's always a better way to learn than hearing me explain it.)
By default, the print() function prints your value, and then it adds a single additional character: a LineFeed -- in effect, like I've struck return or enter. (Truth be told, when we write it out, it looks like two characters (\n)... and I suppose when we say it out loud, it sounds like three syllables ("escape-N"). But it is actually just one character to the computer.)
So to make the lines above, I need to tell Python to knock it off with the LineFeeds. I do that (in a surprisingly clunky way) by adding a second argument, end, to my print statement. I say "Print this", and then I add "Oh, and by the way, you should end the line by printing nothing." But I could say "Oh, and could you add a "," and a space at the end, please?" Or whatever. The point being that I ask it to print something of my choosing instead of a LineFeed (\n).
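Because a LineFeed is invisible, it can help to capture what print() writes and look at the raw characters. The sketch below does that with two things the lesson hasn't introduced, so treat them as my assumptions about what is handy here: io.StringIO (an in-memory text buffer) and print()'s file= argument, which redirects the output into that buffer.

```python
import io

# An in-memory buffer that stands in for the screen.
buffer = io.StringIO()

print("a", file=buffer)             # default: ends with a LineFeed
print("b", end="", file=buffer)     # ends with nothing
print("c", end=", ", file=buffer)   # ends with a comma and a space
print("d", file=buffer)             # default again

captured = buffer.getvalue()
print(repr(captured))   # 'a\nbc, d\n'
print(len("\n"))        # 1: written with two keystrokes, stored as one character
```

repr() shows the string with its escape codes spelled out, so you can see exactly where a LineFeed was added and where it was replaced.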
Here's what happens when I keep end='' in the inner loop of the star-printing code above, but leave out the empty print(''):
low = 2
high = 15
step = 3
for n in range(low,high,step):
    for c in range(0, n):
        print('*', end='')
****************************************
Wait, you'll say. That's not what we was promised! 'Tis true, says I. Let's change the code around a bit and see if we can figure out why. (And I'm changing some of the numbers a bit in order to make it easier to follow).
low = 3
high = 8
step = 1
for n in range(low,high,step):
    for c in range(0, n):
        print(c, end='')
0120123012340123450123456
# Ah-ha! Let me break the line above apart:
# 012 <-- n is 3
# 0123 <-- n is 4
# 01234 <-- n is 5, etc.
# The problem is that after each cycle, it wasn't adding a return:
# That last print statement, the one inside the c-loop, told it never to do so. end=''
# is the same as "Don't Print a Return".
# BUT if we sneak a blank line INSIDE the repetition of the
# first loop but OUTSIDE the repetition of the second loop,
# then we could separate the lines again. To wit:
low = 3
high = 8
step = 1
for n in range(low,high,step):
    print('')   # <-- prints nothing, but will add a <return>
                # before each new group of digits prints.
                # Note also that it is part of the first cycle (for n)
                # but it is outside the range of the second cycle (for c)
    for c in range(0, n):
        print(c, end='')
012
0123
01234
012345
0123456
# We could also do something like this:
low = 3
high = 8
step = 1
for n in range(low,high,step):
    print(n, end=': ')
    for c in range(0, n):
        print(c, end='')
    print('')
3: 012
4: 0123
5: 01234
6: 012345
7: 0123456
In this case, there are actually three lines that put information on the screen. The first, print(n, end=': '), only gets fired (gets executed, gets run) immediately before the second loop; it only runs 5 times. I set that to print the top bound (n), followed by a colon and a space -- and because I set that as the end, Python won't add in its normal LineFeed code.
In the second loop, there is a print command that is called (fired, run, executed) much more frequently. But it does as little as possible: print(c, end='') only prints the value of c and moves on. It NEVER prints a LineFeed, even when it has finished.
The last print statement is trickier: Because of its placement, it is part of the n-loop, but not part of the c-loop: That is, it is part of the outermost loop, but not part of the innermost loop. If it were intended to be part of the c-loop, it would've been indented one more level to the right. It would've looked like this:
    for c in range(0, n):
        print(c, end='')
        print('')
Instead it looked like this:
    for c in range(0, n):
        print(c, end='')
    print('')
(REMEMBER: Python, unlike most languages, is very picky about indenting: Indentation is part of how it interprets code.) As it was written, the indentation tells Python that the last print statement is part of the n-loop -- that is, it is "inside" the n-loop -- so that it will get executed once per pass through the n-loop, AFTER the c-loop is done running.
And for all that -- what does it do? It seems to print nothing -- but remember that by default, it is adding a LineFeed. Thus, it pushes everything that follows one line down.
Here's another illustration of how the lines above would fire:
5: 6, 78, 78, 78, 9
6, 78, 78, 78, 78, 9
6, 78, 78, 78, 78, 78, 9
6, 78, 78, 78, 78, 78, 78, 9
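To see how much that one level of indentation matters, here is a sketch that runs both placements of print('') and captures what each one writes. The function names are hypothetical, and io.StringIO with print()'s file= argument are my additions (they aren't part of the lesson); they are only there so the output can be inspected as a string. I've also shortened the range to 3 through 5 to keep the result small.

```python
import io

def newline_in_n_loop():
    """print('') lines up with 'for c': it runs once per pass of the n-loop."""
    out = io.StringIO()
    for n in range(3, 6):
        for c in range(0, n):
            print(c, end='', file=out)
        print('', file=out)
    return out.getvalue()

def newline_in_c_loop():
    """print('') is indented one level deeper: it runs after every digit."""
    out = io.StringIO()
    for n in range(3, 6):
        for c in range(0, n):
            print(c, end='', file=out)
            print('', file=out)
    return out.getvalue()

print(repr(newline_in_n_loop()))   # '012\n0123\n01234\n'
print(repr(newline_in_c_loop()))   # every digit ends up on its own line
```

The two functions differ only in the indentation of a single line, yet the first prints three tidy rows and the second prints twelve one-digit lines.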