This lecture is based on your readings for the week: Tracy Osborn's Really Friendly Command Line Intro; Software Carpentry's The Unix Shell; and the optional essay by Neal Stephenson, In the Beginning was the Command Line. For your reference:
Here's a short description of the different terms people use for the command line and its different versions. If you're confused by some of the verbiage or would like a bit more depth on the different types of command lines people use, check it out here. If you're a bit shaky on some of the terminology, it might be information overload, so don't feel like it's mandatory to read.
There are a few in-class short exercises that deal with what you'll be learning today, to be completed in small groups of two or three. There is not, however, a lab exercise notebook for you to complete over the course of the week, so you have a bit of a break from lab work this week. Instead, make a GitHub account, get the URL to your profile, and submit it to Canvas.
Look to the left of this notebook. If you haven't minimized it, it's going to be a list of files and folders. They're all things that you've created or uploaded to your personal folder on the supercomputer.
On your personal computer, you've got a similar thing going: files organized into folders. For me on Mac, it looks a little something like this:
If you use Windows, it'll probably look like this:
Operating systems come with programs like Finder and Windows Explorer to help us navigate, organize, and view all of our files, folders, and programs. But, this is just a simplification of the truth. In reality, your files are, quite literally, an array of ones and zeros on your hard drive. Calling them "files" and organizing them into "folders" is just a metaphor.
The legendary computer scientist Donald Knuth said, in an interview, the following:
The psychological profiling [of a computer scientist] is mostly the ability to shift levels of abstraction, from low level to high level. To see something in the small and to see something in the large.
Abstraction is the key word there; the idea that almost everything we think is simple actually has a lot more going on underneath the surface is of vital importance to computer science. Abstraction is basically the idea that sometimes it's beneficial to ignore a bunch of details when describing some part of how a computer or program works.
Take this little Python program as an example:
def add_two(num1, num2):
    # input() hands us strings, so convert both values to integers first
    num1 = int(num1)
    num2 = int(num2)
    total = num1 + num2
    return total

num1 = input("Enter one number: ")
num2 = input("Enter a second: ")
print("Your sum is", add_two(num1, num2))
How would you describe what your program does to someone who's never programmed?
Would you tell them about variables? What a function is? Converting from a string to an integer? Of course not.
What you're doing is abstraction: glossing over details to serve a functional purpose.
Abstraction is just one of the parts of computational thinking that we discussed during lecture. As a reminder, the four ideas are:
How do the other ideas of computational thinking play into what we've learned so far about Python and files?
How much of the computational thinking does the programmer have to do versus how much the computer has to do?
"Men are not disturbed by things, but the view they take of things."
— Epictetus (55-135 A.D.)
"What about things like bullets?"
— Behavioralist Herb Kimmel, upon hearing the above quote in 1981 (source)
It was said earlier that files and folders are just your computer using a metaphor to talk about data. Through that metaphor, your computer is abstracting away the details of what files actually are. When you look at your files with a file browser like Finder, Windows Explorer, or the file viewer on JupyterHub, what you're looking at is an abstraction your computer presents to make organizing your documents easier.
What file browsers are actually looking at is something called the file system. Like the name indicates, a file system (sometimes spelled as one word, filesystem) is a system for managing files. It's basically how your computer thinks about the data that's on your disk. It organizes it into structures to make your files easy to find, quick to access, and simple to change.
These layers of abstraction basically look like this:
Now, open up a terminal in Jupyter and put it side by side with this notebook. What do you see?
It's pretty austere, but you can do a lot with it.
Before we dive into the command line, open up the Jupyter file browser and take a look at your home folder. Keep the filenames and folders in mind for this next part. Now, go back to the terminal window, type in `ls`, and hit Enter. (Don't worry about what `ls` stands for right now; we'll get to that very soon.)
Notice how the things that the terminal printed out are the same as what's in your file browser? That's because the command line is just another way of looking at the files on a computer. File browsers and the command line are different ways of viewing your filesystem.
In the Really Friendly Introduction to the Command Line you read, you saw the command line being referred to as the Unix shell. Unix describes computer operating systems like Linux and Mac, but not Windows. Because Pitt's supercomputer is a Unix computer, the shell we're using is a Unix shell. You can read more about this terminology in the addendum for today's lecture here.
Unix shell prompts mostly follow the same format: they list the username, the computer's name, and a `$`, the conventional symbol that marks where you type your commands. Terminal on Mac, which is also a Unix shell, looks like this:
Let's take a break from the command line for just a minute to do some journaling. (No, seriously.)
Open up a blank text file in Jupyter and write what you're thinking right now. Just a few sentences, and then rename the file to be whatever name you want. Make sure you save it in your main directory!
Now, let's try `ls` again. In case you haven't guessed, `ls` lists the files in whatever directory you're in.
Good question! If you ever need to know the full address that you're working in on the command line, use the `pwd` command. `pwd` stands for print working directory. It'll tell you what folder you're currently looking at! (Directory is another word for folder on computers.)
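Here's a minimal sketch of `ls` and `pwd` working together. The scratch folder, the file names, and the `touch` command (which just creates empty files, and which we haven't covered) are all assumptions for the sake of illustration, not part of the lecture setup.

```shell
# Hypothetical scratch folder, so nothing here touches your real files
sandbox=$(mktemp -d)
cd "$sandbox"

# Make two empty files and one directory to look at (names are made up)
touch journal.txt notes.txt
mkdir projects

# Print the full path of the directory we're in
pwd

# List what's in it: journal.txt, notes.txt, and projects
ls

# Store the listing in a variable so it can be checked afterwards
listing=$(ls)
```

The exact path that `pwd` prints will differ from computer to computer, but the listing should always be the three names created above.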
So, when you `pwd` (on the SCI JupyterHub), you'll get something that looks like this:
That's your working directory on the supercomputer. `home` is the name of the folder that contains all the users and `jupyter-abc123` is the name of your personal folder!
On your home computer, you should see something that looks like this:
That's your working directory on your own computer. `c` is the name of the drive the files are located on, `username` is whatever username you're using on the computer, and so on.
When you click on the little home icon in the file browser on Jupyter, it'll take you to that personal main folder. That's where you should save your journal, if you didn't already.
The `cat` command
No, not that kind of cat. (Sad!) The shell command `cat`, which is short for concatenate, will output the contents of any file to the terminal window.
You're going to use `cat` to read the journal you wrote.
First, do `ls`. Do you see the name of the text file you wrote your journal in? If you don't, make sure you saved your file in your main folder.
Now, it's time for `cat`. Type `cat filename.txt` (where filename is what you named your file). If you included spaces in your filename, put it in quotation marks. Additionally, make sure you get the capitalization exactly right.
Now, hit Enter. You should see your journal entry, printed out for the world to see. (Or, at least, for you.)
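To see why the quotation marks matter, here's a small sketch. The scratch folder and the file name `my journal.txt` are made up for this example, and it borrows `echo ... > file` to create the file in the first place.

```shell
# Hypothetical scratch folder and file name, just for this example
sandbox=$(mktemp -d)
cd "$sandbox"

# Create a small file whose name contains a space
echo "Dear diary, the command line is fiddly." > "my journal.txt"

# With quotation marks, the shell treats the whole name as one file
cat "my journal.txt"
contents=$(cat "my journal.txt")

# Without them, cat goes looking for two files, "my" and "journal.txt".
# (2>/dev/null just hides the error message so we can report it ourselves.)
if cat my journal.txt 2>/dev/null; then
    unquoted="worked"
else
    unquoted="failed"
fi
echo "Unquoted attempt: $unquoted"
```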
The `cd` command allows you to change what directory you're currently looking at. `cd`, as you may have guessed, stands for "change directory".
That last folder, the one we're currently in, has the same name as our Pitt username.
Before we go navigating around willy-nilly, let's introduce the concept of autocompletion to you. If you don't have any folders in your main folder on JupyterHub, take the time now to create one.
The `mkdir` command, which is short for "make directory", lets you, um, make a directory. Try doing `mkdir "hello world"` (remember, the command line cares about spaces).
Then, type in `cd h` and press the Tab key.
Whoa, what happened? Autocomplete. The command line can guess what you're about to type, which can save you a lot of typing. Tab completion fills in file and directory names after most commands, not just `cd`, so it's helpful to keep in mind.
Let's `cd` into the folder we just created. Do `pwd` once you're in there, just to see what it looks like, and try `ls`. (There's nothing in the folder. Big surprise.)
Okay, so how do we get out? Type `cd ..`: just like that, `cd` and two periods, nothing else. Press Enter. Now, `pwd` to find out where the heck you are.
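The whole round trip can be sketched like this. It runs in a hypothetical scratch folder rather than your main folder, so the paths you see will differ from the ones on JupyterHub.

```shell
# Hypothetical scratch folder to play in
sandbox=$(mktemp -d)
cd "$sandbox"
start=$(pwd)

# The quotes make this ONE directory with a space in its name
mkdir "hello world"

# Step inside and check where we are: the path now ends in /hello world
cd "hello world"
inside=$(pwd)
echo "$inside"

# Two periods mean "the directory above", so this takes us back out
cd ..
back=$(pwd)
echo "$back"
```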
There are a few shortcuts that the command line provides for you that help navigating and managing your files and directories. They are as follows:
shortcut | description |
---|---|
`.` | the current directory |
`..` | one directory "above" the current one |
`~` | your main folder |
You'll use those a lot for navigating and moving files.
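Here's a rough sketch of the three shortcuts in action. The `outer` and `inner` directories are invented for the example, and `mkdir -p` (which creates a directory along with any missing parents) is a flag we haven't covered.

```shell
# Hypothetical scratch folder with two nested directories
sandbox=$(mktemp -d)
cd "$sandbox"
top=$(pwd)
mkdir -p outer/inner
cd outer/inner

# . is the current directory, so this is the same as plain ls (empty here)
ls .

# .. is the directory above us, which contains just "inner"
above=$(ls ..)
echo "$above"

# Shortcuts can be chained together: this goes up two levels at once
cd ../..
up_two=$(pwd)

# The shell swaps ~ for the path to your main (home) folder
echo ~
```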
Try `cd ..` once more. Now where are you?
Use `cd ~` to come back home, and then navigate to this week's repo. I'm sure you noticed, in this repo, a file with the extension `.py`. `.py` files are Python programs, like a code cell in a Jupyter notebook.
You can run Python programs on the command line using the `python` command. Go ahead and run the file `whoareyou.py` in this week's repo by typing `python whoareyou.py`.
`bash` commands vs. installed commands
If you installed Anaconda on your Mac (which uses the same type of command line that we're using now, called `bash`) and you try to run a Python file from the command line, it should (theoretically) work, because we've installed Python. Before the semester, however, there's a decent chance it wouldn't have worked. You would have gotten something like this:
That's because the `python` command isn't installed automatically on all computers that have the `bash` command line; it's not native to `bash`. You're able to run `python` because Python is already installed on the SCI JupyterHub, or you took the time to install Anaconda on your Mac.
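But, you never installed a program called `ls`, or `cat`, or `cd`. That's because those commands come with the system. Strictly speaking, only `cd` is built into `bash` itself, while `ls` and `cat` are small programs that ship with practically every Unix system, but the effect is the same: they're included automatically.
There are literally hundreds of these standard commands, for everything from displaying a calendar (`cal`, unsurprisingly) to printing out the last 10 lines of a file (`tail`).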
There's a list of the default `bash` commands here: https://ss64.com/bash/
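If you're curious where a command actually comes from, the standard `command -v` utility will tell you. This is a side note rather than something from the readings, and the sketch assumes a plain script with no custom aliases set up; what it prints for `python` depends entirely on what's installed on the computer you run it on.

```shell
# command -v reports how the shell would run a command.
# For something built into the shell itself, it prints just the name...
command -v cd

# ...and for a separate program, it prints the path to that program's file
command -v ls

# If a command isn't installed, command -v prints nothing and fails,
# which makes it handy for checking before you try to run something
if command -v python > /dev/null; then
    python_status="installed"
else
    python_status="missing"
fi
echo "python is $python_status"

# Keep the two answers around so they can be compared
cd_kind=$(command -v cd)
ls_kind=$(command -v ls)
```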
Below is a list of shell commands that we've seen so far and a description of what we're hoping to accomplish with each one. The issue is that the person who wrote them is not very good at the command line, so they have made a fatal mistake in each one. Your task is to fix the mistake in each of the commands. Good luck!
cat tax documents.txt
print currdir
mkdir foo
mkdir bar
cd ..
cat
From the main repo, you should see the directory `testfiles`. `cd` to it, and look around a little bit.
Copy the `testfiles` directory, with all of the files in it, into your home folder. How are we going to do that?
- The `mv` command allows you to move files and folders around. It uses the format `mv source destination`, so, if you wanted to move a file called test.txt to a folder called myfiles, you would type `mv test.txt myfiles`. But `mv`'s not what we want to do here.
- The `cp` command has the exact same format as `mv`, but instead of moving, it'll make a copy wherever you specify.
- `cp` is lazy. By default it only copies individual files and skips directories. If you want to copy a folder and everything in it, you have to use the flag `-r`. That stands for "recursive", which means that it'll keep copying inside each directory it finds until it reaches the end of the "tree". (`mv` moves a whole directory without needing a flag.)
So, after you've copied the test files to your home directory, open and refresh the file browser on Jupyter. See the "testfiles" directory there? You're welcome to poke around in it to make sure all the files are still there.
Now, what you need to do is delete every file that's from 2002. How would you do that using a normal, GUI file browser?
Doing these kinds of repetitive tasks using the command line is much, much easier, thanks to the magic of wildcards. Wildcards let you match many filenames with a single pattern. There are two main wildcards:
- `*` matches any text of any length. For example, if you wanted to match every Python file, the wildcard for that would be `*.py`.
- `?` matches any single character. You can string as many of them together as you want. For example, text files with five-character names which begin with "a" would be matched with `a????.txt`. If you wanted to search for files with years in the 1900s, you could use the `19??` wildcard.
Question: how would you match a text file that begins with a person's name and ends with a number between 100 and 199?
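Here's a quick sketch of both wildcards at work. The file names are invented for the example, and `touch` (which creates empty files) is a command we haven't covered.

```shell
# Hypothetical scratch folder with a handful of empty files
sandbox=$(mktemp -d)
cd "$sandbox"
touch apple.txt angel.txt a.txt banana.txt script.py report1987.txt

# * stands in for any amount of text: every file ending in .txt
ls *.txt

# Each ? is exactly one character: "a" plus four more, then .txt
five=$(ls a????.txt)
echo "$five"

# Wildcards can be mixed: anything containing 19 followed by two characters
nineteen=$(ls *19??*)
echo "$nineteen"
```

Notice that `a.txt` is left out of the second listing, because `a????` demands exactly five characters before the `.txt`.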
You can delete files with the `rm` command. It can also delete directories if you use the same `-r` flag you did when you copied the test files over.
So, now remove all of the files in the `testfiles` directory whose names contain "2002". How do you think we're going to do that?
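One habit worth picking up before you delete anything with a wildcard: run `ls` with the same pattern first, so you can see exactly what's about to disappear. A minimal sketch, using made-up file names in a scratch folder rather than the real test files:

```shell
# Hypothetical scratch folder with some files and a directory to clean up
sandbox=$(mktemp -d)
cd "$sandbox"
touch draft-1.txt draft-2.txt final.txt
mkdir old
touch old/notes.txt

# Preview exactly what the pattern matches before deleting anything
ls draft-*

# Same pattern, now with rm; there is no trash can to undo this
rm draft-*

# -r is needed to delete a directory and everything inside it
rm -r old

# Only final.txt should be left
left=$(ls)
echo "$left"
```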
You probably have some questions about what we've done so far, like:
Well, here's your answer:
fiddly (adjective, British)
complicated or detailed and awkward to do or use.
You've experienced some fiddliness thus far in this class, and those of you that have done software development in other classes or on your own have certainly dealt with it, but the command line takes that to another level.
Fiddliness is a byproduct of the Unix aesthetic which, at its core, is a dislike of extra stuff: cutesy descriptions, unnecessary information, using an entire word when an abbreviation will do.
But it's worth getting through the fiddliness for one reason: the command line is incredibly powerful.
Below are two things: a description, in plain English, of a task we want to accomplish and a list of the relevant shell commands. The shell commands, however, are not in their proper order, and executing them as they are would wreak havoc on any system. Your task is to put the commands in order so we can do what we properly want to do. Good luck!
We want to:
The commands are as follows, out of order:
cp -r ~/apples .
cd ~
mv *.txt apples
cd oranges
mkdir apples oranges
Up to this point, you've only ever looked at files that either you created in Jupyter or ones that were provided for you. You can do text editing directly from the command line, without ever opening another window. A lot of software developers prefer using the command line to write code, for its power, customizability, and the ease of running and testing programs as you write them.
There are three main text editors for the command line: `nano`, `vim`, and `emacs`.
- `nano`: Also called `pico`, this is the simplest and easiest to use text editor. It's pretty close to something like Notepad, but without the ability to use your mouse. This is definitely the one that's best for teaching basic text editing.
- `vim`: One of the two main text editors that programmers like to use. `vim` is noted for its simplicity, which borders on impossibility to use. `vim` allows the user to switch between "modes", which allow you to manipulate, select, and insert text.
- `emacs`: The other major programming text editor, `emacs` is notorious for its customizability and an incredible number of features. There is a Twitter client for `emacs`. `emacs` lets the user create various "windows" inside the text editor for editing things side-by-side and running code while editing it, and it even has its own programming language (emacs-lisp) for creating new features.
- There are others, like `ed`, which all have their strengths and uses. We won't really get into these ones.
Those bullet points can basically be summed up with this cartoon from xkcd:
`nano`!
Most major text editors allow you to put the name of a file as the argument for the shell command. For example, to open the file "test.txt" in `nano`, you'd simply type `nano test.txt`. This works just as well for files that don't exist (yet). You can also simply type `nano` to open up a blank file.
Making sure you're in your main folder, let's write another journal, like you did earlier.
1. Type `nano` or `nano filename.txt`, where "filename.txt" is whatever you want to name your journal. If you don't enter a filename now, you'll put one in later.
2. Once you're in `nano`, you will immediately be able to type in or edit text. If you opened a new or blank file, your screen will be pretty much empty. If your file wasn't empty, you'll see its contents!
3. You use keyboard shortcuts in `nano` to save your files and quit. The shortcuts you can use are listed at the bottom of the screen. Most shortcuts that you will use start with the Control key, denoted by the caret sign `^`. To save your file, hit Control and the letter O at the same time (`^O`). `nano` will then ask you what you want the filename to be: if you typed one in earlier, it will show up here and you can just hit Enter to confirm; if you haven't typed one in yet, you'll have to type one in now before hitting Enter.
4. To exit `nano`, you use the "quit" command, which is `^X`. You're done! That was painless.
So, what happened to your journal? Go back to the Jupyter file browser and refresh it. When you see your second journal file, open it up in Jupyter. There's your journal, safe and sound! Congratulations, you've just successfully written text on the command line. Welcome to the club; we've got some famous members.
So, the only files we've looked at, really, have been Python code and text files, which are both pretty boring, if we're being honest. You can use `cat` to see what any file looks like "under the hood", so to speak. You can check out source code and see how things are arranged and stored in other files.
With that in mind, let's answer a question that must be burning in everyone's mind: what actually are Jupyter notebooks?
Navigate to the folder where this lecture notebook is. Then, using `cat` (and autocomplete, if you're lazy), print the file "Lab-4-Lesson.ipynb" to the command line.
Weird, isn't it?
This isn't even a command line program, it's more of a metaprogram.
Let's say you encounter a weird command that you don't recognize, like `cal`. (Pretend I didn't show you `cal` earlier on.) You want to know everything about how `cal` works, what it does, and how to use it. And you don't have Google.
Go to your command line and type `man cal`. See what pops up?
You can look at the man page for any shell command that's installed on your computer. It'll give you a basic overview of the command, its options, and how to use it. With that in mind...
Okay, so, you've seen a bunch of `bash` commands, and you know how to figure out what an arbitrary command does with man pages, so here's your assignment:
- Pick a command. (Not every command is available in every `bash` instance, and some of them may not work on the supercomputer. If you get something that you don't understand or that doesn't work, pick something else.)
- Use `man` to figure out what it does and what options and flags it has.
- Many `bash` commands have Wikipedia pages that are quite informative.
As an example, here's what I'd fill out for `cat`.
Command: cat
Short for: concatenate
What does it do? prints out the whole contents of the file to the command line
Examples of usage:
`cat test.txt` - will print out test.txt
`cat -b example.txt` - will print out example.txt with the non-blank lines numbered
Now, you're on your own! The space for you to write is in the Markdown cell below. And remember, you can make something appear like `code` by surrounding it with backticks (`).
Command:
Short for:
What does it do?
Examples of usage:
In addition to allowing you to automate commands with scripting, the command line lets you plug commands and files into one another basically ad infinitum. What does that mean? Well, let's say you didn't know how to use wildcards, and you wanted to only print out the filenames of text files in a directory.
For the sake of the example, you can't use `ls *.txt`. So, what if you could tell another program, like `grep`, which searches through files for text, to search through what `ls` produces?
You could copy the output of `ls`, paste it into a file, and then `grep` through the file. But that's messy and inelegant, and the command line is nothing if not elegant.
You can pipe programs into each other. Using the `|` character (which is called a pipe, and is produced by typing Shift-`\`), you basically hook up the output from one program into the input of another. So, let's search for text files in the output of `ls`, shall we?
The command to run is `ls | grep ".txt"`. Don't worry too much about `grep` there, just trust that it'll only print a line if it contains whatever's in quotes.
What's being done here is, instead of printing the output of `ls`, it's being given to `grep` to chew through. It's like you literally hooked up a physical pipe for `ls` to put its output into, and then hooked that pipe up to `grep`'s input.
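Here's that pipe as a small sketch, with made-up file names in a scratch folder. It also shows one wrinkle that goes a bit beyond the lecture: to `grep`, a `.` means "any single character", so `".txt"` is a little looser than it looks. The stricter pattern `'\.txt$'` (a literal dot, then txt, at the very end of the line) is one common way to tighten it.

```shell
# Hypothetical scratch folder; mytxtfile.py is there to trip up the loose pattern
sandbox=$(mktemp -d)
cd "$sandbox"
touch notes.txt todo.txt photo.png mytxtfile.py

# Send the output of ls into grep instead of onto the screen
loose=$(ls | grep ".txt")
echo "$loose"

# The loose pattern also catches mytxtfile.py ("y" followed by "txt").
# Escaping the dot and anchoring to the end of the line keeps only real .txt files.
strict=$(ls | grep '\.txt$')
echo "$strict"
```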
You can do the same thing with files! The `echo` command just prints whatever text you want onto the command line. Pretty boring, huh. But when combined with redirection, you can create files with whatever text you want!
You redirect a program's output to a file using `>`. So, to echo text to a file, you'd type `echo "some text" > filename.txt`. This can be very useful for creating a lot of files at once, say, if you're trying to generate a bunch of files for every year and letter of the alphabet. Hmmm...
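A rough sketch of redirection, again in a scratch folder with invented file names. Two things here go beyond what we've covered: `>>`, which adds to the end of a file instead of replacing it, and a `for` loop, which repeats a command once per item in a list.

```shell
# Hypothetical scratch folder
sandbox=$(mktemp -d)
cd "$sandbox"

# > creates filename.txt, or overwrites it if it already exists
echo "some text" > filename.txt

# >> appends to the end of the file instead of overwriting it
echo "more text" >> filename.txt
cat filename.txt

# A loop can create many files at once, here one per year
for year in 2001 2002 2003; do
    echo "notes for $year" > "notes-$year.txt"
done
ls
```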
We discussed communication, its different forms, and infrastructure systems in lecture last week. Let's relate communication to the command line!
How would you classify communication with the computer with the command line: unicast, multicast, or broadcast? Could it fall into multiple categories under different circumstances?
What benefits does communicating with the computer through the command line offer over the graphical user interface? What about the opposite?
That's it for today! Hopefully, you were convinced that the command line is awesome. And also the worst thing ever.
By way of review:
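- Navigate with `cd`, see what you're looking at with `ls`, and find out where you are with `pwd`.
- Read files with `cat`, move anything with `mv`, copy anything with `cp`, create directories with `mkdir`, and edit text with `nano`.
- If you don't know what a command does, `man` it.
- Run Python programs with `python`.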