This lecture is based on your readings for the week: Tracy Osborn's Really Friendly Command Line Intro; Software Carpentry's The Unix Shell; and the optional essay by Neal Stephenson, In the Beginning was the Command Line. For your reference:
Here's a short description detailing the different terms that people use for the command line and different versions of it. If you're confused by some of the verbiage or would like a bit more depth on the different types of command lines people use, check it out here. If you're a bit shakey on some of the terminology, it might be information overload for you, so don't feel like it's mandatory to read.
There are a few in-class short exercises that deal with what you'll be learning today, to be completed in small groups of two or three. There is not, however, a lab exercise notebook for you to complete over the course of the week, so you have a bit of a break from lab work this week. Instead, make a GitHub account, get the URL to your profile, and submit it to Canvas.
Look to the left of this notebook. If you haven't minimized it, it's going to be a list of files and folders. They're all things that you've created or uploaded to your personal folder on the supercomputer.
On your personal computer, you've got a similar thing going: files organized into folders. For me on Mac, it looks a little something like this:
If you use Windows, it'll probably look like this:
Operating systems come with programs like Finder and Windows Explorer to help us navigate, organize, and view all of our files, folders, and programs. But, this is just a simplification of the truth. In reality, your files are, quite literally, an array of ones and zeros on your hard drive. Calling them "files" and organizing them into "folders" is just a metaphor.
The legendary computer scientist Donald Knuth said, in an interview, the following:
The psychological profiling [of a computer scientist] is mostly the ability to shift levels of abstraction, from low level to high level. To see something in the small and to see something in the large.
Abstraction is the key word there; the idea that almost everything we think is simple actually has a lot more going on underneath the surface is of vital importance to computer science. Abstraction is basically the idea that sometimes it's beneficial to ignore a bunch of details when describing some part of how a computer or program works.
Take this little Python program as an example:
def add_two(num1, num2): num1 = int(num1) num2 = int(num2) sum = num1 + num2 return sum num1 = input("Enter one number: ") num2 = input("Enter a second: ") print("Your sum is", add_two(num1, num2))
How would you describe what your program does to someone who's never programmed?
Would you tell them about variables? What a function is? Converting from a string to an integer? Of course not.
What you're doing is abstraction: glossing over details to serve a functional purpose.
"Men are not disturbed by things, but the view they take of things."
— Epictetus (55-135 A.D.)
"What about things like bullets?"
— Behavioralist Herb Kimmel, upon hearing the above quote in 1981 (source)
It was said earlier that files and folders are just your computer using a metaphor to talk about data. Through that metaphor, your computer is abstracting the details of what files actually are from you. When you look at your files with a file browser like Finder, Windows Explorer, or the file viewer on JupyterHub, what you're looking at is an abstraction your computer tells you to make organizing your documents easier.
What file browsers are actually looking at is something called the file system. Like the name indicates, a file system (sometimes spelled as one word, filesystem) is a system for managing files. It's basically how your computer thinks about the data that's on your disk. It organizes it into structures to make your files easy to find, quick to access, and simple to change.
These layers of abstraction basically look like this:
Now, open up a terminal in Jupyter and put it side by side to this notebook. What do you see?
It's pretty austere, but you can do a lot with it.
Before we dive into the command line, open up the Jupyter file browser and take a look at your home folder. Keep the filenames and folders in mind for this next part. Now, go back to the terminal window, type in
ls, and hit Enter. (Don't worry about what
ls stands for right now, we'll get to that very soon.)
Notice how the things that the terminal printed out are the same as what's in your file browser? That's because the command line is just another way of looking at the files on a computer. File browsers and the command line are different ways of viewing your filesystem.
In the Really Friendly Introduction to the Command Line you read, you saw the command line being referred to as the Unix shell. Unix describes computer operating systems like Linux and Mac, but not Windows. Because Pitt's supercomputer is a Unix computer, the shell we're using is a Unix shell. You can read more about this terminology in the addendum for today's lecture here.
Unix shells all basically follow the same format: they list the username, computer's name, and the $, which can stand for "shell". Terminal on Mac, which is also a Unix shell, looks like this:
Let's take a break from the command line for just a minute to do some journaling. (No, seriously.)
Open up a blank text file in Jupyter and write what you're thinking right now. Just a few sentences, and then rename the file to be whatever name you want. Make sure you save it in your main directory!
Now, let's try
ls again. In case you haven't guessed,
ls lists the files in whatever directory you're in.
Good question! If you ever need to know the full address that you're working in on the command line, use the
pwd stands for print working directory. It'll tell you what folder you're currently looking at! (Directory is another word for folder on computers.)
So, when you
pwd (on the crc Jupyterhub), you'll get something that looks like this:
That's your working directory on the supercomputer.
ihome is the name of the folder that contains all the classes and users,
cmpinf0010_2019f is the folder for this course, and the last part is your username, which is also the name of your personal folder!
On your home computer, you should see something that looks like this:
That's your working directory on your own computer.
c is the name of the drive the files are located on,
username is whatever username you're using on the computer, etc, etc.
When you click on the little home icon in the file browser on Jupyter, it'll take you to that personal main folder. That's where you should save your journal, if you didn't already.
No, not that kind of cat. (Sad!) The shell command
cat, which is short for concatenate, will output the contents of any file to the terminal window.
You're going to use
cat to read the journal you wrote.
ls. Do you see the name of the text file you wrote your journal in? If you don't, make sure you saved your file in your main folder.
Now, it's time for
cat filename.txt (where filename is what you named your file. If you included spaces in your filename, put it in quotations. Additionally, make sure you get the case correctly.
Now, hit Enter. You should see your journal entry, printed out for the world to see. (Or, at least, for you.)
cd command allows you to change what directory you're currently looking at.
cd, as you may have guessed, stands for "change directory".
That last folder, the one we're currently in, is has the same name as our Pitt username.
Before we go navigating around willy-nilly, let's introduce the concept of autocompletion to you. If you don't have any folders in your main folder on JupyterHub, take the time now to create one.
mkdir command, which is short for "make directory", lets you, um, make a directory. Try doing
mkdir "hello world" (remember, the command line cares about spaces).
Then, type in
cd h and press the Tab key.
Whoa, what happened? Autocomplete. The command line can provide guesses on what it thinks you're going to type, which can save you a lot of typing. Not all commands have autocomplete (in fact,
cd really is the big one), but when they do, it's helpful to keep in mind.
cd into the folder we just created. Do
pwd once you're in there, just to see what it looks like, and try
ls. (There's nothing in the folder. Big surprise.)
Okay, so how do we get out? Type
cd ..: just like that, cd and two periods, nothing else. Press Enter. Now,
pwd to find out where the heck you are.
There are a few shortcuts that the command line provides for you that help navigating and managing your files and directories. They are as follows:
||the current directory|
||one directory "above" the current one|
||your main folder|
You'll use those a lot for navigating and moving files.
cd .. once more. Now where are you?
cd ~ to come back home, and then navigate to this week's repo. I'm sure you noticed, in this repo, a file with the extension
.py files are Python programs, like a code cell in a Jupyter notebook.
You can run Python programs on the command line using the
python command. Go ahead and run the file
whoareyou.py in this week's repo by typing
bashcommands vs. installed commands¶
If you're on a Mac (which uses the same type of command line that we're using now, called
bash) and you try to run a Python file from the command line, it should (theoreticlaly) work, because we've installed python. However, before the semester, there's a decent chance it wouldn't work. You would have gotten something like this:
That's because the
python command isn't installed automatically on all computers that have the
bash command line; it's not native to
bash. You're able to run
python because python gets installed as part of Anaconda on your machine.
But, you never installed a program called
cd. That's because those commands are native to
bash. They're included automatically.
There are literally hundreds of native
bash commands, for everything from displaying a calendar (
cal, unsurprisingly) to printing out the last 10 lines of a file (
There's a list of the default
bash commands here: https://ss64.com/bash/
Below are a list of shell commands that we've seen so far and a description of what we're hoping to accomplish with each one. The issue, is that the person who wrote them is not very good at the command line, so they have made a fatal mistake in each one. Your task is to fix the mistake in each of the commands. Good luck!
cat tax documents.txt
From the main repo, you should see the directory
cd to it, and look around a little bit.
testfiles directory, with all of the files in it, into your home folders. How are we going to do that?
mvcommand allows you to move files and folders around. It uses the format
mv source destination, so, if you wanted to move a file called test.txt to a folder called myfiles, you would type
mv test.txt myfiles. But,
mv's not what we want to do here.
cpcommand has the exact same format as
mv, but instead of moving, it'll make a copy wherever you specify.
cpare lazy. They'll only look at any files in your current directory. If you want to copy all of the files in a folder, you have to use the flag
-r. That stands for "recursive", which means that it'll copy again once you're inside a directory until you reach the end of the "tree".
So, after you've copied the test files to your home directory, open and refresh the file browser on Jupyter. See the "testfiles" directory there? You're welcome to poke around in it to make sure all the files are still there.
Now, what you need to do is delete every file that's from 2002. How would you do that using a normal, GUI file browser?
Doing these kinds of repetitive tasks using the command line is much, much easier, thanks to the magic of wildcards. Wildcards allow you to match any text of any length. There are two main wildcards:
*matches any text of any length. For example, if you wanted to match every Python file, the wildcard for that would be
?matches any single character. You can string them together as much as you want. For example, text files with five-character names which begin with "a" would be matched with
a????.txt. If you wanted to search for files with the years in the 1900s, you could use the
Question: how would you match a text file that begins with a person's name and ends with a number between 100 and 199?
You can delete files with the
rm command. This can also delete directories if you use the same
-r flag you did when you copied the test files over.
So, now remove all of the files in the
testfiles directory which contains "2002". How do you think we're going to do that?
You probably have some questions about what we've done so far, like:
Well, here's your answer:
fiddly (adjective, British)
complicated or detailed and awkward to do or use.the fht
You've experienced some fiddliness thus far in this class, and those of you that have done software development in other classes or on your own have certainly dealt with it, but the command line takes that to another level.
Fiddliness is a byproduct of the Unix aesthetic which, at its core, is a dislike of extra stuff: cutesy descriptions, unnecessary information, using an entire word when an abbreviation will do.
But it's worth getting through the fiddliness for one reason: the command line is incredibly powerful.
Below are two things: a description, in plain English, of a task we want to accomplish and a list of the relevant shell commands. The shell commands, however, are not in their proper order, and executing them as they are would wreak havoc on any system. Your task is to put the commands in order so we can do what we properly want to do. Good luck!
WE want to:
The commands are as follows, out of order:
cp -r ~/apples .
mv *.txt apples
mkdir apples oranges
Up to this point, you've only ever looked at files that either you created in Jupyter or ones that were provided for you. You can do text editing directly from the command line, without ever opening another window. A lot of software developers prefer using the command line to write code, for its power, customizability, and the ease of running and testing programs as you write them.
There are three main text editors for the command line:
nano: Also called
pico, this is the simplest and easiest to use text editor. It's pretty close to something like Notepad, but without the ability to use your mouse. This is definitely the one that's best for teaching basic text editing.
vim: One of the two main text editors that programmers like to use.
vimis noted for its simplicity, which borders on impossibility to use.
vimallows the user to switch between "modes", which allow you to manipulate, select, and insert text.
emacs: The other major programming text editor,
emacsis notorious for customizability and an incredible amount of features. There is a Twitter client for
emacsallows the user to create various "windows" inside the text editor for editing things side-by-side, running code while editing it, and even has its own programming language (emacs-lisp) for creating new features for it.
ed, which all have their strengths and uses. We won't really get into these ones.
Those bullet points can basically be summed up with this cartoon from xkcd:
Most major text editors allow you to put the name of a file as the argument for the shell command. For example, to open the file "test.txt" in
nano, you'd simply type
nano test.txt. On Windows, type
bash -c "nano" or
bash -c "nano test.txt. This works just as well for files that don't exist (yet). You can also simply type
nano to open up a blank file.
Making sure you're in your main folder, let's write another journal, like you did earlier.
nano filename.txt, where "filename.txt" is whatever you want to name your journal. If you don't enter a filename now, you'll put one in later.
nano, you will immediately be able to type in or edit text. If you opened a new or blank file, your screen will be pretty much empty. If your file wasn't empty, you'll see its contents!
nanoto save your files and quit. The shortcuts you can use are listed at the bottom of the screen. Most shortcuts that you will use start with the Control key, denoted by the caret sign
^. To save your file, hit Control and the letter O at the same time (
nanowill then ask you what you want the filename to be: if you typed one in earlier, it will show up here and you can just hit enter to confirm; if you haven't typed one in yet, you'll have to type one in now before hitting enter.
nano, you use the "quit" command, which is ``. You're done! That was painless.
So, what happened to your journal? Go back to the Jupyter file browser and refresh it. When you see your second journal file, open it up in Jupyter. There's your journal, safe and sound! Congratulations, you've just successfully written text on the command line. Welcome to the club; we've got some famous members.
So, the only files we've looked at, really, have been Python code and text files, which are both pretty boring, if we're being honest. You can use
cat to see what any file looks like "under the hood", so to speak. You can check out source code and see how things are arranged and stored in other files.
With that in mind, let's answer a question that must be burning in everyone's mind: what actually are Jupyter notebooks?
Navigate to the folder where this lecture notebook is. Then, using
cat (and autocomplete, if you're lazy), print the file "Week-5-Lab-Lesson.ipynb" to the command line.
Weird, isn't it?
This isn't even a command line program, it's more of a metaprogram.
Let's say you encounter a weird command that you don't recognize, like
cal. (Pretend I didn't show you
cal earlier on.) You want to know everything about how
cal works, what it does, and how to use it. And you don't have Google.
Go to your command line and type
man cal. See what pops up?
You can do look at the man page for any shell command that's installed on your computer. It'll give you a basic overview of the command, its options, and how to use it. With that in mind...
Okay, so, you've seen a bunch of
bash commands, and you know how to figure out what an arbitrary command does with man pages, so here's your assignment:
bashinstance, and some of them may not work on the supercomputer. If you get something that you don't understand or that doesn't work, pick something else.)
manto figure out what it does and what options and flags it has.
bashcommands have Wikipedia pages that are quite informative.
As an example, here's what I'd fill out for
Short for: concatenate
What does it do? prints out the whole contents of the file to the command line
Examples of usage:
cat test.txt - will print out test.txt
cat -b example.txt - will print out example.txt with the lines numbered
Now, you're on your own! The space for you to write is in the Markdown cell below. And remember, you can make something appear like
code by surrounding it with backticks (`).
What does it do?
Examples of usage:
In addition to allowing you to automate commands with scripting, the command line lets you plug commands and files into one another basically ad infinitum. What does that mean? Well, let's say you didn't know how to use wildcards, and you wanted to only print out the filenames of text files in a directory.
For the sake of the example, you can't use
ls *.txt. So, what if you could tell another program, like
grep, which searches through files for text, to search through what
You could copy the output of
ls, paste it into a file, and then
grep through the file. But that's messy and inelegant, and the command line is nothing if not elegant.
You can pipe programs into each other. Using the
| character (which is called a pipe, and is produced by typing Shift-), you basically hook up the output from one program into the input of another. So, let's search for text files in the output of
ls, shall we?
The command to run is
ls | grep ".txt". Don't worry too much about
grep there, just trust that it'll only print a line if it contains whatever's in quotes.
What's being done here is, instead of printing the output of
ls, it's being given to
grep to chew through. It's like you literally hooked up a physical pipe for
ls to put its output into, and then hooked that pipe up to
You can do the same thing with files! The
echo command just prints whatever text you want onto the command line. Pretty boring, huh. But when combined with redirection, you can create files with whatever text you want!
You redirect a program's output to a file using
>. So, to echo text to a file, you'd type
echo "some text" > filename.txt. This can be very useful for creating a lot of files at once, say, if you're trying to generate a bunch of files for every year and letter of the alphabet? Hmmm...
That's it for today! Hopefully, you were convinced that the command line is awesome. And also the worst thing ever.
By way of review:
cd, see what you're looking at with
ls, and find out where you are with
cat, move anything with
mv, copy anything with
cp, create directories with
mkdir, and edit text with