In [1]:
# Pay no attention to this cell
# All will be revealed in due time.
import pandas as pd
from IPython.display import Image
syllabus=syllabus.fillna("")
syllabus.index = range(1,len(syllabus)+1)

## Python Programming for Earth Science Students¶

Authors: Lisa Tauxe, [email protected], Hanna Asefaw, [email protected], & Brendan Cych, [email protected]

### Computers in Earth Science¶

Computers are essential to all modern Earth Science research. We use them for compiling and analyzing data, preparing illustrations like maps or data plots, writing manuscripts, and so on. In this class, you will learn to write computer programs with special applications useful to Earth Scientists. We will learn Python, an object-oriented programming language, and use Jupyter notebooks to write our Python programs.

### Python¶

So, why learn Python? Because it is:

• Flexible, freely available, cross platform
• Easier to learn than many other languages
• Equipped with many numerical, statistical, and visualization packages
• Well supported, with lots of online documentation
• Fun: the name 'Python' refers to 'Monty Python', not the snake, and many examples in the Python documentation use jokes from the old Monty Python skits. If you have never heard of Monty Python, look it up on YouTube; you are in for a treat.

Which Python?

• Python is undergoing a transition from 2.7 to 3. The notebooks in this class, apart from a few exceptions, are compatible with both but they have only been tested on Python 3, so that is what you should be using.
• If you decide to use a personal computer, we recommend that you install the most recent version of Anaconda Python for your operating system: https://www.anaconda.com/download/. You will also need a few extra packages (cartopy version 0.17.0 and PySimpleGUI), which can be installed with little hassle.
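If you are not sure which Python you are running, a quick check like the following can tell you. This is only a minimal sketch using the standard sys module; the message text is our own and not part of the class materials.

```python
import sys

# sys.version_info describes the interpreter that is running this code.
major, minor = sys.version_info.major, sys.version_info.minor
print("Running Python {}.{}".format(major, minor))

# The notebooks in this class were only tested on Python 3.
if major < 3:
    print("Please switch to Python 3 before continuing.")
```

You can paste this into any code cell; it does not change anything on your computer.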
In [3]:
syllabus[['Topic']]
Out[3]:
Topic
1 Intro to notebooks, file systems and paths
2 Variables and Operations
3 Data structures
4 Dictionaries, program loops (if, while and for)
5 functions and modules
6 NumPy and matplotlib
7 NumPy arrays
8 Pandas, file I/O
9 data wrangling with Pandas
10 object oriented programming
11 lambda, map, filter reduce, list comprehension
12 Pandas filtering and exceptions
13 subplots, bar charts pie charts
14 histograms and cumulative distribution functions
15 statistics 101
16 line and curve fitting
17 visualization with seaborn
18 maps
19 gridding and contouring
20 rose diagrams and equal area projections
21 matrix math - dot and cross products
22 plotting great and small circles
23 Graphical User Interfaces (GUIs) and animations
24 Machine learning
25 3D plots of points and surfaces
26 Time series - periodograms
27 Animations

## Lecture 1¶

Now we get down to business. In this lecture we will:

• Learn to find your command line interface.
• Learn how to launch a Jupyter notebook from the command line interface.
• Learn basic notebook anatomy.
• Learn some basic UNIX commands
• Learn about the concept of PATH
• Turn in your first practice problem notebook.

### Jupyter notebooks¶

This class is entirely structured around a special programming environment called Jupyter notebooks. A Jupyter notebook is a development environment where you can write, debug, and execute your programs.

If you are taking this class through UCSD, make a folder on your Desktop to keep material for this class. If you haven't already, download the zip file for this lecture from TritonEd. Unzip the file and put the folder into your class folder.

If you are using the version cloned from github you already have everything. Some of the lectures might be updated in the future though, so the version you have may not be final.

### Launching a Jupyter notebook¶

To do this, you will need to discover the hidden secret of your computer, the Terminal window (or Anaconda Prompt). This little window provides a command line interface in which you can type commands to the operating system. You can find the terminal window through the program Terminal on a Mac by typing terminal.app into the search icon. If you double click on it, it will open a terminal window.

In [4]:
Image(filename='Figures/terminal_mac.png')
Out[4]:

On a PC, you should use the Anaconda Prompt which you can find in your programs after you install Anaconda Python.

Let's open a terminal window (command prompt).

In [5]:
Image(filename='Figures/terminal.png',width=500)
Out[5]:

When you fire up a terminal window, you are by default in your home directory (in MacOS UNIX, that would be /Users/YOURUSERNAME).

To launch a Jupyter notebook, simply type jupyter notebook as shown above. That will open up a browser window (use Firefox, Safari or Chrome - NOT Internet Explorer). Find your class folder and click on Lecture_01.ipynb. You should now be looking at this notebook!

### Jupyter notebook anatomy¶

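Jupyter notebooks have two basic cell types: Code cells, which hold Python statements, and Markdown cells, which hold formatted text like this one.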

You can insert a new cell by selecting Insert Cell Below in the drop-down menu:

In [6]:
Image(filename='Figures/insertCell.png')
Out[6]:

Cell types default to 'Code', but you can change the cell type to 'Markdown' with the box labeled 'Markdown' on the menu bar. Click on the little downward arrow to change this cell to Code. Be sure to change it back!

You "execute" a cell (either typeset or run the code) by clicking on the run key (sideways triangle with vertical line) or select Run Cells under the Cell drop-down menu.

In a code block, you can only type valid python statements EXCEPT after a pound sign (#) - everything after that will be ignored.
That is how you write "comments" in your code to remind yourself or tell others what you were thinking:

In [8]:
# I can type anything here
but not here
File "<ipython-input-8-bee698e92c8a>", line 2
but not here
^
SyntaxError: invalid syntax

That was an example of a bug, which could be fixed by commenting out the second line or by making it a valid statement:

In [9]:
# I can type anything here
# but not here
print ("but not here")
but not here

### Practice Problems¶

Now open the notebook called Lecture_01_Practice_Problems. To open it, click on "File" and select "Open"; if the file called Lecture_01_Practice_Problems.ipynb is visible, just click on it. For the github version of this class, all the Practice Problems are in a folder called "Practice Problems". Click on that, then open the Lecture_01_Practice_Problems notebook.

Complete the first three tasks. Then come back to this notebook.

Congratulations! You just wrote your first Python program.

### Basic UNIX commands¶

Now we will discuss file systems, paths, and the command line. Why? Because whenever you import an image, document, or spreadsheet into the Jupyter notebook you have to tell Jupyter where in the computer the file is located. Moreover, there are many command line functions that come in handy. For example, you can look at the first few lines of a file before you import it into the notebook. You could also write all of your programs in a text editor and run those programs from the command line. You could then run your programs from anywhere on your computer instead of a jupyter notebook. We will do that in Lecture 23, for example.

### File systems¶

The organization of computers is based on a file system. The file system is hierarchical, so at the top you'll find the root directory (or, for Mac and PC users, a folder). The root directory contains files and other folders, which may in turn contain more files and folders, and so on. This results in a tree of files and folders that makes up the file system. The following figure is an example of a computer's file system:

In [10]:
Image(filename='Figures/FileSystem.jpg')
Out[10]:

You are probably familiar with the images like that to the left. The text to the right shows the exact same thing - but from your computer's viewpoint. Both the image to the left and the text to the right show you how to access the folder "Desktop". On the left, you access the folder "Desktop" by clicking on 'icons' that represent different folders and sub-folders until you arrive at "Desktop". Later in this lecture, we'll show you how to access the same folder using its path (the text to the right).

### Survival UNIX commands¶

Macs and PCs both have functions that can be called from a command line, such as listing the contents of a folder or file, creating new folders, changing permissions on files or folders, combining the contents of files, moving files and folders around, and so on. These commands are directed to the operating system instead of the Python interpreter.

Before we begin using commands, note that we can execute many operating system commands from within a Jupyter notebook. To signal to Jupyter that your commands are not for Python but for the operating system, type a "!" (bang) in front of the command. [For some commands it isn't actually a requirement, but it does help separate what is a command line command from what is Python code.]

Let's learn our first UNIX command, which lists the contents of a directory, ls. For PC users, this is dir.

In [11]:
!ls
# or !dir for PC users
Datasets                           Lecture_01_Practice_Problems.ipynb
Figures                            Lecture_01_syllabus.ipynb

Another useful command is mkdir which creates a new directory. Please note that directory means the same thing as folder. It is just that in a graphical operating system with icons, the term folder makes sense. They look like folders. Whereas to the operating system, they are traditionally referred to as directories. Never mind!

In [12]:
!mkdir MYNEWDIRECTORY

To see if that worked, list the contents again:

In [13]:
!ls
# or !dir for PC users
Datasets                           Lecture_01_syllabus.ipynb
Figures                            MYNEWDIRECTORY
Lecture_01_Practice_Problems.ipynb

And sure enough, there it is. The command rmdir deletes a directory (which must be empty):

In [14]:
!rmdir MYNEWDIRECTORY

Make sure it was removed:

In [15]:
!ls
# or !dir for PC users
Datasets                           Lecture_01_Practice_Problems.ipynb
Figures                            Lecture_01_syllabus.ipynb
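The same three steps (make a directory, list the contents, remove the directory) can also be done from Python itself. The sketch below uses the standard pathlib and tempfile modules and works in a throwaway scratch directory, which is our own choice so that nothing in your class folder is touched.

```python
import tempfile
from pathlib import Path

# Work inside a throwaway directory so nothing in your class folder changes.
with tempfile.TemporaryDirectory() as scratch:
    scratch = Path(scratch)
    new_dir = scratch / "MYNEWDIRECTORY"

    new_dir.mkdir()                                          # like: mkdir MYNEWDIRECTORY
    after_mkdir = sorted(p.name for p in scratch.iterdir())  # like: ls
    print(after_mkdir)

    new_dir.rmdir()                                          # like: rmdir (must be empty)
    after_rmdir = sorted(p.name for p in scratch.iterdir())
    print(after_rmdir)
```

The first list printed contains MYNEWDIRECTORY and the second is empty, mirroring the ls output above.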

We can list the contents of a file with the UNIX command, cat (which comes from concatenate). [For PC users, type works. BUT, you have to use "backslash" (\) instead of "forward slash" (/). ]

In [3]:
!cat Datasets/myfile.txt
# or !type Datasets\myfile.txt   for PC users
Hi there students! Thanks for joining this class!

Usually, the output of cat is sent to the screen, but UNIX has tricky ways of redirecting output to other files. For example, if we combine cat with the symbol > we can redirect the output to another file, instead of to the screen:

In [2]:
!cat Datasets/myfile.txt >newfile.txt
!ls
# or !type, !dir for PC users
cat: Datasets/myfile.txt: No such file or directory
Background                   Lecture_18.ipynb
ContRot                      Lecture_19.ipynb
Datasets                     Lecture_20.ipynb
Figures                      Lecture_21.ipynb
GUI_command_line.ipynb       Lecture_22.ipynb
GUI_command_line_maker.ipynb Lecture_23.ipynb
Lecture_01.ipynb             Lecture_24.ipynb
Lecture_02.ipynb             Lecture_25.ipynb
Lecture_03.ipynb             Lecture_26.ipynb
Lecture_04.ipynb             Lecture_27.ipynb
Lecture_05.ipynb             Maps
Lecture_06.ipynb             Practice_Problems
Lecture_09.ipynb             _TableOfContents.ipynb
Lecture_10.ipynb             dogit
Lecture_11.ipynb             environment.yml
Lecture_12.ipynb             mkigrf.py
Lecture_13.ipynb             mknb
Lecture_14.ipynb             nets.py
Lecture_15.ipynb             newfile.txt
Lecture_16.ipynb             notebook.tex
Lecture_17.ipynb             requirements.bak

So what did we create? We created a copy of myfile.txt called newfile.txt. If you repeat the command, you will overwrite the existing output file.

To append to the end of a file (actually concatenate), we use the symbol >>:

In [19]:
!cat Datasets/myfile.txt >>newfile.txt
!cat newfile.txt
Hi there students! Thanks for joining this class!
Hi there students! Thanks for joining this class!
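For comparison, Python's built-in open function makes the same distinction between overwriting and appending. This is a minimal sketch, assuming a scratch file of our own rather than the class Datasets folder: mode "w" plays the role of > and mode "a" plays the role of >>.

```python
import tempfile
from pathlib import Path

message = "Hi there students! Thanks for joining this class!\n"

with tempfile.TemporaryDirectory() as scratch:
    newfile = Path(scratch) / "newfile.txt"

    # Mode "w" behaves like ">": it creates the file or overwrites it.
    with open(newfile, "w") as f:
        f.write(message)
    with open(newfile, "w") as f:
        f.write(message)
    after_overwrite = newfile.read_text()

    # Mode "a" behaves like ">>": it adds to the end of the file.
    with open(newfile, "a") as f:
        f.write(message)
    after_append = newfile.read_text()

print(after_overwrite.count("\n"))  # lines in the file after writing twice with "w"
print(after_append.count("\n"))     # lines in the file after one more write with "a"
```

Writing twice with "w" still leaves a single line, while the append adds a second one.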

There are a few other useful redirect symbols: < and |. The first, <, takes the contents of the argument and redirects it into the command. The second, | (pipe), takes the output of the first command and 'pipes' it to a second. So we could do:

In [20]:
!cat Datasets/myfile.txt | cat
Hi there students! Thanks for joining this class!

... which is a little silly as it just does the same thing as the first command, but you don't know any other commands right now so...
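To see a pipe do something slightly less silly without learning new UNIX commands, here is a sketch that mimics one from Python with the standard subprocess module. Both "commands" are tiny Python one-liners that we made up for illustration: the first prints a line and the second reads its input and prints it in upper case. Strictly speaking this hands the finished output of the first to the second rather than running them side by side as a shell pipe would.

```python
import subprocess
import sys

# First command: a tiny Python program that prints a line of text.
first = subprocess.run(
    [sys.executable, "-c", "print('Hi there students!')"],
    capture_output=True, text=True, check=True,
)

# Second command: reads standard input and prints it in upper case.
second = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.stdin.read().upper(), end='')"],
    input=first.stdout, capture_output=True, text=True, check=True,
)

piped_output = second.stdout
print(piped_output)
```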

### Concept of path¶

So far, we have just looked at directories in our working directory (the one with this notebook in it) and subdirectories within the working directory. Earlier in the lecture you were shown a figure with icons on the left and text on the right. The text to the right was a series of directories separated by '/'. These are the paths to those files. A path is the unique location of a file or a directory in a file system of an OS.

Now that you know more about paths, let's take a detour and learn how to embed figures directly into a Jupyter notebook. You saw this in several cells above, but were told to ignore it. The Image class in the module IPython.display allows us to embed many digital image types (png, jpg...) into a Jupyter notebook. If you take a look at the first cell of this lecture, we have already imported Image from IPython.display.

If you want to display a figure, you will use Image and the path to the figure. The path to the figure we want to display is "Figures/FileSystem.jpg". This tells the operating system to find the folder labeled "Figures" and then grab the file inside that is labeled "FileSystem.jpg". This is a relative path because the location is with respect to the directory that the notebook is in.

In [21]:
Image(filename='Figures/FileSystem.jpg')
Out[21]:

The paths in this figure are absolute paths, which uniquely define the location of a file or directory from anywhere on the computer. Relative paths are handy shortcuts. For example, we can refer to a directory above the current directory without necessarily knowing its name by using these conventions:

./ is the current directory

../ is the one above

../../ is the one above that

and so on.

Instead of using 'relative' directories, it is often desirable to refer to directories in an absolute sense, i.e., relative to the root directory '/'.

To find out what the absolute path for your current directory is, use pwd for 'print working directory':

In [22]:
!pwd
/Users/ltauxe/Dropbox/4SIO113_2019/Lecture_01
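Python can answer the same questions about relative and absolute paths. The following is a small sketch using the standard pathlib module; the variable names are our own.

```python
from pathlib import Path

here = Path(".")                  # relative path: the current directory
print(here.is_absolute())         # False, its meaning depends on where you are

cwd = Path.cwd()                  # absolute path, like the output of pwd
print(cwd.is_absolute())          # True

# resolve() turns a relative path into an absolute one.
parent = Path("..").resolve()     # the directory above the current one
print(parent)
```

The last line prints whatever directory sits above the one you ran the code in, so your result will differ from anyone else's.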

To refer to your home directory, just use the shortcut ~:

In [23]:
!ls ~
AnacondaProjects     Movies               VirtualBox VMs
Applications         MultiDrive           anaconda3
Creative Cloud Files Music                bin
Cubit-13.0           Pdfs                 enthought
Desktop              Pictures             log4j
Documents            PmagPy               logs
Dropbox              Public               profiles.bin
Google Drive         Python               reprints
Library              Sites                src
Meetings_2018        TeXShop

I guess I should clean up my Desktop!
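One caution: the ~ shortcut is understood by the command line, but Python's own file functions do not expand it automatically. A minimal sketch of two standard ways to get the home directory from Python follows; the "Desktop" folder name is only an example and does not need to exist for the code to run.

```python
import os
from pathlib import Path

home = Path.home()                          # absolute path of your home directory
expanded = os.path.expanduser("~/Desktop")  # replaces ~ with the home directory

print(home)
print(expanded)
```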

To move from one directory to another, use the command cd (for change directory). You can cd to a directory using any of the following:

• cd ~ (takes you to your home directory)
• cd Figures (takes you to the Figures folder in the working directory)
• cd FULL_PATH_NAME (to change into any directory with its full path name)
• cd .. (move to the directory above you)
• cd ../.. (move two directories up)
• and so on.
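Inside a notebook, be aware that each ! command runs in its own temporary shell, so !cd does not change the notebook's working directory; the %cd magic or Python's os.chdir does. The sketch below shows the os.chdir route in a scratch directory of our own making (the "Figures" folder is created just for the demonstration) and then returns to where it started.

```python
import os
import tempfile

start = os.getcwd()                       # remember where we began (like pwd)

with tempfile.TemporaryDirectory() as scratch:
    os.mkdir(os.path.join(scratch, "Figures"))

    os.chdir(scratch)                     # like: cd FULL_PATH_NAME
    os.chdir("Figures")                   # like: cd Figures
    inside = os.path.basename(os.getcwd())
    print(inside)

    os.chdir("..")                        # like: cd ..
    os.chdir(start)                       # go back before the scratch folder is deleted

back_at_start = os.getcwd() == start
print(back_at_start)
```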

### Command line python scripts¶

As mentioned in the beginning of the lecture, you can run all the little programs you have been (and will be) writing directly from the command line. Here's one way to do that, using one of the many ["magic" commands](https://ipython.readthedocs.io/en/stable/interactive/magics.html#cell-magics) that work with Jupyter notebooks. Our first is:

%%writefile PATH_TO_FILE.py

which writes the contents of a cell to the specified text file.

Running this cell will place the contents of it (without the magic command) into a file in this directory called hello.py.

In [2]:
%%writefile ./hello.py
print ("Hello World!")
Writing ./hello.py

Now you can run the program from your command line (after navigating to this directory) by typing:

$ python hello.py

or from within this notebook:

In [25]:
!python hello.py
Hello World!

Alternatively, you can use a different magic command, %run, to execute an external file:

In [4]:
%run hello.py
Hello World!

To run the program on a Mac from the command line without typing python first, you need to do a few additional things.

1) You have to put this line at the top of the script:

#!/usr/bin/env python

This won't hurt you on a PC, it just isn't necessary.

In [5]:
%%writefile hello.py
#!/usr/bin/env python
print ("Hello World!")
Overwriting hello.py

2) The script must be executable. To find out whether a particular script is executable, type ls -al YOURSCRIPTNAME. Here it is in the notebook:

In [6]:
!ls -al hello.py
-rw-r--r-- 1 ltauxe staff 45 Feb 8 14:21 hello.py

The '-rw-r--r--' string at the front indicates who can do what with the script. The first character, '-', marks an ordinary file. The next three, 'rw-', say that the 'user' (me) can read and write (but not execute) the script. The next three are for the 'group' and the last three are for anyone (all). To make it executable, I need to use the Unix command chmod to set the permissions. To make it executable for everyone, I type chmod a+x, where the 'a' means all and the 'x' means 'executable'.

In [7]:
!chmod a+x hello.py
!ls -al hello.py
-rwxr-xr-x 1 ltauxe staff 45 Feb 8 14:21 hello.py

See how everybody has an 'x' now? Now you can run it either from the command line by typing

$ ./hello.py

or from within the notebook:

In [8]:
!./hello.py
Hello World!
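The permission change in step 2 can also be made from Python with the standard os and stat modules. This is a sketch under the assumption of a Mac or Linux machine (Windows handles permissions differently), and it works on a throwaway temporary file rather than hello.py.

```python
import os
import stat
import tempfile

# Create a throwaway script so we do not touch any real files.
fd, script = tempfile.mkstemp(suffix=".py")
os.close(fd)

os.chmod(script, 0o644)                          # rw-r--r--, readable but not executable
before = stat.filemode(os.stat(script).st_mode)

# Add the execute bit for user, group and others, like: chmod a+x
mode = os.stat(script).st_mode
os.chmod(script, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
after = stat.filemode(os.stat(script).st_mode)

print(before)
print(after)

os.remove(script)                                # clean up
```

stat.filemode produces the same ten-character permission string that ls -al shows at the start of each line.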

3) The last thing you have to worry about is that the directory containing the script must be in your PATH. We have been talking about paths (all lower case), but PATH is an environment variable on Unix-like operating systems, and also on DOS (what is used by PCs), that specifies a set of directories where the operating system looks for executable programs. So to run a program by name alone, it must be in a directory listed in your PATH. And to import a Python module from anywhere, its directory must be in your PYTHONPATH.

You can find out what your PATH is by typing echo $PATH on the command line (or in the notebook as shown here):

In [9]:
!echo $PATH
# your results will definitely vary!
/Users/ltauxe/anaconda3/bin:/Users/ltauxe/anaconda3/bin:/usr/local/Cellar/cmake/3.9.0/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/opt/X11/bin:/usr/texbin:/Library/TeX/texbin::/Users/ltauxe/PmagPy/programs/__pycache__/:/Users/ltauxe/PmagPy/programs/conversion_scripts/:/Users/ltauxe/PmagPy/programs/conversion_scripts2/:/Users/ltauxe/PmagPy/programs/deprecated/:/Users/ltauxe/PmagPy/programs/images/:/Users/ltauxe/PmagPy/programs/:/Applications/GMT-5.3.1.app/Contents/Resources/bin

By default, your working directory will not be in your PATH (for security reasons), so to run a script that is in your working directory, you must either put the directory in your PATH (not recommended) or use the full path name or the relative path name, e.g.,

./hello.py
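You can also inspect PATH from Python. The sketch below uses the standard os, shutil and sys modules; looking up "python" is just an example command, and the result may be None if no program of that name is on your PATH.

```python
import os
import shutil
import sys

# PATH is one long string; os.pathsep is ":" on Mac/Linux and ";" on Windows.
path_dirs = os.environ.get("PATH", "").split(os.pathsep)
for directory in path_dirs[:5]:          # show only the first few entries
    print(directory)

# shutil.which reports the full path of a command found via PATH, or None.
print(shutil.which("python"))

# sys.executable is the interpreter running this code.
print(sys.executable)
```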

Changing your PATH depends a lot on your particular operating system. Most recent Macs set the path in a hidden file in your home directory called .bash_profile. I recommend that you put all your Python scripts in some directory (say, Python) in your home directory. Then you can put this line in your .bash_profile file (for example using cat >> .bash_profile):

export PYTHONPATH=$PYTHONPATH:~/Python

(followed by control-D when using cat).

When you open a new terminal window, your PYTHONPATH environment variable should be set properly. If it is, you can import your Python scripts from any directory, including into a Jupyter notebook.
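Inside Python, the directories named in PYTHONPATH end up in the list sys.path, which is where import looks for modules. The sketch below imitates that by hand: it writes a hypothetical module called hello_module.py into a scratch directory (standing in for your ~/Python folder), adds that directory to sys.path, and imports it. The module and function names are made up for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as scratch:
    # A hypothetical module file, standing in for a script kept in ~/Python.
    Path(scratch, "hello_module.py").write_text(
        "def greet():\n    return 'Hello World!'\n"
    )

    sys.path.append(scratch)             # what PYTHONPATH arranges for you at startup
    importlib.invalidate_caches()
    hello_module = importlib.import_module("hello_module")
    greeting = hello_module.greet()
    print(greeting)

    sys.path.remove(scratch)             # tidy up the search path
```

Printing sys.path in a notebook is a quick way to check whether your PYTHONPATH setting took effect.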

In [10]:
#clean up a bit
!rm hello.py