# Pay no attention to this cell
# All will be revealed in due time.
import pandas as pd
from IPython.display import Image
syllabus = pd.read_csv('Datasets/syllabus_2019.csv', header=0)
syllabus = syllabus.fillna("")
syllabus.index = range(1, len(syllabus) + 1)
Computers are essential to all modern Earth Science research. We use them for compiling and analyzing data, preparing illustrations like maps or data plots, writing manuscripts, and so on. In this class, you will learn to write computer programs with special applications useful to Earth Scientists. We will learn Python, an object-oriented programming language, and use Jupyter notebooks to write our Python programs.
So, why learn Python? Because it is:
|1||Intro to notebooks, file systems and paths|
|2||Variables and Operations|
|4||Dictionaries, program loops (if, while and for)|
|5||functions and modules|
|6||NumPy and matplotlib|
|8||Pandas, file I/O|
|9||data wrangling with Pandas|
|10||object oriented programming|
|11||lambda, map, filter, reduce, list comprehension|
|12||Pandas filtering and exceptions|
|13||subplots, bar charts, pie charts|
|14||histograms and cumulative distribution functions|
|16||line and curve fitting|
|17||visualization with seaborn|
|19||gridding and contouring|
|20||rose diagrams and equal area projections|
|21||matrix math - dot and cross products|
|22||plotting great and small circles|
|23||Graphical User Interfaces (GUIs) and animations|
|25||3D plots of points and surfaces|
|26||Time series - periodograms|
Now we get down to business. In this lecture we will:
This class is entirely structured around a special programming environment called Jupyter notebooks. A Jupyter notebook is a development environment where you can write, debug, and execute your programs.
If you are taking this class through UCSD, make a folder on your Desktop to keep material for this class. If you haven't already, download the zip file for this lecture from TritonEd. Unzip the file and put the folder into your class folder.
If you are using the version cloned from github you already have everything. Some of the lectures might be updated in the future though, so the version you have may not be final.
To do this, you will need to discover the hidden secret of your computer, the Terminal window (or Anaconda Prompt). This little window provides a command line interface in which you can type commands to the operating system. On a Mac, you can find the Terminal program by clicking on the search icon and typing terminal.app. Double-click on it to open a terminal window.
On a PC, you should use the Anaconda Prompt which you can find in your programs after you install Anaconda Python.
Let's open a terminal window (command prompt).
When you fire up a terminal window, you are by default in your home directory (in MacOS UNIX, that would be /Users/YOURUSERNAME).
To launch a Jupyter notebook, simply type jupyter notebook as shown above. That will open up a browser window (use Firefox, Safari or Chrome - NOT Internet Explorer). Find your class folder and click on Lecture_01.ipynb. You should now be looking at this notebook!
Jupyter notebooks have two basic types of cells:
Markdown: for typesetting notes. This cell is an example of a markdown cell. Here is a "cheatsheet" for markdown typesetting: https://medium.com/ibm-data-science-experience/markdown-for-jupyter-notebooks-cheatsheet-386c05aeebed if you are hungry for more.
Code: for writing python code
You can insert a new cell by selecting Insert Cell Below in the drop-down menu:
You change the cell "flavor" with the menu that defaults to 'Code' and can be changed to "Markdown".
And you "execute" a cell (either typeset or run the code) by clicking on the run key (sideways triangle with vertical line) or select Run Cells under the Cell drop-down menu.
In a code block, you can only type valid python statements EXCEPT
after a pound sign (#) - everything after that will be ignored.
That is how you write "comments" in your code to remind yourself or tell others what you were thinking:
# I can type anything here
but not here
File "<ipython-input-8-bee698e92c8a>", line 2
    but not here
    ^
SyntaxError: invalid syntax
That was an example of a bug, which could be fixed by commenting out the second line or by making it a valid statement:
# I can type anything here
# but not here
print("but not here")
but not here
Now open the notebook called Lecture_01_Practice_Problems and complete the first three tasks. Then come back to this notebook.
Congratulations! You just wrote your first Python program.
Now we will discuss file systems, paths, and the command line. Why? Because whenever you import an image, document, or spreadsheet into the Jupyter notebook you have to tell Jupyter where in the computer the file is located. Moreover, there are many command line functions that come in handy. For example, you can look at the first few lines of a file before you import it into the notebook. You could also write all of your programs in a text editor and run those programs from the command line. You could then run your programs from anywhere on your computer without opening a Jupyter notebook. We will do that in Lecture 23, for example.
The organization of computers is based on a file system. The file system is hierarchical, so at the top you'll find the root directory (or, for Mac and PC users, a folder). The root directory contains files and other folders, which may in turn contain files and folders, and so on. The result is a tree of files and folders that makes up the file system. The following figure is an example of a computer's file system:
You are probably familiar with the images like that to the left. The text to the right shows the exact same thing - but from your computer's viewpoint. Both the image to the left and the text to the right show you how to access the folder "Desktop". On the left, you access the folder "Desktop" by clicking on 'icons' that represent different folders and sub-folders until you arrive at "Desktop". Later in this lecture, we'll show you how to access the same folder using its path (the text to the right).
Macs and PCs both have functions that can be called from a command line, such as listing the contents of a folder or file, creating new folders, changing permissions on files or folders, combining the contents of files, moving files and folders around, and so on. These commands are directed to the operating system instead of the Python interpreter.
Before we begin, note that we can execute many operating system commands from within a Jupyter notebook. To signal to Jupyter that a command is not for Python but for the operating system, type a "!" (bang) in front of the command. [It isn't strictly required for some common commands, but it does help separate what is a command line command from what is Python code.]
Let's learn our first UNIX command, which lists the contents of a directory, ls.
Datasets Lecture_01_Practice_Problems.ipynb Figures Lecture_01_syllabus.ipynb
Another useful command is mkdir, which creates a new directory. Please note that directory means the same thing as folder. In a graphical operating system with icons, the term folder makes sense because they look like folders, whereas the operating system traditionally refers to them as directories. Never mind!
To see if that worked, list the contents again:
Datasets Lecture_01_syllabus.ipynb Figures MYNEWDIRECTORY Lecture_01_Practice_Problems.ipynb
And sure enough, there it is. The command rmdir deletes a directory (as long as it is empty).
Make sure it was removed:
Datasets Lecture_01_Practice_Problems.ipynb Figures Lecture_01_syllabus.ipynb
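For comparison, here is a minimal sketch of the same three steps (list, make a directory, remove it) done from Python instead of the command line. It uses the standard os module and works in a temporary scratch directory, which is an assumption made here so that nothing in your class folder is touched.

```python
import os
import tempfile

# Work inside a throwaway directory so the class folder stays untouched.
with tempfile.TemporaryDirectory() as scratch:
    new_dir = os.path.join(scratch, "MYNEWDIRECTORY")

    os.mkdir(new_dir)                            # like: mkdir MYNEWDIRECTORY
    after_mkdir = sorted(os.listdir(scratch))    # like: ls
    print(after_mkdir)

    os.rmdir(new_dir)                            # like: rmdir (directory must be empty)
    after_rmdir = sorted(os.listdir(scratch))
    print(after_rmdir)
```

The first print shows the new directory in the listing and the second shows that it is gone again.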
We can list the contents of a file with the UNIX command, cat (which comes from concatenate).
Hi there students! Thanks for joining this class!
Usually, the output of cat is sent to the screen, but UNIX has tricky ways of redirecting output to other files. For example, if we combine cat with the symbol > we can redirect the output to another file, instead of to the screen:
!cat Datasets/myfile.txt >newfile.txt
!ls
Datasets Lecture_01_syllabus.ipynb Figures newfile.txt Lecture_01_Practice_Problems.ipynb
So what did we create? We created a copy of myfile.txt called newfile.txt. If you repeat the command, you will overwrite the existing output file.
To append to the end of a file (actually concatenate), we use the symbol >>:
!cat Datasets/myfile.txt >>newfile.txt
!cat newfile.txt
Hi there students! Thanks for joining this class! Hi there students! Thanks for joining this class!
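Python makes the same distinction between overwriting and appending when it opens a file. The sketch below is only an analogy to > and >>, not what the shell does internally: mode "w" replaces the file and mode "a" adds to the end. The scratch directory and the variable names are assumptions made for this illustration.

```python
import os
import tempfile

message = "Hi there students! Thanks for joining this class!\n"

with tempfile.TemporaryDirectory() as scratch:
    new_file = os.path.join(scratch, "newfile.txt")

    # Mode "w" behaves like > : the file is created, or overwritten if it exists.
    with open(new_file, "w") as f:
        f.write(message)
    with open(new_file, "w") as f:
        f.write(message)
    with open(new_file) as f:
        after_overwrite = f.read()

    # Mode "a" behaves like >> : new text is added to the end of the file.
    with open(new_file, "a") as f:
        f.write(message)
    with open(new_file) as f:
        after_append = f.read()

print(after_overwrite.count("Hi there"))  # writing twice with "w" still leaves one copy
print(after_append.count("Hi there"))     # appending once gives two copies
```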
There are a few other useful redirect symbols: < and |. The first, <, takes the contents of a file and feeds it to the command as input. The second, | (pipe), takes the output of the first command and 'pipes' it to a second. So we could do:
!cat Datasets/myfile.txt |cat
Hi there students! Thanks for joining this class!
... which is a little silly as it just does the same thing as the first command, but you don't know any other commands right now so...
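To see a pipe do something less silly, here is a rough sketch in Python using the standard subprocess module. It assumes two tiny stand-in "commands" (both are just Python one-liners invented for this example): the first prints a message and the second upper-cases whatever it receives. The output of the first is handed to the second as its input, which is the job the | symbol does on the command line.

```python
import subprocess
import sys

# First command: a tiny Python one-liner that prints a message (stands in for cat).
first = subprocess.run(
    [sys.executable, "-c", "print('Hi there students!')"],
    capture_output=True, text=True, check=True,
)

# Second command: reads its standard input and upper-cases it.
second = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.stdin.read().upper(), end='')"],
    input=first.stdout, capture_output=True, text=True, check=True,
)

piped_output = second.stdout
print(piped_output)
```

Don't worry about the details of subprocess for now; the point is only that one program's output became another program's input.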
So far, we have just looked at directories in our working directory (the one with this notebook in it) and subdirectories within the working directory. Earlier in the lecture you were shown a figure with icons on the left and text on the right. The text to the right was a series of directories separated by '/'. These are the paths to those files. A path is the unique location of a file or a directory in a file system of an OS.
Now that you know more about paths, let's take a detour and learn how to embed figures directly into a Jupyter notebook. You saw this in several lectures, but were told to ignore it. The Image class in the module IPython.display allows us to embed many digital image types (png, jpg...) into a Jupyter notebook. If you take a look at the first cell of this lecture, we have already imported Image from IPython.display.
If you want to display a figure, you will use Image and the path to the figure. The path to the figure we want to display is "Figures/FileSystem.jpg". This tells the operating system to find the folder labeled "Figures" and then grab the file inside that is labeled "FileSystem.jpg". This is a relative path because the location is with respect to the directory that the notebook is in.
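As a sketch, displaying the figure with that relative path could look like the cell below. The helper function load_figure is not part of IPython; it is a hypothetical wrapper added here so the cell does not crash if the file is missing or IPython is not installed. In the notebook itself, Image(filename="Figures/FileSystem.jpg") on its own is enough.

```python
from pathlib import Path

# Relative path from the lecture: folder "Figures", file "FileSystem.jpg".
figure_path = Path("Figures") / "FileSystem.jpg"

def load_figure(path):
    """Return an IPython Image for path, or None if the file or IPython is missing."""
    if not path.is_file():
        return None
    try:
        from IPython.display import Image
    except ImportError:
        return None
    return Image(filename=str(path))

figure = load_figure(figure_path)
figure  # as the last line of a notebook cell, this displays the image
```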
The paths in this figure are absolute paths, which uniquely define the location of the file or directory from anywhere on the computer. Relative paths are handy shortcuts. For example, we can refer to a directory above the current directory without necessarily knowing its name, using these conventions:
./ is the current directory
../ is the one above
../../ is the one above that
and so on.
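Here is a small sketch of what those shortcuts resolve to. It assumes a hypothetical working directory of /Users/YOURUSERNAME/Desktop and uses the standard posixpath module, which only manipulates the path text and never touches the disk.

```python
import posixpath  # pure text handling of '/'-separated paths, no file access

working_dir = "/Users/YOURUSERNAME/Desktop"   # hypothetical working directory

here = posixpath.normpath(posixpath.join(working_dir, "./"))
one_up = posixpath.normpath(posixpath.join(working_dir, "../"))
two_up = posixpath.normpath(posixpath.join(working_dir, "../../"))

print(here)     # /Users/YOURUSERNAME/Desktop
print(one_up)   # /Users/YOURUSERNAME
print(two_up)   # /Users
```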
Instead of using 'relative' directories, it is often desirable to refer to directories in an absolute sense, i.e., relative to the root directory '/'.
To find out what the absolute path for your current directory is, use pwd for 'print working directory':
To refer to your home directory, just use the shortcut ~:
AnacondaProjects Movies VirtualBox VMs Applications MultiDrive anaconda3 Creative Cloud Files Music bin Cubit-13.0 Pdfs enthought Desktop Pictures log4j Documents PmagPy logs Downloads Programs personal_stuff Dropbox Public profiles.bin Google Drive Python reprints Library Sites src MagIC SpareRoom webpasswords Meetings_2018 TeXShop
I guess I should clean up my Desktop!
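Python can answer the same two questions. This is a minimal sketch using the standard os and pathlib modules; the variable names are just for illustration. Note that Python does not expand ~ on its own, so a path typed with ~ has to be passed through os.path.expanduser first.

```python
import os
from pathlib import Path

working_dir = Path.cwd()               # like: pwd
home_dir = Path.home()                 # the directory that ~ stands for
expanded = os.path.expanduser("~")     # turns the ~ shortcut into a full path

print(working_dir)
print(home_dir)
print(expanded)
```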
To move from one directory to another, use the command cd (for change directory). You can cd to a directory using any of the following:
As mentioned in the beginning of the lecture, you can run all the little programs you have been (and will be) writing directly from the command line. Here's one way to do that, using one of the many ["magic" commands](https://ipython.readthedocs.io/en/stable/interactive/magics.html#cell-magics) that work with Jupyter notebooks. Our first is:
which writes the contents of a cell to the specified text file.
Running this cell will place the contents of it (without the magic command) into a file in this directory called hello.py.
%%writefile ./hello.py
print("Hello World!")
Now you can run the program from your command line (after navigating to this directory) by typing:
$ python hello.py
or from within this notebook:
Alternatively, you can use a different magic command, %run, to execute an external file (for example, %run hello.py).
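For the curious, a Python program can also launch another Python script itself. The sketch below assumes a scratch directory (so your own hello.py is left alone), saves the one-line program there, and runs it with the standard subprocess module, which is roughly what typing python hello.py does.

```python
import os
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as scratch:
    script = os.path.join(scratch, "hello.py")

    # Same effect as the %%writefile cell: save a one-line program to disk.
    with open(script, "w") as f:
        f.write('print("Hello World!")\n')

    # Same effect as typing: python hello.py
    result = subprocess.run(
        [sys.executable, script], capture_output=True, text=True, check=True
    )

print(result.stdout)
```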
To run the program on a Mac without a python command first from the command line, you need to do a few additional things.
1) You have to put this line at the top of the script:
This won't hurt you on a PC, it just isn't necessary.
%%writefile hello.py
#!/usr/bin/env python
print("Hello World!")
2) The script must be executable. To find out whether a particular script is executable, type:
ls -al YOURSCRIPTNAME
here it is in the notebook:
!ls -al hello.py
-rw-r--r-- 1 ltauxe staff 45 Feb 8 14:21 hello.py
The '-rw-r--r--' string at the front indicates who can do what with the script. The first character, '-', means this is an ordinary file (a directory would show a 'd'). The next three characters, 'rw-', say that the 'user' (me) can read and write (but not execute) the script. The next three are for the 'group' and the last three are for anyone (all).
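To make the layout of that string concrete, here is a small sketch that slices it into its four parts. The function describe is made up for this example and only understands the plain r, w and x letters.

```python
listing = "-rw-r--r--"   # permission string copied from the ls -al output above

file_type = listing[0]   # '-' for an ordinary file, 'd' for a directory
user = listing[1:4]      # permissions for the owner
group = listing[4:7]     # permissions for the group
others = listing[7:10]   # permissions for everyone else

def describe(bits):
    """Spell out one three-letter block, e.g. 'rw-' becomes 'read, write'."""
    names = {"r": "read", "w": "write", "x": "execute"}
    allowed = [names[ch] for ch in bits if ch in names]
    return ", ".join(allowed) if allowed else "nothing"

print("user  :", describe(user))    # user  : read, write
print("group :", describe(group))   # group : read
print("others:", describe(others))  # others: read
```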
To make it executable, I need to use the Unix command: chmod to set the permissions. To make it executable for everyone, I type:
where the 'a' means all and the 'x' means 'executable'.
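The same change can be made from Python. This is a sketch for Mac or Linux (permissions work differently on Windows), using the standard os and stat modules on a scratch copy of the script. It starts from rw-r--r-- on purpose so that the before and after strings are predictable.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as scratch:
    script = os.path.join(scratch, "hello.py")
    with open(script, "w") as f:
        f.write('#!/usr/bin/env python\nprint("Hello World!")\n')

    os.chmod(script, 0o644)                           # start from rw-r--r--
    before = stat.filemode(os.stat(script).st_mode)

    # Add the execute bit for user, group and others (like chmod a+x).
    everyone_x = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
    current = stat.S_IMODE(os.stat(script).st_mode)   # keep only the permission bits
    os.chmod(script, current | everyone_x)
    after = stat.filemode(os.stat(script).st_mode)

print(before)   # -rw-r--r-- on Mac or Linux
print(after)    # -rwxr-xr-x on Mac or Linux
```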
!chmod a+x hello.py
!ls -al hello.py
-rwxr-xr-x 1 ltauxe staff 45 Feb 8 14:21 hello.py
See how everybody has an 'x' now? Now you can run it either from the command line by typing
or from within the notebook:
3) The last thing you have to worry about is that the directory containing the script must be in your PATH. We have been talking about paths (all lower case), but PATH is an environment variable on Unix-like operating systems and also on DOS/Windows (what is used by PCs) that specifies a set of directories where the operating system looks for executable programs. So to run a program by name alone, it must be in one of the directories in your PATH. And to import a Python program from anywhere, its directory must be in your PYTHONPATH.
You can find out what your PATH is, by typing echo $PATH on the command line (or in the notebook as shown here):
!echo $PATH # your results will definitely vary!
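The same information is available from inside Python. Here is a minimal sketch using the standard os and shutil modules; the variable names are just for illustration, and the directories you see will depend entirely on your own computer.

```python
import os
import shutil

# PATH is one long string; os.pathsep is ':' on Mac/Linux and ';' on Windows.
path_value = os.environ.get("PATH", "")
path_dirs = [d for d in path_value.split(os.pathsep) if d]

for directory in path_dirs:
    print(directory)

# shutil.which searches those directories, much as the command line would.
print(shutil.which("python"))   # full path to a python program, or None if not found
```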
By default, your working directory will not be in your PATH (for security reasons), so to run a script that is in your working directory, you must either put it in your PATH (not recommended) or use the full path name or the relative path name, e.g.,
Changing your PATH depends a lot on your particular operating system. Most recent Macs set the path in a hidden file in your home directory called .bash_profile. I recommend that you put all your Python scripts in some directory (say, Python) in your user directory. Then you can put these lines in your .bash_profile file (for example using cat>>.bash_profile).
(followed by control-D when using cat).
When you open a new terminal window, your PATH environment variable should be set properly. If it is, you can use your python scripts from any directory and also import them into a Jupyter notebook.
# clean up a bit
!rm hello.py