#!/usr/bin/env python
# coding: utf-8

# In[4]:

# Pay no attention to this cell
# All will be revealed in due time.
import pandas as pd
import os
from pathlib import Path
from IPython.display import Image
syllabus=pd.read_csv('Datasets/syllabus_2020.csv',header=0)
syllabus=syllabus.fillna("")
syllabus.index = range(1,len(syllabus)+1)

# ## Python Programming for Earth Science Students
#
# Authors: Lisa Tauxe, ltauxe@ucsd.edu, Hanna Asefaw, hasefaw@ucsd.edu, & Brendan Cych, bcych@ucsd.edu
# Instructor: Lisa Tauxe, ltauxe@ucsd.edu
# TAs: Brendan Cych, bcych@ucsd.edu, Shelby Jones, saj012@ucsd.edu
#
# ### Computers in Earth Science
#
# Computers are essential to all modern Earth Science research. We use them for compiling and analyzing data, preparing illustrations like maps or data plots, writing manuscripts, and so on. In this class, you will learn to write computer programs with special applications useful to Earth Scientists. We will learn Python, an object-oriented programming language, and use Jupyter notebooks to write our Python programs.
#
# ### Python
#
# So, why learn Python? Because:
#
# - It is flexible, freely available, and cross-platform
# - It is easier to learn than many other languages
# - It has many numerical, statistical and visualization packages
# - It is well supported and has lots of online documentation
# - The name 'Python' refers to 'Monty Python' - not the snake - and many examples in the Python documentation use jokes from the old Monty Python skits. If you have never heard of Monty Python, look it up on YouTube; you are in for a treat.
#
# Which Python?
#
# - Python underwent a transition from 2.7 to 3. The notebooks in this class, apart from a few exceptions, are compatible with both, but they have only been tested on Python 3, so that is what you should be using.
# - If you decide to use a personal computer, we recommend that you install the most recent version of Anaconda Python for your operating system:
#   https://www.anaconda.com/download/
#   You will also need a few extra packages (cartopy, version 0.17.0; geopandas, version 0.7.0; and descartes, version 1.1.0), which can be installed with little hassle.

# In[2]:

syllabus[['Topic','Date','Assignment']]

# ## Lecture 1
#
# Now we get down to business. In this lecture we will:
#
# - Learn to find your command line interface.
# - Learn how to launch a Jupyter notebook from the command line interface.
# - Learn basic notebook anatomy.
# - Learn some basic Python operating system commands.
# - Learn about the concept of **PATH**.
# - Turn in your first practice problem notebook.
#
# ### Jupyter notebooks and Jupyter Hub
#
# This class is entirely structured around a special programming environment called [Jupyter notebooks](https://jupyter-notebook-beginner-guide.readthedocs.io/en/latest/what_is_jupyter.html). A Jupyter notebook is a development environment where you can write, debug, and execute your programs.
#
# If you are taking this class through UCSD, you will be using the Jupyter Hub site. When working on practice problems, rename the practice problem notebooks by going to File > Rename. Once finished, save and go to File > Download As > Notebook (.ipynb) and upload this notebook to Canvas. If you don't want to install Python on your computer, skip to the 'Jupyter notebook anatomy' section.
#
# Alternatively, you can install Anaconda Python on your machine (see below) and work on the lectures there. If you want to be able to open and use notebooks after the class is over, you should do this.
#
# If you are using the version cloned from GitHub, you already have everything. Some of the lectures might be updated in the future though, so the version you have may not be final.
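# If you work on your own machine, a quick check like the one sketched below can tell you which Python you are running and whether the extra packages mentioned above can be found. This is only an illustration: the helper name `missing_packages` is made up for this example, and it only checks that a package can be located, not which version is installed.

```python
import sys
import importlib.util

def missing_packages(names):
    """Return the names in `names` that Python cannot find for import."""
    # find_spec returns None when a top-level package is not installed
    return [name for name in names if importlib.util.find_spec(name) is None]

# major and minor version of the Python running this code
print(sys.version_info.major, sys.version_info.minor)

# any package listed here still needs to be installed
print(missing_packages(['cartopy', 'geopandas', 'descartes']))
```

# An empty list from the last line means all three packages were found.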
# ### OPTIONAL: Installing Anaconda Python and Opening Jupyter Notebooks
#
# To install Anaconda Python, go to https://www.anaconda.com/download/ and follow the install instructions for your operating system.
# To open Jupyter notebooks, you will need to discover the hidden secret of your computer, the _Terminal window_. This little window provides a _command line interface_ in which you can type commands to the operating system. On a Mac, you can find the terminal window through the program **Terminal**: click on the search icon, type terminal.app, and double-click on the result. On Windows, use the start menu to search for the program 'Anaconda Prompt'. On Linux, press Ctrl+Alt+T to open the terminal.

# In[3]:

Image(filename='Figures/terminal_mac.png')

# Let's open a terminal window and launch Jupyter notebook. On PC, Mac or Linux, you can do this by typing
#
# `jupyter notebook` and hitting return

# In[4]:

Image(filename='Figures/terminal.png',width=500)

# When you fire up a terminal window, you are by default in your **home** directory (on macOS, that would be **/Users/YOURUSERNAME**).
#
# To launch a Jupyter notebook, simply type jupyter notebook as shown above. That will open up a browser window. Find your class folder and click on Lecture_01.ipynb. You should now be looking at this notebook!

# ### Make a copy of the lecture
#
# You should not modify this lecture; if you do, your changes will quite likely be overwritten when you update your directory with a new version. Instead, work on a copy. To make one, open the File menu at the top of this page:

# In[15]:

Image(filename='Figures/copy.png',width=600)

# Choose 'Make a Copy'. This will protect the original and you can goof around with the copy all you like. But first, you need to know a few things about Jupyter notebooks.

# ### Jupyter notebook anatomy
#
# Jupyter notebooks have two basic types of _cells_:
#
# - Markdown: for typesetting notes. This cell is an example of a markdown cell.
#   Here is a "cheatsheet" for markdown typesetting if you are hungry for more: https://medium.com/ibm-data-science-experience/markdown-for-jupyter-notebooks-cheatsheet-386c05aeebed
#
# - Code: for writing Python code
#
# You can insert a new cell by selecting Insert Cell Below in the drop-down menu:

# In[16]:

Image(filename='Figures/insertCell.png',width=600)

# Cell types default to 'Code', but you can change the cell type to 'Markdown' with the box labeled 'Markdown' on the menu bar. Click on the little downward arrow to change this cell to Code. Be sure to change it back!
#
# You "execute" a cell (either _typeset_ the text or _run_ the code) by clicking on the run key (sideways triangle with vertical line) or by selecting Run Cells under the _Cell_ drop-down menu.

# In[17]:

Image(filename='Figures/menuBar.png',width=600)

# In a code block, you can only type valid Python statements EXCEPT
# after a pound sign (#) - everything after that will be ignored.
# That is how you write "comments" in your code to remind yourself or tell others what you were thinking:

# In[7]:

# I can type anything here
but not here

# That was an example of a _bug_, which could be fixed by commenting out the second line or by making it a valid statement:

# In[52]:

# I can type anything here
# but not here
print ("but not here")

# ### Practice Problems
#
# Now open the notebook called Lecture_01_Practice_Problems. To open it, click on "File" and select "Open". If a file called Lecture_01_Practice_Problems.ipynb is visible, just click on it. But if you are using the datahub or GitHub versions of the class (most of you), all the Practice Problems are in a folder called "Practice\_Problems". Click on that folder icon, then open the Lecture_01_Practice_Problems notebook.
#
# Complete the first three tasks. Then come back to this notebook.
#
# Congratulations! You just wrote your first Python program.

# ### Basic operating system commands
#
# Now we will discuss file systems, paths, and the command line. Why?
# Because whenever you import an image, document, or spreadsheet into the Jupyter notebook, you have to tell Jupyter where in the computer the file is located. Moreover, there are many command line functions that come in handy. For example, you can look at the first few lines of a file before you import it into the notebook. You could also write all of your programs in a text editor and run those programs from the command line. You could then run your programs from anywhere on your computer instead of only from within a Jupyter notebook. We will do that in Lecture 23, for example.
#
# ### File systems
#
# The organization of computers is based on a _file system_. The file system is hierarchical, so at the top you'll find the _root directory_ (a _directory_ is what Mac and PC users call a _folder_). The root directory contains files and other folders, which may also contain files and folders, and so on. This continues, resulting in a tree of files and folders that make up the file system. The following figure is an example of a computer's file system:

# In[53]:

Image(filename='Figures/FileSystem.jpg')

# You are probably familiar with images like the one on the left. The text on the right shows the exact same thing - but from your computer's viewpoint. Both the image on the left and the text on the right show you how to access the folder "Desktop". On the left, you access the folder "Desktop" by clicking on 'icons' that represent different folders and sub-folders until you arrive at "Desktop". Later in this lecture, we'll show you how to access the same folder using its path (the text on the right).

# ### Survival operating system commands
#
# Macs and PCs both have functions that can be called from a _command line_, such as listing the contents of a folder or file, creating new folders, changing permissions on files or folders, combining the contents of files, moving files and folders around, and so on. These commands are directed to the operating system instead of the Python interpreter.
# To make these actions independent of your particular operating system, Python has a built-in toolkit called **os** (for operating system). We imported it in the first cell and will now figure out how to use it.
#
# Let's learn our first operating system command, which lists the contents of a directory, **os.listdir()**. This returns a list (not in any particular order) of all the things in the directory containing this notebook:

# In[12]:

os.listdir()

# You can ignore anything with a '.' in front of it (.DS\_Store and .ipynb\_checkpoints in this example).
#
# Another useful command is **os.mkdir()**, which creates a new directory. Please note that _directory_ means the same thing as _folder_. It is just that in a graphical operating system with icons, the term _folder_ makes sense. They look like folders. Whereas to the operating system, they are traditionally referred to as _directories_. Never mind!

# In[15]:

os.mkdir('MYNEWDIRECTORY')

# To see if that worked, list the contents again:

# In[16]:

os.listdir()

# And sure enough, there it is. The command **os.rmdir()** deletes a directory:

# In[17]:

os.rmdir('MYNEWDIRECTORY')

# Make sure it was removed:

# In[18]:

os.listdir()

# Yup. It's gone.
#
# Another handy thing is to view the contents of a file. To do this in Python, we use the command **open( ).readlines( )**. This will spit the contents out, as a list of lines, for your viewing pleasure.

# In[37]:

open('Datasets/myfile.txt').readlines()

# In[38]:

contents=open('Datasets/myfile.txt').read()
output=open('newfile.txt','w') # open a file for writing
output.write(contents) # write the contents
output.close() # close the file

# So what did we create? We created a copy of **myfile.txt** called **newfile.txt**. If you repeat the command, you will overwrite the existing output file.
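# The same copy can also be written with Python's `with` statement, which closes each file automatically when the indented block ends. The sketch below is only an illustration: the helper name `copy_text` and the file names are made up for this example, and it works in a temporary scratch directory so that nothing in the class folder is touched.

```python
import os
import tempfile

def copy_text(source, destination):
    """Copy a text file; 'w' mode overwrites destination if it exists."""
    with open(source) as infile:             # closed automatically
        contents = infile.read()
    with open(destination, 'w') as outfile:  # 'w' starts from an empty file
        outfile.write(contents)

# try it out in a scratch directory that is deleted afterwards
with tempfile.TemporaryDirectory() as scratch:
    original = os.path.join(scratch, 'original.txt')
    duplicate = os.path.join(scratch, 'duplicate.txt')
    with open(original, 'w') as f:
        f.write('line 1\nline 2\n')
    copy_text(original, duplicate)
    copy_text(original, duplicate)  # repeating overwrites, it does not append
    with open(duplicate) as f:
        copied_lines = f.readlines()

print(copied_lines)  # ['line 1\n', 'line 2\n']
```

# Because the second call overwrites the first copy, the duplicate still holds only two lines.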
# In[39]:

open('newfile.txt').readlines()

# To append to the end of a file, we use the 'a' argument instead of 'w' in the **open( )** command:

# In[40]:

output=open('newfile.txt','a') # open a file for appending
output.write(contents) # write the contents
output.close() # close the file

# In[41]:

open('newfile.txt').readlines()

# To delete a file (analogous to deleting a directory), we use the command **os.remove( )**.

# In[42]:

os.remove('newfile.txt')

# ### Concept of path
#
# So far, we have just looked at directories in our working directory (the one with this notebook in it) and subdirectories within the working directory. Earlier in the lecture you were shown a figure with icons on the left and text on the right. The text on the right was a series of directories separated by '/'. These are the _paths_ to those files. A _path_ is the unique location of a file or a directory in the file system of an operating system.
#
# Now that you know more about paths, let's take a detour and learn how to embed figures directly into a Jupyter notebook. You have already seen this several times in this lecture, but were told to ignore it. The **Image** class in the module **IPython.display** allows us to embed many digital image types (png, jpg...) into a Jupyter notebook. If you take a look at the first cell of this lecture, we have already imported **Image** from **IPython.display**.
#
# If you want to display a figure, you will use **Image** and the path to the figure. The _path_ to the figure we want to display is "Figures/FileSystem.jpg". This tells the operating system to find the folder labeled "Figures" and then grab the file inside that is labeled "FileSystem.jpg". This is a _relative path_ because the location is with respect to the directory that the notebook is in.

# In[21]:

Image(filename='Figures/FileSystem.jpg')

# The _paths_ in this figure are _absolute paths_, which uniquely define the location of the file or directory from anywhere on the computer. _Relative paths_ are handy shortcuts.
# For example, we can refer to the directory above the current directory without necessarily knowing its name. We use these conventions:
#
# ./ is the current directory
#
# ../ is the one above
#
# ../../ is the one above that
#
# and so on.
#
# Instead of using 'relative' directories, it is often desirable to refer to directories in an absolute sense, i.e., relative to the _root_ directory '/'.
#
# To find the _absolute path_ of your current directory, use **os.getcwd( )** to get the current working directory:

# In[54]:

os.getcwd()

# To find the path to your home directory, we use another Python command, **Path.home()**. To use this, we sneakily already imported the toolkit **Path** in the first cell of this notebook, so we are all set (your results will vary).

# In[56]:

Path.home()

# And use that in the **os.listdir( )** command to get a listing of our home directory:

# In[57]:

os.listdir(Path.home())

# I guess I should clean up my Desktop!

# To change the name of a directory, use the command **os.rename( )**.

# In[63]:

os.mkdir('TEMP')
print (os.listdir())
os.rename('TEMP','NEWTEMP')
print (os.listdir())
os.rmdir('NEWTEMP')
print (os.listdir())

# ### Command line python scripts
#
# As mentioned in the beginning of the lecture, you can run all the little programs you have been (and will be) writing directly from the command line. Here's one way to do that, using one of the many ["magic" commands](https://ipython.readthedocs.io/en/stable/interactive/magics.html#cell-magics) that work with Jupyter notebooks. Our first is:
#
#     %%writefile PATH_TO_FILE.py
#
# which writes the contents of a cell to the specified text file.
#
# Running the next cell will place its contents (without the magic command) into a file in this directory called _hello.py_.
# In[64]:

get_ipython().run_cell_magic('writefile', './hello.py', 'print ("Hello World!")\n')

# Now you can run the program from your command line (after navigating to this directory) by typing:
#
#     $ python hello.py
#
# or from within this notebook:

# In[65]:

get_ipython().system('python hello.py')

# Alternatively, you can use a different _magic_ command, %run, to execute an external file:

# In[66]:

get_ipython().run_line_magic('run', 'hello.py')

# The last thing you have to worry about is your **PATH**. We have been talking about _paths_ (all lower case), but **PATH** is an "environment variable": a list of directories that the operating system searches when you type the name of a program. So to run a program by name alone, the directory containing it must be in your **PATH**. And to import a Python module from anywhere, its directory must be in your **PYTHONPATH**.
#
# You can find out what your **PATH** is by looking up **os.environ\['PATH'\]**:

# In[78]:

os.environ['PATH'] # your results will vary!

# By default, your working directory will not be in your **PATH** (for security reasons), so to run a script that is in your working directory, you must either put the directory in your **PATH** (not recommended) or use the full path name or the relative path name, e.g.,
#
#     ./hello.py
#
# Changing your **PATH** depends a lot on your particular operating system and is beyond the scope of this lecture.

# In[80]:

#clean up a bit
os.remove('hello.py')
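
# The **PATH** string shown above is easier to read when it is split into its individual directories. A minimal sketch, assuming only the standard **os** module; the helper name `path_directories` is made up for this example. It uses `os.pathsep`, the separator your operating system puts between directories in **PATH** (':' on Mac and Linux, ';' on Windows).

```python
import os

def path_directories(path_string=None):
    """Split a PATH-style string into a list of directories."""
    if path_string is None:
        # default to this computer's PATH (empty string if it is not set)
        path_string = os.environ.get('PATH', '')
    # drop empty entries left by doubled or trailing separators
    return [d for d in path_string.split(os.pathsep) if d]

# one directory per line; your results will vary!
for directory in path_directories():
    print(directory)

# is the current working directory in PATH? (usually it is not)
print(os.getcwd() in path_directories())
```

# The operating system searches these directories in the order listed when you type the name of a program.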