Author: Brian Stucky
Press shift+enter to run the code in the cell. shift+enter will also open a new cell below the active cell if there is not already a cell there.
Writing literal values in Python: Numbers are written as, e.g., 12 or 3.141592654, and literal text values, called strings, are written as, e.g., 'this is a string' or "this is a string".
The print() function writes output to the console.
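As a minimal sketch of the points above, the following cell prints a few literal values. The specific values are only illustrations.

```python
# Print a number literal and a string literal directly.
print(12)
print(3.141592654)
print('this is a string')
print("this is a string")

# print() accepts several values and separates them with a space.
print('The answer is', 12)
```

Single and double quotes produce the same string, so either style can be used.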
Python provides all of the basic arithmetic operators for working with numerical values.
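For example, the cell below applies each of the common arithmetic operators to the same pair of numbers. The numbers 7 and 2 are arbitrary choices for illustration.

```python
print(7 + 2)   # addition: 9
print(7 - 2)   # subtraction: 5
print(7 * 2)   # multiplication: 14
print(7 / 2)   # division, always produces a float: 3.5
print(7 // 2)  # floor division, drops the fractional part: 3
print(7 % 2)   # remainder after division: 1
print(7 ** 2)  # exponentiation: 49
```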
The = operator is used to assign a value to a variable (and create the variable if it does not yet exist).
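A short sketch of assignment follows. The variable names radius and area are hypothetical and chosen only for this example.

```python
# The first assignment creates the variable.
radius = 2.5

# A variable can be used in an expression, and the result stored in another variable.
area = 3.141592654 * radius ** 2
print(area)

# Assigning again replaces the old value; area is not recalculated automatically.
radius = 4
print(radius)
```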
Python provides an if statement that can be used to make a decision. If statements are often used with the comparison operators: > (greater than), < (less than), == (equal to), or != (not equal to).
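As an illustration, the cell below changes a message only when a test is true. The variable names and the threshold of 30 are assumptions made up for this sketch.

```python
temperature = 31
message = 'no warning'

# The indented line runs only if the comparison evaluates to True.
if temperature > 30:
    message = 'heat warning'

print(message)
```

Try changing temperature to a smaller value and re-running the cell to see the other outcome.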
If we'd like to also do something when the test is False, we can add an else clause to the if statement.
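A minimal sketch of an if statement with an else clause, using a made-up variable called count:

```python
count = 7

# Exactly one of the two indented blocks runs.
if count % 2 == 0:
    parity = 'even'
else:
    parity = 'odd'

print(count, 'is', parity)
```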
Given a variable someval that can have any real number value, write code that ensures someval is in the range -10 to 10, inclusive, by truncating values outside of that range. E.g., if the starting value of someval is -23, the ending value of someval would be -10.
A Python list allows us to group multiple values together in a single data structure. We can define a list using brackets, [ and ].
Elements of a list are accessed using subscript notation. The first element of a list is at index 0, the next is at index 1, and so on.
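The sketch below defines a list and then reads and replaces elements by index. The list contents are arbitrary examples.

```python
# Define a list with three string elements.
colors = ['red', 'green', 'blue']

# Indexing starts at 0.
first = colors[0]
second = colors[1]
print(first, second)

# Subscript notation can also be used to replace an element.
colors[2] = 'violet'
print(colors)
```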
Python's for loop provides a convenient way to sequentially access every item in a list.
The indented part of a for loop is called the loop's body, and it can contain multiple lines of code.
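Here is a small sketch of a for loop whose body has two lines. The list scores and the variable total are hypothetical names used only for illustration.

```python
scores = [3, 5, 8]
total = 0

for score in scores:
    # Both indented lines run once for each item in the list.
    total = total + score
    print('added', score, 'running total is', total)

# This line is not indented, so it runs once, after the loop finishes.
print('final total:', total)
```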
The len function returns the number of items in a list.
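For instance, with an arbitrary example list:

```python
letters = ['a', 'b', 'c', 'd']
num_letters = len(letters)

print(num_letters)  # 4
print(len([]))      # an empty list has length 0
```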
Given a non-empty list of non-negative numbers, called num_list, write code that uses a for loop to find the largest item in the list.
Python code is often organized into units called packages and modules.
Use the import statement to tell Python that you want to load a library. Once a library is loaded, the dot operator, ., lets you access the objects contained in the library.
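As a brief sketch, the cell below loads the standard math library and uses the dot operator to reach a constant and a function inside it.

```python
import math

# math.pi is a value stored in the library.
print(math.pi)

# math.floor is a function stored in the library; it rounds down.
print(math.floor(2.7))
```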
A function is a unit of code that accepts zero or more arguments, does some computations using the argument values, and then returns the result.
The result of a function call can be assigned to a variable, just like any other value.
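For example, using math.sqrt from the standard library (the variable name root is an arbitrary choice):

```python
import math

# Call the function with one argument and store the returned value.
root = math.sqrt(16)
print(root)  # 4.0

# The stored result can be used like any other value.
print(root + 1)
```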
Functions can take any number of arguments. Arguments are separated by a comma, ",".
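Two standard-library functions that take two arguments each serve as a quick illustration. The variable names are made up for this sketch.

```python
import math

# math.pow(x, y) returns x raised to the power y as a float.
power = math.pow(2, 10)
print(power)  # 1024.0

# math.gcd(a, b) returns the greatest common divisor of two integers.
divisor = math.gcd(12, 18)
print(divisor)  # 6
```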
Python allows us to assign a shortcut name for a library as part of the import statement.
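A minimal sketch of this syntax, using the standard math library. The shortcut name m is an arbitrary choice for illustration; a widely used real-world convention is import numpy as np.

```python
# Load the math library under the shorter name m.
import math as m

# The shortcut name is used with the dot operator, just like the full name.
result = m.sqrt(9)
print(result)  # 3.0
```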
Sometimes, it is convenient to be able to access an object in a library directly without typing the library name every time. Python provides an alternative import syntax that makes this easy.
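As a sketch, the from ... import form brings selected names from a library into the current cell so they can be used without a prefix:

```python
# Import only the names we need from the math library.
from math import sqrt, pi

# No "math." prefix is required.
side = sqrt(25)
print(side)  # 5.0
print(pi)
```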
With the help of the math library, write a short Python program to find the length of the hypotenuse of a right triangle given the lengths of the other two sides, represented by the variables a and b. Use the documentation for the math library as needed.
Arithmetic operations on NumPy arrays are performed element-wise.
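The following sketch shows what element-wise means in practice. It assumes NumPy is installed and imported under the conventional name np; the array values are arbitrary.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([10, 20, 30])

# Each operation pairs up elements at the same position.
sums = a + b       # [11 22 33]
products = a * b   # [10 40 90]

# A single number is applied to every element.
doubled = a * 2    # [2 4 6]

print(sums)
print(products)
print(doubled)
```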
NumPy also provides many common mathematical functions that can be used with arrays. Most of these operate element-wise, but some calculate a single value from the contents of an array.
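To illustrate the difference, the sketch below uses np.sqrt, which works element-wise, and np.sum, which reduces the whole array to one value. It assumes NumPy is imported as np; the values are arbitrary.

```python
import numpy as np

values = np.array([1.0, 4.0, 9.0])

# Element-wise: the result has one entry per input entry.
roots = np.sqrt(values)
print(roots)  # [1. 2. 3.]

# Reduction: the result is a single number.
total = np.sum(values)
print(total)  # 14.0
```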
Common statistical summary functions, such as min() and mean(), can also be called as methods of the array objects themselves, which is sometimes more convenient.
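For example, assuming NumPy is imported as np and using an arbitrary array:

```python
import numpy as np

arr = np.array([2, 4, 6, 8])

# These calls are equivalent to np.min(arr), np.max(arr), and np.mean(arr).
smallest = arr.min()
largest = arr.max()
average = arr.mean()

print(smallest, largest, average)
```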
Consider the code below:
arr_1 = np.array([1, 2, 3, 4, 5, 6])
arr_2 = arr_1
arr_1[2] = 2
arr_2[3] = 5
What will be the final value of arr_1? What will be the final value of arr_2? Run the code and check your answers. Were you surprised by the results?
Pandas provides a structure called DataFrame for working with tabular data. We'll work with the famous iris flower dataset, which is provided in nb-datasets/iris_dataset.csv in comma-separated values, or CSV, format.
To inspect the contents of a DataFrame, we can use the head() or tail() methods.
The len() function returns the number of rows in a dataset.
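The sketch below tries head(), tail(), and len() on a tiny DataFrame built in memory. The DataFrame toy_df and its values are made up for illustration and are not taken from the iris dataset; pandas is assumed to be imported as pd.

```python
import pandas as pd

# Made-up rows for illustration only; not from the iris dataset.
toy_df = pd.DataFrame({
    'length': [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    'label': ['a', 'b', 'c', 'd', 'e', 'f'],
})

first_rows = toy_df.head()   # first 5 rows by default
last_rows = toy_df.tail(2)   # an optional argument sets the number of rows
n_rows = len(toy_df)         # number of rows in the DataFrame

print(first_rows)
print(last_rows)
print(n_rows)
```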
DataFrames include a method called describe() that provides a basic statistical overview of a DataFrame.
We can access individual columns of a DataFrame using a special form of subscript notation that uses the column name.
Each column in a Pandas DataFrame is a Series, which behaves much like a NumPy array.
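As a sketch, selecting a column by name returns a pandas Series, which supports array-style operations. The DataFrame toy_df and its column names are hypothetical and are not the iris data.

```python
import pandas as pd

# Made-up values for illustration only; not from the iris dataset.
toy_df = pd.DataFrame({
    'width': [1.0, 2.0, 3.0],
    'height': [10.0, 20.0, 30.0],
})

# Subscript notation with the column name selects one column.
widths = toy_df['width']
print(type(widths))

# Like arrays, columns support element-wise arithmetic and summary methods.
doubled = widths * 2
print(doubled)
print(widths.max())
```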
Basic statistical summary methods are defined for DataFrames, too, and they return the summary statistic for each column.
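For instance, calling mean() on a whole DataFrame returns one value per column. The sketch below uses a small made-up DataFrame with only numeric columns, not the iris data.

```python
import pandas as pd

# Made-up values for illustration only; not from the iris dataset.
toy_df = pd.DataFrame({
    'width': [1.0, 2.0, 3.0],
    'height': [10.0, 20.0, 30.0],
})

# The result is indexed by column name.
col_means = toy_df.mean()
print(col_means)
print(col_means['height'])
```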