The input function is useful for getting a string from the user through a prompt. It works in the notebook as well as when running scripts in the console.
name = input("What's your name?\n")
print("Hi", name)
What's your name? Yoav Hi Yoav
n_icecreams = input("How many icecreams would you like?")
price = input("How much does an icecream cost?")
print("That would be", price * n_icecreams)
How many icecreams would you like?3 How much does an icecream cost?1.5
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-2-3e01a711260d> in <module>()
      1 n_icecreams = input("How many icecreams would you like?")
      2 price = input("How much does an icecream cost?")
----> 3 print("That would be", price * n_icecreams)

TypeError: can't multiply sequence by non-int of type 'str'
For security reasons, input always returns a string. It is the program's responsibility to convert the string to the desired type:
n_icecreams = int(input("How many icecreams would you like?"))
price = float(input("How much does an icecream cost?"))
print("That would be", price * n_icecreams)
How many icecreams would you like?3 How much does an icecream cost?1.5 That would be 4.5
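The conversion itself can fail: int and float raise ValueError when the text is not a number. A minimal sketch of guarding against that, where parse_count is a hypothetical helper name introduced only for this illustration:

```python
def parse_count(text):
    """Convert user text to an int, or return None if it is not a whole number.

    parse_count is a hypothetical helper, not part of the original notebook.
    """
    try:
        return int(text)
    except ValueError:
        # int() raises ValueError for text such as "three" or "1.5"
        return None


print(parse_count("3"))      # a valid integer string
print(parse_count("three"))  # not a number, so the helper returns None
```

In a real program the string passed to parse_count would come from input, and a None result could be used to ask the user again.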
You can use eval to evaluate the input string as a Python expression, but don't do it if you don't trust the user: eval runs whatever code the user types, which can lead to strange behaviour and side effects.
Let's see what happens when we give valid input (2 and 1.5) and when we give invalid input (2 and [1,2,3]). Try it with eval and with the above code (int and float).
n_icecreams = eval(input("How many icecreams would you like?"))
price = eval(input("How much does an icecream cost?"))
print("That would be", price * n_icecreams)
How many icecreams would you like?3 How much does an icecream cost?1.5 That would be 4.5
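If the goal is only to accept Python literals (numbers, strings, lists and so on), the standard library's ast.literal_eval is a more restrictive alternative to eval, since it refuses anything that is not a literal. The sketch below is one possible way to use it; parse_literal is a hypothetical helper name, not something from the original notebook:

```python
import ast


def parse_literal(text):
    """Parse text as a Python literal; return None for anything else.

    parse_literal is a hypothetical helper used only for illustration.
    """
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # Raised for text that is not a plain literal, e.g. a function call
        return None


print(parse_literal("3"))
print(parse_literal("1.5"))
print(parse_literal("[1,2,3]"))
# A function call is not a literal, so it is rejected and never executed
print(parse_literal("__import__('os').getcwd()"))
```

Note that a list is still accepted here, so the program must still check that it received a number before multiplying.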
Ask the user for a number between 1 and 10; if the number is not within that range, let them know and ask again.
We'll start with simple text files and proceed to more complex formats.
Let's read the list of crop plants located in data/crops.txt, or you can download it from GitHub.
Whenever we want to work with a file, we first need to open it using the open function.
This function returns an IO object which we can then use for reading or writing.
f = open('../data/crops.txt', 'rt') # rt = read text
print(type(f))
<class '_io.TextIOWrapper'>
crops = f.read()
f.close()
print(crops[:100])
Abelmoschus caillei Abelmoschus esculentus Acacia mearnsii Acacia senegal Acacia seyal Acca sellowia
The open function receives two parameters: the path of the file to open, and the mode in which to open it: r for reading, w for writing, a for appending, t for text, b for binary.
read returns all the text from the file as a string.
close then closes the file handle.
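To make the difference between text and binary modes concrete, here is a small sketch that writes a file in a throwaway directory and reads it back both ways. The file name example.txt and the explicit encoding argument are assumptions made for this illustration:

```python
import os
import tempfile

# Work in a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'example.txt')  # hypothetical file name

    f = open(path, 'wt', encoding='utf-8')  # w = write, t = text
    f.write('Musa spp.\n')
    f.close()

    f = open(path, 'rt', encoding='utf-8')  # text mode: read returns a str
    text = f.read()
    f.close()

    f = open(path, 'rb')  # binary mode: read returns bytes
    raw = f.read()
    f.close()

print(type(text), type(raw))
```

Text mode decodes the bytes on disk into a string, while binary mode hands back the bytes untouched.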
A more idiomatic way to do this, in which Python takes care of closing the file, is using a context manager:
with open('../data/crops.txt','r') as f:
    crops = f.read()
print(crops[:100])
Abelmoschus caillei Abelmoschus esculentus Acacia mearnsii Acacia senegal Acacia seyal Acca sellowia
The file handle f is closed when the with block ends, even if it ends due to an error.
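The claim about errors can be checked with the closed attribute of the file object. The following sketch raises an error on purpose inside the with block; the file name and the RuntimeError are made up for the demonstration:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'example.txt')  # hypothetical file name
    with open(path, 'w') as f:
        f.write('some text\n')

    try:
        with open(path, 'r') as f:
            first = f.readline()
            raise RuntimeError('something went wrong')  # simulated failure
    except RuntimeError:
        pass

    # The with block was left through an exception, yet the file is closed
    closed_after_error = f.closed

print(closed_after_error)
```

With a plain open and close pair, the same error would have skipped the call to close.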
A file object can also be iterated over directly, one line at a time:
with open('../data/crops.txt','r') as f:
    for line in f:
        if line.startswith('Musa'): # check if line starts with a given string
            print(line.strip()) # strip removes the newline character from the end of the line
Musa balbisiana Musa spp. Musa textilis
readline
The readline() method allows us to read a single line each time.
It works well when combined with a while loop, giving us control of the program flow.
with open('../data/crops.txt','r') as f:
    line = f.readline() # read first line
    print(line.strip())
    while line: # readline returns an empty string only at the end of the file
        line = f.readline()
        if line.startswith('Triticum'):
            print(line.strip())
Abelmoschus caillei Triticum aestivum Triticum dicoccum Triticum durum Triticum monococcum Triticum spelta Triticum turanicum
There are other methods you can use to read files. For example, the readlines() method returns all the lines as a list of strings.
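As a small illustration of readlines(), the sketch below writes three crop names to a throwaway file (crops_sample.txt is a made-up name, standing in for the real data file) and reads them back as a list:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'crops_sample.txt')  # hypothetical sample file
    with open(path, 'w') as f:
        f.write('Musa balbisiana\nMusa textilis\nTriticum durum\n')

    with open(path, 'r') as f:
        lines = f.readlines()  # each string keeps its trailing newline

# strip removes the newline character from the end of each line
names = [line.strip() for line in lines]
print(lines)
print(names)
```

Because readlines() loads the whole file into memory, iterating over the file object is usually preferable for large files.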
startswith()
string method).
To write to a file, we first have to open it for writing. This is done using one of two modes: 'w' or 'a'.
'w', for write, will let you write into the file. If it doesn't exist, it'll be automatically created. If it exists and already has some content, the content will be overwritten.
'a', for append, is very similar, only it will not overwrite, but append your text to the end of an existing file.
Writing is done using print() by adding the argument file=<file object>.
with open(r'tmp.txt','w') as f:
    print('This is the first line', file=f)
    line = 'Another line'
    print(line, file=f)
    msg1 = 'Hello '
    msg2 = 'World!'
    print(msg1 + msg2, file=f)
%less tmp.txt
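The difference between the 'w' and 'a' modes is easiest to see by opening the same file several times. This is a sketch in a throwaway directory, with modes.txt as a made-up file name:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'modes.txt')  # hypothetical file name

    with open(path, 'w') as f:
        print('first', file=f)
    with open(path, 'w') as f:  # 'w' again: the previous content is discarded
        print('second', file=f)
    with open(path, 'a') as f:  # 'a': new text goes after the existing content
        print('third', file=f)

    with open(path, 'r') as f:
        content = f.read()

print(content)
```

Only the lines written by the second and third blocks survive, because the second open with 'w' overwrote the first line.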
Temporary files can be created using the tempfile module:
import tempfile
_, fname = tempfile.mkstemp()
print("Writing to temp file", fname)
with open(fname, 'w') as f:
    print("This is a temporary file", file=f)
Writing to temp file /var/folders/qn/3hj7mcx56k19b_09n6dymw8h0000gn/T/tmp5mvzb1dr
%less $fname
See other methods in tempfile on how to create temporary directories, named temporary files, etc.
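Two of those other methods are tempfile.TemporaryDirectory and tempfile.NamedTemporaryFile. The sketch below shows one way they might be combined; the variable names are chosen for this illustration only:

```python
import os
import tempfile

# The directory and everything in it is removed when the with block ends.
with tempfile.TemporaryDirectory() as tmpdir:
    # delete=False keeps the file on disk after it is closed
    with tempfile.NamedTemporaryFile(mode='w', suffix='.txt', dir=tmpdir, delete=False) as f:
        print("This is a named temporary file", file=f)
        fname = f.name  # full path of the temporary file

    existed = os.path.exists(fname)
    has_suffix = fname.endswith('.txt')

gone = not os.path.exists(tmpdir)
print(existed, has_suffix, gone)
```

Unlike mkstemp, these objects work as context managers, so closing and cleanup do not have to be done by hand.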
In the last example we wrote to a temporary file. In this exercise we will copy that file's contents to a new temporary file that has the extension .txt (use the suffix keyword when creating the temporary file). Copy the contents by reading from the existing file and writing to a new file (this is not the efficient way to do it, but it's just an exercise!). Don't forget to close the files and print the new temporary filename so that you can check that the writing was successful.
C:\Users\yoavram\AppData\Local\Temp\tmpk6qauup4.txt
Python offers plenty of ways to interact with the filesystem through the os and os.path modules.
Let's import os:
import os
Here are some of the capabilities of os:
files = os.listdir()
for fname in files:
    if os.path.isdir(fname):
        print(fname, "is a folder")
    elif os.path.isfile(fname):
        size = os.path.getsize(fname)
        print(fname, "is a file with size", size, "bytes")
.ipynb_checkpoints is a folder async.ipynb is a file with size 5560 bytes calculus.ipynb is a file with size 101282 bytes conda-env.ipynb is a file with size 12529 bytes csv.ipynb is a file with size 5547 bytes curve-fitting.ipynb is a file with size 513763 bytes dictionaries.ipynb is a file with size 17682 bytes differential-equations.ipynb is a file with size 280735 bytes DSP.ipynb is a file with size 1894434 bytes exceptions.ipynb is a file with size 30828 bytes functions.ipynb is a file with size 17818 bytes gui.ipynb is a file with size 5995 bytes idioms.ipynb is a file with size 40620 bytes if-while.ipynb is a file with size 10170 bytes image-processing.ipynb is a file with size 2681392 bytes img is a folder io.ipynb is a file with size 29051 bytes iteration.ipynb is a file with size 40219 bytes linear-algebra.ipynb is a file with size 90527 bytes matplotlib-aesthetics.ipynb is a file with size 454243 bytes matplotlib.ipynb is a file with size 155736 bytes memory-model.ipynb is a file with size 13114 bytes ML.ipynb is a file with size 148262 bytes modules.ipynb is a file with size 23036 bytes notebook-display.ipynb is a file with size 272487 bytes notebook-magic.ipynb is a file with size 24575 bytes numpy.ipynb is a file with size 59320 bytes oop.ipynb is a file with size 77898 bytes optimization.ipynb is a file with size 550354 bytes pandas-seaborn.ipynb is a file with size 425639 bytes probability.ipynb is a file with size 304972 bytes regexp.ipynb is a file with size 32066 bytes requests.ipynb is a file with size 100499 bytes statistics.ipynb is a file with size 666372 bytes strings-lists-loops.ipynb is a file with size 44695 bytes types-operators.ipynb is a file with size 28509 bytes
Here's a combination of functions to get the current directory (os.getcwd), change the directory (os.chdir), check if a file exists (os.path.exists), and split a filename from its extension:
curdir = os.getcwd()
os.chdir('../data')
fname = 'crops.txt'
print(fname, 'exists?', os.path.exists(fname))
fname = os.path.splitext('crops.txt')[0] + '.csv'
print(fname, 'exists?', os.path.exists(fname))
os.chdir(curdir)
crops.txt exists? True crops.csv exists? False
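Paths can also be built and taken apart without changing the working directory. A brief sketch using os.path.join, os.path.split and os.path.splitext, with the relative path data/crops.txt used purely as an example string (no file needs to exist for these functions to work):

```python
import os

path = os.path.join('data', 'crops.txt')  # joins with the OS path separator
folder, fname = os.path.split(path)       # directory part and file name
root, ext = os.path.splitext(fname)       # name without extension, and extension

print(folder, fname, root, ext)
```

Using os.path.join instead of concatenating strings keeps the code portable between Windows and Unix-like systems.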
The json module allows us to encode Python objects as text and decode them back again. It implements the JSON (JavaScript Object Notation) format, a lightweight data interchange format inspired by JavaScript object literal syntax, and is therefore interoperable and widely used outside of the Python ecosystem. The format is also human-readable, which allows developers to inspect the data in a file without having to deserialize it.
We start by importing the module and creating an example data dictionary:
import json
data = {
'a_string': 'Hello JSON',
'ints_in_a_tuple': (5, 6, 7, 2, 3, 5, 6),
'some_number': 5768.4454,
'list_as_well': [True, False, 'This', 'That']
}
data
{'a_string': 'Hello JSON', 'ints_in_a_tuple': (5, 6, 7, 2, 3, 5, 6), 'list_as_well': [True, False, 'This', 'That'], 'some_number': 5768.4454}
We dump the dictionary into a string:
data_string = json.dumps(data)
data_string
'{"a_string": "Hello JSON", "ints_in_a_tuple": [5, 6, 7, 2, 3, 5, 6], "some_number": 5768.4454, "list_as_well": [true, false, "This", "That"]}'
If we want to save this to a file, we can either write the string to a file or dump directly to a file:
_, fname = tempfile.mkstemp(suffix='.json') # mkstemp creates the file; mktemp is deprecated
with open(fname, 'w') as f:
    json.dump(data, f)
%less $fname
We can make the file more readable with some configuration:
_, fname = tempfile.mkstemp(suffix='.json')
with open(fname, 'w') as f:
    json.dump(data, f, sort_keys=True, indent=4, separators=(',', ': '))
%less $fname
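Reading the data back is done with json.load. The sketch below uses a shortened version of the example dictionary and a made-up file name, and shows that the round trip is not always exact, since JSON has no tuple type:

```python
import json
import os
import tempfile

data = {
    'a_string': 'Hello JSON',
    'ints_in_a_tuple': (5, 6, 7),
    'some_number': 5768.4454,
}

with tempfile.TemporaryDirectory() as tmpdir:
    fname = os.path.join(tmpdir, 'data.json')  # hypothetical file name
    with open(fname, 'w') as f:
        json.dump(data, f)
    with open(fname, 'r') as f:
        loaded = json.load(f)

print(loaded['ints_in_a_tuple'])  # the tuple comes back as a list
print(loaded == data)             # so the round trip is not an exact copy
```

If the distinction between tuples and lists matters, it has to be restored by the program after loading.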
Not everything is supported by json, for example, complex numbers:
json.dumps([1 + 2j, 4 + 5j])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-7-e209f240b279> in <module>()
----> 1 json.dumps([1 + 2j, 4 + 5j])

/Users/yoavram/miniconda3/envs/Py4Eng/lib/python3.6/json/__init__.py in dumps(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, default, sort_keys, **kw)
    229         cls is None and indent is None and separators is None and
    230         default is None and not sort_keys and not kw):
--> 231         return _default_encoder.encode(obj)
    232     if cls is None:
    233         cls = JSONEncoder

/Users/yoavram/miniconda3/envs/Py4Eng/lib/python3.6/json/encoder.py in encode(self, o)
    197         # exceptions aren't as detailed. The list call should be roughly
    198         # equivalent to the PySequence_Fast that ''.join() would do.
--> 199         chunks = self.iterencode(o, _one_shot=True)
    200         if not isinstance(chunks, (list, tuple)):
    201             chunks = list(chunks)

/Users/yoavram/miniconda3/envs/Py4Eng/lib/python3.6/json/encoder.py in iterencode(self, o, _one_shot)
    255                 self.key_separator, self.item_separator, self.sort_keys,
    256                 self.skipkeys, _one_shot)
--> 257         return _iterencode(o, 0)
    258
    259 def _make_iterencode(markers, _default, _encoder, _indent, _floatstr,

/Users/yoavram/miniconda3/envs/Py4Eng/lib/python3.6/json/encoder.py in default(self, o)
    178         """
    179         raise TypeError("Object of type '%s' is not JSON serializable" %
--> 180                         o.__class__.__name__)
    181
    182     def encode(self, o):

TypeError: Object of type 'complex' is not JSON serializable
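To handle such types, we can pass json.dumps a default function that converts unsupported objects into something json can encode:
def encode_complex(obj):
    if isinstance(obj, complex):
        return {'real': obj.real, 'imag': obj.imag}
    # anything else is still unsupported, so report it the way json would
    raise TypeError("Object of type '%s' is not JSON serializable" % obj.__class__.__name__)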
data = [1 + 2j, 4 + 5j, 5]
dump = json.dumps(data, default=encode_complex)
dump
'[{"real": 1.0, "imag": 2.0}, {"real": 4.0, "imag": 5.0}, 5]'
And to decode:
def decode_complex(o):
    if 'real' in o and 'imag' in o: # no need for isinstance(o, dict) as o is always dict, see docstring
        return complex(o['real'], o['imag'])
    return o
data2 = json.loads(dump, object_hook=decode_complex)
print(data2, data2 == data)
[(1+2j), (4+5j), 5] True
The pickle API is similar to that of json.
This notebook was written by Yoav Ram and is part of the Python for Engineers course.
The notebook was written using Python 3.6.1. Dependencies listed in environment.yml, full versions in environment_full.yml.
This work is licensed under a CC BY-NC-SA 4.0 International License.