#!/usr/bin/env python
# coding: utf-8
# # Reading and Writing Data Files with Python
# In order to plot or fit data with Python, you have to get the data into the program. If a program makes calculations using data, it can be useful to write the results to a file.
# ## 1. Reading Data Files
# In Python, it is often useful for data to be in arrays. Data can be entered directly into the program using the **`array`** function from the numpy library. For instance, the following lines assign arrays of numbers to `x`, `y`, and `yerr`.
# In[1]:
from numpy import *
x = array([0.0, 2.0, 4.0, 6.0, 8.0])
y = array([1.1, 1.9, 3.2, 4.0, 5.9])
yerr = array([0.1, 0.2, 0.1, 0.3, 0.3])
# However, this is not a good way to handle large data sets. It is better to store the data in a separate file and have the program read the data file. You could use a text editor (*Idle* works well or you can create and edit a file in *CoCalc*) to enter the data above in the form shown below. The values of `x`, `y`, and `yerr` (the uncertainty in `y`) for a single data point are entered on the same line separated by spaces or tabs.
#
# 0.0 1.1 0.1
# 2.0 1.9 0.2
# 4.0 3.2 0.1
# 6.0 4.0 0.3
# 8.0 5.9 0.3
#
#
# Suppose that the file is saved as plain text and given the name “`input.dat`”. The **`loadtxt`** function
# from the numpy library can be used to read data from the text file. The following example shows how to read the data into an array called `DataIn`.
# In[2]:
from numpy import *
DataIn = loadtxt('input.dat')
print(DataIn)
# Notice that `DataIn` is a single 2-dimensional array, rather than three 1-dimensional arrays.
#
# If you add a line that starts with a number sign (`#`) to the data file, it will be ignored as a comment when the file is read. (Blank lines are also ignored.) It is a good idea to put explanatory comments at the
# beginning of data files because you will quickly forget what the numbers mean. Giving
# files descriptive names and keeping good notes about them are also helpful.
# In[3]:
# This line is a comment, even in a data file
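# The handling of comment lines and blank lines can be checked with a short sketch. To keep it self-contained, `StringIO` stands in for a file on disk (an assumption made here; `loadtxt` accepts a file-like object as well as a file name), and the comment text is made up for illustration.

```python
from io import StringIO
from numpy import loadtxt

# Hypothetical file contents: a comment line, a blank line, then three data lines.
text = """# x  y  yerr (uncertainty in y)

0.0 1.1 0.1
2.0 1.9 0.2
4.0 3.2 0.1
"""

DataIn = loadtxt(StringIO(text))

# Only the three data lines are kept, each with three columns.
print(DataIn.shape)
```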
# In most cases (plotting, for example), each variable should be in a separate 1-dimensional
# array. Setting the **`unpack`** argument to **`True`** and providing a variable for each column
# accomplishes this.
# In[4]:
from numpy import *
x, y, yerr = loadtxt('input.dat', unpack=True)
print(x)
print(y)
print(yerr)
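# As a rough illustration of what `unpack=True` does, the sketch below reads the same data both ways and compares the results. `StringIO` is used in place of a real file (an assumption made so that the example runs on its own).

```python
from io import StringIO
from numpy import array_equal, loadtxt

text = "0.0 1.1 0.1\n2.0 1.9 0.2\n4.0 3.2 0.1\n"

# Without unpack: one 2-dimensional array, one row per data point.
DataIn = loadtxt(StringIO(text))

# With unpack: one 1-dimensional array per column.
x, y, yerr = loadtxt(StringIO(text), unpack=True)

# Each unpacked variable matches one column of the 2-dimensional array.
print(array_equal(x, DataIn[:, 0]))
print(array_equal(y, DataIn[:, 1]))
print(array_equal(yerr, DataIn[:, 2]))
```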
# If you want to read in only some columns, you can use the **`usecols`** argument to specify
# which ones. Indices in Python start from zero, not one. The line below will read only the
# first and second columns of data, so only two variable names are provided.
# In[5]:
x, y = loadtxt('input.dat', unpack=True, usecols=[0,1])
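# The selected columns do not have to be next to each other. As a sketch, the following reads only the first and third columns (`x` and `yerr`), with `StringIO` standing in for the data file (an assumption for illustration).

```python
from io import StringIO
from numpy import loadtxt

text = "0.0 1.1 0.1\n2.0 1.9 0.2\n4.0 3.2 0.1\n"

# Column indices 0 and 2 are the first and third columns.
x, yerr = loadtxt(StringIO(text), unpack=True, usecols=[0, 2])

print(x)
print(yerr)
```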
# Sometimes you will get a file with data separated by commas, instead of spaces. For example, suppose that the file "`input2.dat`" contains the following time and voltage data from a pressure sensor.
#
# 0.0, 1.1
# 2.0, 1.9
# 4.0, 4.2
# 6.0, 4.0
# 8.0, 5.9
#
#
# The **`delimiter`** argument can be used to make the **`loadtxt`** function recognize commas as the separators.
# In[6]:
t, v = loadtxt('input2.dat', delimiter=',', unpack=True)
print(t)
print(v)
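# To see why the `delimiter` argument matters, the sketch below tries to read comma-separated data both without and with it. `StringIO` stands in for the data file and the variable `failed` is a hypothetical name used only for this illustration.

```python
from io import StringIO
from numpy import loadtxt

text = "0.0, 1.1\n2.0, 1.9\n4.0, 4.2\n"

# Without delimiter=',' the first field is read as '0.0,' which is not a number.
try:
    loadtxt(StringIO(text), unpack=True)
    failed = False
except ValueError:
    failed = True

# With delimiter=',' the two columns are read correctly.
t, v = loadtxt(StringIO(text), delimiter=',', unpack=True)

print(failed)
print(t)
print(v)
```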
# ## 2. Writing Data Files
# The **`savetxt`** function from the numpy library can be used to write data to a text file.
# Suppose that you’ve read two columns of data into the arrays `t` for time and `v` for the
# voltage from a pressure sensor. Also, suppose that the manual for the sensor gives the
# following equation to find the pressure in atmospheres from the voltage reading.
# In[7]:
p = 0.15 + v/10.0
# Recall that this single Python command will calculate an array `p` with the same length as
# the array `v`. Once you’ve calculated the pressures, you might want to write the times and
# pressures to a text file for later use. The following command will write `t` and `p` to the file
# “`output.dat`”. Because no folder is given, the file will be saved in the current working directory, which is usually the directory containing the program. **If you give
# the name of an existing file, it will be overwritten, so be careful!**
# In[8]:
savetxt('output.dat', (t,p))
# Unfortunately, each of the arrays will appear in a different row, which is inconvenient for large data sets. The **`column_stack`** function can be used to put each array into a different
# column. The argument should be a tuple of arrays (the inner pair of parentheses makes it a tuple)
# in the order that you want them to appear.
# In[9]:
savetxt('output.dat', column_stack((t,p)) )
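# The difference between the two layouts can be seen by writing both and reading them back. This is only a sketch: the file names and the pressure values are made up, and the files are written to a temporary folder (an assumption made so that nothing in your own directory is overwritten).

```python
import os
import tempfile
from numpy import array, column_stack, loadtxt, savetxt

t = array([0.0, 2.0, 4.0, 6.0, 8.0])
p = array([0.26, 0.34, 0.57, 0.55, 0.74])

with tempfile.TemporaryDirectory() as folder:
    rows_file = os.path.join(folder, 'rows.dat')
    cols_file = os.path.join(folder, 'columns.dat')

    savetxt(rows_file, (t, p))                 # each array becomes a row
    savetxt(cols_file, column_stack((t, p)))   # each array becomes a column

    by_rows = loadtxt(rows_file)
    by_cols = loadtxt(cols_file)

print(by_rows.shape)   # 2 rows, 5 columns
print(by_cols.shape)   # 5 rows, 2 columns
```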
# The default is to write the data out separated by spaces, but you can use the optional **`delimiter`** argument to specify something else. For example, the following writes comma-separated data.
# In[10]:
savetxt('output.dat', column_stack((t,p)), delimiter=',')
# By default, the numbers will be written in scientific notation. The **`fmt`** argument can be
# used to specify the formatting. If one format is supplied, it will be used for all of the
# numbers. The form of the formatting string is “`%(width).(precision)(specifier)`”, where `width` specifies the minimum total number of characters written (shorter numbers are padded with spaces on the left), `precision` specifies the number of digits after the decimal point, and the possibilities for `specifier` are shown below. For integer formatting, the precision instead sets the minimum number of digits (padding with leading zeros), so it is usually omitted. The width is optional for all three specifiers.
#
# |Specifier|Meaning|Example Format|Output for -34.5678|
# |-|-|-|-|
# |i|signed integer|%5i|-34|
# |e|scientific notation|%5.4e|-3.4568e+01|
# |f|floating point|%5.2f|-34.57|
#
# A format can also be provided for each column (two in this case) as follows.
# In[11]:
# '%3i' writes the times as integers; '%4.3f' writes the pressures with three decimal places
savetxt('output.dat', column_stack((t,p)), fmt=('%3i', '%4.3f'))
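# Format strings can be tried out without writing a file, because they follow the rules of Python's `%` string-formatting operator. The sketch below applies a few formats to the value -34.5678; the list of formats is chosen only for illustration, and `repr` is used so that padding spaces are visible.

```python
value = -34.5678

# Formats to try: integer, scientific, floating point, a wider field,
# and an integer format that includes a precision.
formats = ['%5i', '%5.4e', '%5.2f', '%10.2f', '%5.3i']
results = [fmt % value for fmt in formats]

for fmt, text in zip(formats, results):
    print(fmt, repr(text))
```

# Note that the width only pads short results. It never truncates a number that needs more characters.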
# It is a good idea to add comments at the top of data files that you create to remind you
# of what they contain. The optional **`header`** argument allows you to put text at the top of the file. The **`comments`** argument lets you pick the string that precedes each line of the header text. If you want the header to be treated as a comment
# when the file is read by the `loadtxt` function, that string should start with a number sign (`#`), which is the default. An example is shown below.
# In[12]:
# The pressure equation above gives p in atmospheres, so the header labels it in atm
savetxt('output.dat', column_stack((t,p)), comments='# ', header='t (s) p (atm)')
# If you want a multiple-line header, you can include “`\n`” to force a newline.
# In[13]:
savetxt('output.dat', column_stack((t,p)), comments='# ', header='First line\nSecond line')
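# As a check that a header written this way does not interfere with reading the file later, the sketch below writes a small file with a header and reads it back. The file name and data values are made up, and a temporary folder is used (an assumption made to avoid overwriting anything).

```python
import os
import tempfile
from numpy import array, column_stack, loadtxt, savetxt

t = array([0.0, 2.0, 4.0])
p = array([0.26, 0.34, 0.57])

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'pressure.dat')
    savetxt(path, column_stack((t, p)), fmt='%.2f',
            comments='# ', header='t (s) p (atm)')

    # Look at the first line of the file as plain text.
    with open(path) as f:
        first_line = f.readline().rstrip('\n')

    # loadtxt skips the header because it starts with '#'.
    t_back, p_back = loadtxt(path, unpack=True)

print(first_line)
print(t_back)
print(p_back)
```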
# **Remember to be very careful about overwriting existing files with the `savetxt` function!**
# ## Additional Documentation
# More information is available at
# http://docs.scipy.org/doc/numpy/reference/generated/numpy.loadtxt.html
# http://docs.scipy.org/doc/numpy/reference/generated/numpy.savetxt.html
# http://docs.scipy.org/doc/numpy/reference/generated/numpy.column_stack.html