Using Databases Inside Python

We will use SQLite to discuss how to use Python with databases, but most other major systems work in the same way, just with a different library.

Example database

The database we'll be using as an example has a table named Experiment. This table is a log of all of the work done on experiments in a research lab, broken down by project and scientist. The table has columns for:

  • the login id of the scientist
  • the name of their project
  • a numeric id for their experiment
  • how many hours they spent on it
  • and when it took place

Importing the library

We need to tell Python that we're going to be working with SQLite, so we start by importing the sqlite3 library. By importing using a generic name (e.g., dbapi) if we ever decide to change the database system that we're using we can simply change the library name in this single line of code (in concept; in practice subtle differences between database systems may break things):

In [1]:
import sqlite3 as dbapi

Connecting to the database

Next, we need to connect to the database using the database library's connect method. With SQLite this is as simple as first making the connect:

In [2]:
con = dbapi.connect('examp_database.sqlite')

and then getting a cursor which will let us execute SQL commands:

In [3]:
cur = con.cursor()

If the database does not currently exist then creating the connection will create it.

Querying the database

We can now execute any SQL commands that we want in the database using the cursor's execute method. Querying the database simply involves writing the appropriate SQL and placing it inside a string in the execute method call:

In [4]:
cur.execute("SELECT * FROM Experiment")
Out[4]:
<sqlite3.Cursor at 0x2792f80>

To get the results into Python we then use either the fetchone method to fetch one record at a time (it returns None when there are no more records to fetch so that you know when to stop)

In [5]:
record = cur.fetchone()
while record:
    print record
    record = cur.fetchone()
(u'best', u'Insulin', 1, 3.2, u'1922-03-23')
(u'ivan', u'Time Travel', 2, -1.5, u'1910-07-05')
(u'dian', u'Great Apes', 1, 15.0, u'1967-04-12')
(u'fban', u'Insulin', 1, 5.5, u'1922-03-23')
(u'mlom', u'Antigravity', 1, 1.5, u'1889-11-10')
(u'fban', u'Insulin', 2, 1.2, u'1922-03-24')
(u'taube', u'Redox Reaction', 2, 2.0, u'1942-12-04')
(u'ivan', u'Teleportation', 3, 7.0, u'1891-02-17')
(u'skol', u'Teleportation', 2, 14.0, u'1891-01-23')
(u'dmitri', u'Teleportation', 1, 7.0, u'1891-01-22')
(u'jane', u'Great Apes', 1, 7.0, u'1967-04-13')
(u'skol', u'Time Travel', 1, -17.5, u'1900-07-05')
(u'herschel', u'Infrared', 1, 7.0, u'1798-08-12')
(u'taube', u'Redox Reaction', 1, 5.0, u'1942-11-17')
(u'mlom', u'Teleportation', 2, 7.2, u'1891-03-23')
(u'mlom', u'Antigravity', 2, 14.3, u'1889-11-01')
(u'best', u'Insulin', 2, 10.0, u'1922-03-24')

or the fetchall method to return all of the records into a list. Each record is returned as a tuple of values, one value per field.

In [6]:
cur.execute("SELECT * FROM Experiment WHERE Project = 'Teleportation'")
myqueryrecords = cur.fetchall()
myqueryrecords
Out[6]:
[(u'ivan', u'Teleportation', 3, 7.0, u'1891-02-17'),
 (u'skol', u'Teleportation', 2, 14.0, u'1891-01-23'),
 (u'dmitri', u'Teleportation', 1, 7.0, u'1891-01-22'),
 (u'mlom', u'Teleportation', 2, 7.2, u'1891-03-23')]

The u's that appear in front of each wring indicate that they are stored in unicode. If you want the database to return regular strings you can do this using the following command.

In [7]:
con.text_factory = str

Using Python variables inside SQL

Often when working with databases in Python we will want to include variables from our Python code as part of the query. To do this we place a ? where each value should be inserted into the SQL statement and then include a tuple with the values to be inserted as a second argument in the execute statement.

In [8]:
current_experiment = 'Teleportation'
cur.execute("SELECT * FROM Experiment WHERE Project = ?", (current_experiment,))
myqueryrecords = cur.fetchall()
myqueryrecords
Out[8]:
[('ivan', 'Teleportation', 3, 7.0, '1891-02-17'),
 ('skol', 'Teleportation', 2, 14.0, '1891-01-23'),
 ('dmitri', 'Teleportation', 1, 7.0, '1891-01-22'),
 ('mlom', 'Teleportation', 2, 7.2, '1891-03-23')]

Modifying a database

Modifications can also be made to the database using the execute function. For example,

In [9]:
cur.execute("INSERT INTO Experiment VALUES ('gimli', 'TrollKilling', '1', 112, '1954-07-21')")
con.commit()

In order for the changes to actually be made to the database we need to commit them by calling con.commit() when we are ready to make the changes. This protects us if something goes wrong with our program because the changes won't be finalized until the point where we commit them.

We can check to see if the changes have been made by running a quick query.

In [10]:
cur.execute("SELECT * FROM Experiment WHERE Project = 'TrollKilling'")
myqueryrecords = cur.fetchall()
myqueryrecords
Out[10]:
[('gimli', 'TrollKilling', 1, 112.0, '1954-07-21')]

And then get rid of these changes using a DELETE statement.

In [11]:
cur.execute("DELETE FROM Experiment WHERE Project='TrollKilling'")
con.commit()
In [12]:
cur.execute("SELECT * FROM Experiment WHERE Project = 'TrollKilling'")
myqueryrecords = cur.fetchall()
myqueryrecords
Out[12]:
[]

Execute commands across many values

Often it will be useful to execute the same SQL command repeatedly with different inputs. For example, if we want to append a new set of data to an existing table, we will probably want to append multiple rows. We could do this using a for loop, but Python provides the executemany() method to make this easier.

In [13]:
new_data_records = [('mschilling', 'genetics', 1, 112.0, '2005-07-21'),
                    ('jkoch', 'bees', 1, 7.2, '2012-01-15'),
                    ('akleinhesselink', 'commecol', 1, 2.5, '2010-05-05')]

cur.executemany("INSERT INTO Experiment VALUES (?, ?, ?, ?, ?)", new_data_records)
cur.execute("SELECT * FROM Experiment")
cur.fetchall()
Out[13]:
[('best', 'Insulin', 1, 3.2, '1922-03-23'),
 ('ivan', 'Time Travel', 2, -1.5, '1910-07-05'),
 ('dian', 'Great Apes', 1, 15.0, '1967-04-12'),
 ('fban', 'Insulin', 1, 5.5, '1922-03-23'),
 ('mlom', 'Antigravity', 1, 1.5, '1889-11-10'),
 ('fban', 'Insulin', 2, 1.2, '1922-03-24'),
 ('taube', 'Redox Reaction', 2, 2.0, '1942-12-04'),
 ('ivan', 'Teleportation', 3, 7.0, '1891-02-17'),
 ('skol', 'Teleportation', 2, 14.0, '1891-01-23'),
 ('dmitri', 'Teleportation', 1, 7.0, '1891-01-22'),
 ('jane', 'Great Apes', 1, 7.0, '1967-04-13'),
 ('skol', 'Time Travel', 1, -17.5, '1900-07-05'),
 ('herschel', 'Infrared', 1, 7.0, '1798-08-12'),
 ('taube', 'Redox Reaction', 1, 5.0, '1942-11-17'),
 ('mlom', 'Teleportation', 2, 7.2, '1891-03-23'),
 ('mlom', 'Antigravity', 2, 14.3, '1889-11-01'),
 ('best', 'Insulin', 2, 10.0, '1922-03-24'),
 ('mschilling', 'genetics', 1, 112.0, '2005-07-21'),
 ('jkoch', 'bees', 1, 7.2, '2012-01-15'),
 ('akleinhesselink', 'commecol', 1, 2.5, '2010-05-05')]