SQLite with python

References:

What is SQLite?

from What is SQLite?

SQLite is a software library that provides a relational database management system. The lite in SQLite means light weight in terms of setup, database administration, and required resource.

SQLite has the following noticeable features: self-contained, serverless, zero-configuration, transactional. Serverless

Normally, an RDBMS such as MySQL, PostgreSQL, etc., requires a separate server process to operate. The applications that want to access the database server use TCP/IP protocol to send and receive requests. This is called client/server architecture.

SQLite does NOT work this way. SQLite does NOT require a server to run. SQLite database is integrated with the application that accesses the database. The applications interact with the SQLite database read and write directly from the database files stored on disk.

The following diagram illustrates the SQLite server-less architecture:

SQLite has three important features:

  • Self-Contained
  • Zero-configuration
  • Transactional

Full example

Connect To Database

Following Python code shows how to connect to an existing database. If the database does not exist, then it will be created and finally a database object will be returned.

Te database of the examples has been dowloaded from SQLite sample database

The following database diagram illustrates the chinook database tables and their relationships:

In [1]:
#!/usr/bin/python

import sqlite3

sqlite_conn = sqlite3.connect('./db/chinook.db')

print("Opened database successfully")
Opened database successfully

We show the tables in database

Show tables in sqlite database in python

In [2]:
cur = sqlite_conn.cursor() 
cur.execute("SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")
available_table = (cur.fetchall())
for table in available_table:
    print(table)
('albums',)
('artists',)
('customers',)
('employees',)
('genres',)
('invoice_items',)
('invoices',)
('media_types',)
('playlist_track',)
('playlists',)
('sqlite_sequence',)
('sqlite_stat1',)
('tracks',)

We show the column names in table

See the column names for table

In [3]:
cur.execute("PRAGMA table_info(tracks)")
rows = cur.fetchall()
for row in rows:
    print(row)
(0, 'TrackId', 'INTEGER', 1, None, 1)
(1, 'Name', 'NVARCHAR(200)', 1, None, 0)
(2, 'AlbumId', 'INTEGER', 0, None, 0)
(3, 'MediaTypeId', 'INTEGER', 1, None, 0)
(4, 'GenreId', 'INTEGER', 0, None, 0)
(5, 'Composer', 'NVARCHAR(220)', 0, None, 0)
(6, 'Milliseconds', 'INTEGER', 1, None, 0)
(7, 'Bytes', 'INTEGER', 0, None, 0)
(8, 'UnitPrice', 'NUMERIC(10,2)', 1, None, 0)

PRAGMA Statements

Others examples with PRAGMA statements are:

In [4]:
cur.execute("PRAGMA page_count")
print(cur.fetchall())
cur.execute("PRAGMA function_list")
print(cur.fetchall())
cur.execute("PRAGMA database_list")
print(cur.fetchall())
[(864,)]
[]
[(0, 'main', 'C:\\Users\\Ibon.DOMINIO\\Documents\\github\\miscellaneous\\SQLite&python\\db\\chinook.db')]

SELECT sentence

The SELECT statement is used to select data from a database.

The data returned is stored in a result table, called the result-set.

In [5]:
cur = sqlite_conn.execute("SELECT * FROM tracks LIMIT 5;")

rows = cur.fetchall()
for row in rows:
    print(row)
(1, 'For Those About To Rock (We Salute You)', 1, 1, 1, 'Angus Young, Malcolm Young, Brian Johnson', 343719, 11170334, 0.99)
(2, 'Balls to the Wall', 2, 2, 1, None, 342562, 5510424, 0.99)
(3, 'Fast As a Shark', 3, 2, 1, 'F. Baltes, S. Kaufman, U. Dirkscneider & W. Hoffman', 230619, 3990994, 0.99)
(4, 'Restless and Wild', 3, 2, 1, 'F. Baltes, R.A. Smith-Diesel, S. Kaufman, U. Dirkscneider & W. Hoffman', 252051, 4331779, 0.99)
(5, 'Princess of the Dawn', 3, 2, 1, 'Deaffy & R.A. Smith-Diesel', 375418, 6290521, 0.99)
In [6]:
cur = sqlite_conn.execute("SELECT * FROM tracks LIMIT 5;")

row = cur.fetchone()
print(row)
row = cur.fetchone()
print(row)
row = cur.fetchone()
print(row)
row = cur.fetchone()
print(row)
row = cur.fetchone()
print(row)
(1, 'For Those About To Rock (We Salute You)', 1, 1, 1, 'Angus Young, Malcolm Young, Brian Johnson', 343719, 11170334, 0.99)
(2, 'Balls to the Wall', 2, 2, 1, None, 342562, 5510424, 0.99)
(3, 'Fast As a Shark', 3, 2, 1, 'F. Baltes, S. Kaufman, U. Dirkscneider & W. Hoffman', 230619, 3990994, 0.99)
(4, 'Restless and Wild', 3, 2, 1, 'F. Baltes, R.A. Smith-Diesel, S. Kaufman, U. Dirkscneider & W. Hoffman', 252051, 4331779, 0.99)
(5, 'Princess of the Dawn', 3, 2, 1, 'Deaffy & R.A. Smith-Diesel', 375418, 6290521, 0.99)

Fetch

There are three fetch functions in python with SQLite:

  • fetchone(): This method fetches the next row of a query result set, returning a single sequence, or None when no more data is available.
  • fetchmany([size = cursor.arraysize]): This routine fetches the next set of rows of a query result, returning a list. An empty list is returned when no more rows are available. The method tries to fetch as many rows as indicated by the size parameter.
  • fetchall(): This routine fetches all (remaining) rows of a query result, returning a list. An empty list is returned when no rows are available.
In [7]:
cur = sqlite_conn.execute("SELECT * FROM tracks LIMIT 5;")

rows = cur.fetchmany(2)
print("fetchmany 2 ----------------")
for row in rows:
    print(row)

row = cur.fetchone()    
print("fetchone ----------------")
print(row)

print("fetchall ----------------")
rows = cur.fetchall()    
for row in rows:
    print(row)    
fetchmany 2 ----------------
(1, 'For Those About To Rock (We Salute You)', 1, 1, 1, 'Angus Young, Malcolm Young, Brian Johnson', 343719, 11170334, 0.99)
(2, 'Balls to the Wall', 2, 2, 1, None, 342562, 5510424, 0.99)
fetchone ----------------
(3, 'Fast As a Shark', 3, 2, 1, 'F. Baltes, S. Kaufman, U. Dirkscneider & W. Hoffman', 230619, 3990994, 0.99)
fetchall ----------------
(4, 'Restless and Wild', 3, 2, 1, 'F. Baltes, R.A. Smith-Diesel, S. Kaufman, U. Dirkscneider & W. Hoffman', 252051, 4331779, 0.99)
(5, 'Princess of the Dawn', 3, 2, 1, 'Deaffy & R.A. Smith-Diesel', 375418, 6290521, 0.99)
In [8]:
cur = sqlite_conn.execute("""
--====
SELECT Name, Composer, Milliseconds 
  FROM tracks 
 WHERE Milliseconds >= ? 
       AND 
       Milliseconds <= ? 
 ORDER BY Milliseconds;
--====""", (200000, 201000))
rows = cur.fetchall()
for row in rows:
    print(row)
('Alabama Song', 'Weill-Brecht', 200097)
('Purgatory', 'Steve Harris', 200150)
('F**k Me Pumps', 'Salaam Remi', 200253)
('Why Go', 'Jeff Ament', 200254)
('Ice Cream Man', 'John Brim', 200306)
('Summertime', 'Miles Davis', 200437)
('Coroné Antonio Bento', None, 200437)
('Último Pau-De-Arara', 'Corumbá/José Gumarães/Venancio', 200437)
('The Wind Cries Mary', 'Jimi Hendrix', 200463)
('Over Again', 'Tim Maia', 200489)
('Rock Bottom', 'Paul Stanley, Ace Frehley', 200594)
('All My Love', 'E. Schrody/L. Dimant', 200620)
('Question!', 'Tankian, Serj', 200698)
('Posso Até Me Apaixonar', 'Dudu Nobre', 200698)
('Cold Day In The Sun', 'Dave Grohl, Taylor Hawkins, Nate Mendel, Chris Shiflett/FOO FIGHTERS', 200724)
('Too Fast For Love', 'Nikki Sixx', 200829)
('O Cidadão Do Mundo', 'Chico Science', 200933)
In [10]:
import pandas as pd
In [12]:
df = pd.read_sql_query(sql = """
--====
SELECT * 
 FROM tracks 
LIMIT 5;
--====""", con = sqlite_conn)
df
Out[12]:
TrackId Name AlbumId MediaTypeId GenreId Composer Milliseconds Bytes UnitPrice
0 1 For Those About To Rock (We Salute You) 1 1 1 Angus Young, Malcolm Young, Brian Johnson 343719 11170334 0.99
1 2 Balls to the Wall 2 2 1 None 342562 5510424 0.99
2 3 Fast As a Shark 3 2 1 F. Baltes, S. Kaufman, U. Dirkscneider & W. Ho... 230619 3990994 0.99
3 4 Restless and Wild 3 2 1 F. Baltes, R.A. Smith-Diesel, S. Kaufman, U. D... 252051 4331779 0.99
4 5 Princess of the Dawn 3 2 1 Deaffy & R.A. Smith-Diesel 375418 6290521 0.99
In [13]:
df = pd.read_sql_query(sql = """
--====
SELECT * 
 FROM tracks 
LIMIT %d;
--====""" % (10), con = sqlite_conn)
df
Out[13]:
TrackId Name AlbumId MediaTypeId GenreId Composer Milliseconds Bytes UnitPrice
0 1 For Those About To Rock (We Salute You) 1 1 1 Angus Young, Malcolm Young, Brian Johnson 343719 11170334 0.99
1 2 Balls to the Wall 2 2 1 None 342562 5510424 0.99
2 3 Fast As a Shark 3 2 1 F. Baltes, S. Kaufman, U. Dirkscneider & W. Ho... 230619 3990994 0.99
3 4 Restless and Wild 3 2 1 F. Baltes, R.A. Smith-Diesel, S. Kaufman, U. D... 252051 4331779 0.99
4 5 Princess of the Dawn 3 2 1 Deaffy & R.A. Smith-Diesel 375418 6290521 0.99
5 6 Put The Finger On You 1 1 1 Angus Young, Malcolm Young, Brian Johnson 205662 6713451 0.99
6 7 Let's Get It Up 1 1 1 Angus Young, Malcolm Young, Brian Johnson 233926 7636561 0.99
7 8 Inject The Venom 1 1 1 Angus Young, Malcolm Young, Brian Johnson 210834 6852860 0.99
8 9 Snowballed 1 1 1 Angus Young, Malcolm Young, Brian Johnson 203102 6599424 0.99
9 10 Evil Walks 1 1 1 Angus Young, Malcolm Young, Brian Johnson 263497 8611245 0.99
In [14]:
df = pd.read_sql_query(sql = """
SELECT Name, Composer, Milliseconds 
  FROM tracks 
 WHERE Milliseconds >= %d 
       AND 
       Milliseconds <= %d
 ORDER BY Milliseconds;
--====""" % (200000, 201000), con = sqlite_conn)
df
Out[14]:
Name Composer Milliseconds
0 Alabama Song Weill-Brecht 200097
1 Purgatory Steve Harris 200150
2 F**k Me Pumps Salaam Remi 200253
3 Why Go Jeff Ament 200254
4 Ice Cream Man John Brim 200306
5 Summertime Miles Davis 200437
6 Coroné Antonio Bento None 200437
7 Último Pau-De-Arara Corumbá/José Gumarães/Venancio 200437
8 The Wind Cries Mary Jimi Hendrix 200463
9 Over Again Tim Maia 200489
10 Rock Bottom Paul Stanley, Ace Frehley 200594
11 All My Love E. Schrody/L. Dimant 200620
12 Question! Tankian, Serj 200698
13 Posso Até Me Apaixonar Dudu Nobre 200698
14 Cold Day In The Sun Dave Grohl, Taylor Hawkins, Nate Mendel, Chris... 200724
15 Too Fast For Love Nikki Sixx 200829
16 O Cidadão Do Mundo Chico Science 200933

Join

A JOIN clause is used to combine rows from two or more tables, based on a related column between them.

Different Types of SQL JOINs

Here are the different types of the JOINs in SQL:

  • (INNER) JOIN: Returns records that have matching values in both tables

  • LEFT (OUTER) JOIN: Return all records from the left table, and the matched records from the right table

  • RIGHT (OUTER) JOIN: Return all records from the right table, and the matched records from the left table

  • FULL (OUTER) JOIN: Return all records when there is a match in either left or right table

In the following sentence we will join three tables: tracks, albums and artists. The result will be a table with the name of the artis/group, the title of the album and the name of the track. We will sort teh results by artis and album.

In [33]:
cur = sqlite_conn.execute("""
--==== ====================================================
--==== INICIO DE LA QUERY
SELECT artists.Name, albums.Title, tracks.Name
  FROM tracks
  LEFT JOIN albums ON albums.Albumid = tracks.Albumid
  LEFT JOIN artists ON albums.Artistid = artists.Artistid
 ORDER BY artists.Name, albums.Title ASC;
--==== FIN DE LA QUERY
--==== ====================================================  
""")
rows = cur.fetchmany(20) # Only the 20 first results
for row in rows:
    print(row)
rows = cur.fetchall() 
('AC/DC', 'For Those About To Rock We Salute You', 'For Those About To Rock (We Salute You)')
('AC/DC', 'For Those About To Rock We Salute You', 'Put The Finger On You')
('AC/DC', 'For Those About To Rock We Salute You', "Let's Get It Up")
('AC/DC', 'For Those About To Rock We Salute You', 'Inject The Venom')
('AC/DC', 'For Those About To Rock We Salute You', 'Snowballed')
('AC/DC', 'For Those About To Rock We Salute You', 'Evil Walks')
('AC/DC', 'For Those About To Rock We Salute You', 'C.O.D.')
('AC/DC', 'For Those About To Rock We Salute You', 'Breaking The Rules')
('AC/DC', 'For Those About To Rock We Salute You', 'Night Of The Long Knives')
('AC/DC', 'For Those About To Rock We Salute You', 'Spellbound')
('AC/DC', 'Let There Be Rock', 'Go Down')
('AC/DC', 'Let There Be Rock', 'Dog Eat Dog')
('AC/DC', 'Let There Be Rock', 'Let There Be Rock')
('AC/DC', 'Let There Be Rock', 'Bad Boy Boogie')
('AC/DC', 'Let There Be Rock', 'Problem Child')
('AC/DC', 'Let There Be Rock', 'Overdose')
('AC/DC', 'Let There Be Rock', "Hell Ain't A Bad Place To Be")
('AC/DC', 'Let There Be Rock', 'Whole Lotta Rosie')
('Aaron Copland & London Symphony Orchestra', 'A Copland Celebration, Vol. I', 'Fanfare for the Common Man')
('Aaron Goldberg', 'Worlds', "OAM's Blues")

A example with analysis. We will calculate the time average by album in milliseconds.

In [35]:
cur = sqlite_conn.execute("""
--==== ====================================================
--==== INICIO DE LA QUERY
SELECT artists.Name, albums.Title, tracks.Name, ROUND(AVG(tracks.Milliseconds),0)
  FROM tracks -- TABLA PRINCIPAL
  LEFT JOIN albums ON albums.Albumid = tracks.Albumid
  LEFT JOIN artists ON albums.Artistid = artists.Artistid
 GROUP BY artists.Name, albums.Title
 ORDER BY artists.Name, albums.Title ASC;
--==== FIN DE LA QUERY
--==== ====================================================  
""")
rows = cur.fetchmany(10)
for row in rows:
    print(row)
rows = cur.fetchall()
('AC/DC', 'For Those About To Rock We Salute You', 'Spellbound', 240042.0)
('AC/DC', 'Let There Be Rock', 'Whole Lotta Rosie', 306657.0)
('Aaron Copland & London Symphony Orchestra', 'A Copland Celebration, Vol. I', 'Fanfare for the Common Man', 198064.0)
('Aaron Goldberg', 'Worlds', "OAM's Blues", 266936.0)
('Academy of St. Martin in the Fields & Sir Neville Marriner', 'The World of Classical Favourites', 'Fantasia On Greensleeves', 232601.0)
('Academy of St. Martin in the Fields Chamber Ensemble & Sir Neville Marriner', 'Sir Neville Marriner: A Celebration', '"Eine Kleine Nachtmusik" Serenade In G, K. 525: I. Allegro', 348971.0)
('Academy of St. Martin in the Fields, John Birch, Sir Neville Marriner & Sylvia McNair', 'Fauré: Requiem, Ravel: Pavane & Others', 'Requiem, Op.48: 4. Pie Jesu', 258924.0)
('Academy of St. Martin in the Fields, Sir Neville Marriner & Thurston Dart', 'Bach: Orchestral Suites Nos. 1 - 4', 'Suite No. 3 in D, BWV 1068: III. Gavotte I & II', 225933.0)
('Accept', 'Balls to the Wall', 'Balls to the Wall', 342562.0)
('Accept', 'Restless and Wild', 'Princess of the Dawn', 286029.0)

We will calculate the number of tracks in each album.

In [34]:
cur = sqlite_conn.execute("""
--==== ====================================================
--==== INICIO DE LA QUERY
SELECT artists.Name, albums.Title, COUNT(tracks.Name) AS N
  FROM tracks
  LEFT JOIN albums ON albums.Albumid = tracks.Albumid
  LEFT JOIN artists ON albums.Artistid = artists.Artistid
 GROUP BY artists.Name, albums.Title
 ORDER BY N DESC;
--==== FIN DE LA QUERY
--==== ====================================================  
""")
rows = cur.fetchmany(10)
for row in rows:
    print(row)
rows = cur.fetchall()
('Lenny Kravitz', 'Greatest Hits', 57)
('Chico Buarque', 'Minha Historia', 34)
('Eric Clapton', 'Unplugged', 30)
('Lost', 'Lost, Season 3', 26)
('Lost', 'Lost, Season 1', 25)
('The Office', 'The Office, Season 3', 25)
('Battlestar Galactica (Classic)', 'Battlestar Galactica (Classic), Season 1', 24)
('Frank Sinatra', 'My Way: The Best Of Frank Sinatra [Disc 1]', 24)
('Lost', 'Lost, Season 2', 24)
('Chico Science & Nação Zumbi', 'Afrociberdelia', 23)

SQL statement from file

In the previous examples we have seen SQL codes very large, with comments and several analysis. If is a code that will be used in several examples or analysis, then we will be more efficient if we will save the SQL code in a file.

Example 1

If the SQL statement is very large then is a better way to save the SQL in a file *.sql and load it in the read_sql_query. In the file example.sql is written the following SQL code.

--====
SELECT Name, Composer, Milliseconds 
  FROM tracks 
 WHERE Milliseconds >= 200000
       AND 
       Milliseconds <= 201000
 ORDER BY Milliseconds;
--====
In [24]:
f = open("example.sql")
sql_statement = f.read()
f.close()
In [25]:
print(sql_statement)
--====
SELECT Name, Composer, Milliseconds 
  FROM tracks 
 WHERE Milliseconds >= 200000 
       AND 
       Milliseconds <= 201000
 ORDER BY Milliseconds;
--====

In [26]:
df = pd.read_sql_query(sql = sql_statement, 
                       con = sqlite_conn)
df
Out[26]:
Name Composer Milliseconds
0 Alabama Song Weill-Brecht 200097
1 Purgatory Steve Harris 200150
2 F**k Me Pumps Salaam Remi 200253
3 Why Go Jeff Ament 200254
4 Ice Cream Man John Brim 200306
5 Summertime Miles Davis 200437
6 Coroné Antonio Bento None 200437
7 Último Pau-De-Arara Corumbá/José Gumarães/Venancio 200437
8 The Wind Cries Mary Jimi Hendrix 200463
9 Over Again Tim Maia 200489
10 Rock Bottom Paul Stanley, Ace Frehley 200594
11 All My Love E. Schrody/L. Dimant 200620
12 Question! Tankian, Serj 200698
13 Posso Até Me Apaixonar Dudu Nobre 200698
14 Cold Day In The Sun Dave Grohl, Taylor Hawkins, Nate Mendel, Chris... 200724
15 Too Fast For Love Nikki Sixx 200829
16 O Cidadão Do Mundo Chico Science 200933

Example 2

If the SQL statement is very large then is a better way to save the SQL in a file *.sql and load it in the read_sql_query. In the file example2.sql is written the following SQL code.

--====
SELECT Name, Composer, Milliseconds 
  FROM %s 
 WHERE Milliseconds >= %d
       AND 
       Milliseconds <= %d
 ORDER BY Milliseconds;
--====

We can see the %s (string) and %d (integer) insertions. These parameters allow the reutilitation of the SQL sentence.

In [27]:
f = open("example2.sql")
sql_statement = f.read()
f.close()
In [30]:
df = pd.read_sql_query(sql = sql_statement % ('tracks', 200000, 200500), 
                       con = sqlite_conn)
df
Out[30]:
Name Composer Milliseconds
0 Alabama Song Weill-Brecht 200097
1 Purgatory Steve Harris 200150
2 F**k Me Pumps Salaam Remi 200253
3 Why Go Jeff Ament 200254
4 Ice Cream Man John Brim 200306
5 Summertime Miles Davis 200437
6 Coroné Antonio Bento None 200437
7 Último Pau-De-Arara Corumbá/José Gumarães/Venancio 200437
8 The Wind Cries Mary Jimi Hendrix 200463
9 Over Again Tim Maia 200489
In [31]:
df = pd.read_sql_query(sql = sql_statement % ('tracks', 200500, 201000), 
                       con = sqlite_conn)
df
Out[31]:
Name Composer Milliseconds
0 Rock Bottom Paul Stanley, Ace Frehley 200594
1 All My Love E. Schrody/L. Dimant 200620
2 Question! Tankian, Serj 200698
3 Posso Até Me Apaixonar Dudu Nobre 200698
4 Cold Day In The Sun Dave Grohl, Taylor Hawkins, Nate Mendel, Chris... 200724
5 Too Fast For Love Nikki Sixx 200829
6 O Cidadão Do Mundo Chico Science 200933

Close de connect

We close the conexion with database with the following command:

In [15]:
sqlite_conn.close()