# Digital Musicology Tutorium Week 4: MIDI Data Explorations

In [23]:
# import packages
using DigitalMusicology
using DataFrames
using Plots


## 1) Convert Frequencies to Pitches and Vice Versa

In [24]:
# A4 is set to 440Hz
frequency_to_pitch(f) = 69 + 12 * log2(f / 440)

pitch_to_frequency(p) = 2^((p - 69) / 12) * 440

Out[24]:
pitch_to_frequency (generic function with 1 method)
In [25]:
# the frequency of C4
pitch_to_frequency(60)

Out[25]:
261.6255653005986
In [26]:
# the lowest midi note has a frequency of about 8Hz,
# humans typically hear between 20Hz and 20kHz
pitch_to_frequency(0), pitch_to_frequency(127)

Out[26]:
(8.175798915643707, 12543.853951415975)
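The two conversions are inverses of each other; a quick round-trip check (using the functions defined above) confirms this up to floating-point error:

```julia
frequency_to_pitch(f) = 69 + 12 * log2(f / 440)
pitch_to_frequency(p) = 2^((p - 69) / 12) * 440

# converting pitch -> frequency -> pitch recovers the original value
all(abs(frequency_to_pitch(pitch_to_frequency(p)) - p) < 1e-9 for p in 0:127)
```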

## 2) Example of this Week's Tutorial: Iron Maiden, "Run to the Hills"

Find the MIDI file here.

The MIDI file can also be opened in GarageBand and MuseScore!

## 3) Read MIDI File as a DataFrame

• A MIDI file is essentially a list of MIDI events / MIDI commands
• The events are note-on, note-off, and meta events, grouped by channels (representing voices), which are in turn grouped by tracks (representing instruments)
• For note events, pitch and velocity are specified
• Meta events e.g. specify the key of a piece or indicate tempo changes
• To work with a MIDI file, we convert it into a list of notes
In [27]:
notes = midifilenotes("Run_To_The_Hills.mid")

Out[27]:
| | onset_ticks | offset_ticks | onset_wholes | offset_wholes | onset_secs | offset_secs | pitch | velocity | track | channel | key_sharps | key_major |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 46 | 0//1 | 23//384 | 0.0 | 0.10569841666666667 | 42 | 110 | 7 | 9 | 0 | true |
| 2 | 0 | 160 | 0//1 | 5//24 | 0.0 | 0.3676466666666667 | 36 | 110 | 7 | 9 | 0 | true |
| 3 | 48 | 94 | 1//16 | 47//384 | 0.110294 | 0.2159924166666667 | 42 | 110 | 7 | 9 | 0 | true |
| 4 | 96 | 142 | 1//8 | 71//384 | 0.220588 | 0.3262864166666667 | 42 | 110 | 7 | 9 | 0 | true |
| 5 | 144 | 190 | 3//16 | 95//384 | 0.330882 | 0.4365804166666667 | 42 | 110 | 7 | 9 | 0 | true |
| 6 | 192 | 238 | 1//4 | 119//384 | 0.441176 | 0.5468744166666667 | 42 | 110 | 7 | 9 | 0 | true |
• key_sharps and key_major have default values (0 sharps, major)
• Be careful if you see the default setting (0, true), like here — it may not reflect the actual key of the piece!
In [28]:
notes[:key_sharps] |> unique

Out[28]:
1-element DataArrays.DataArray{Int64,1}:
0
In [29]:
?midifilenotes

search: midifilenotes


Out[29]:
midifilenotes(file; warnings=false, overlaps=:queue, orphans=:skip)

Reads a midi file and returns a DataFrame with one row per note. On- and offset times are given in ticks, whole notes, and seconds. The data frame has the following columns:

• onset_ticks (Int)
• offset_ticks (Int)
• onset_wholes (Rational{Int})
• offset_wholes (Rational{Int})
• onset_secs (Rational{Int})
• offset_secs (Rational{Int})
• pitch (MidiPitch)
• velocity (Int)
• channel (Int)
• track (Int)
• key_sharps (Int)
• key_major (Bool)

If warnings is true, warnings about encoding errors will be displayed. If two notes overlap on the same channel and track (e.g. two ons, then two offs for the same pitch) overlaps provides the strategy for interpreting the sequence of on and off events:

• :queue matches ons and offs in a FIFO manner (first on to first off).
• :stack matches ons and offs in a LIFO manner (first on to last off).

orphans determines what happens to on and off events without counterpart. Currently, its value is ignored and orphan events are always skipped.
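The difference between the two matching strategies can be illustrated with a tiny standalone sketch (not the library's implementation — just the matching logic for overlapping events of a single pitch on one channel and track):

```julia
# two note-ons at times 0 and 1, followed by two note-offs at times 2 and 3
ons  = [0, 1]
offs = [2, 3]

# :queue (FIFO): first on is matched to first off
queue_pairs = collect(zip(ons, offs))           # (0, 2) and (1, 3)

# :stack (LIFO): first on is matched to last off
stack_pairs = collect(zip(ons, reverse(offs)))  # (0, 3) and (1, 2)
```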

In [30]:
# this is a small pipeline
notes[:track] |> unique |> sort

Out[30]:
6-element DataArrays.DataArray{Int64,1}:
2
3
4
5
6
7
In [31]:
# it is equivalent to
sort(unique(notes[:track]))

Out[31]:
6-element DataArrays.DataArray{Int64,1}:
2
3
4
5
6
7
In [32]:
track_names = ["meta", "vocals", "vocal harmony", "guitar 1", "guitar 2", "bass", "drums"]

Out[32]:
7-element Array{String,1}:
"meta"
"vocals"
"vocal harmony"
"guitar 1"
"guitar 2"
"bass"
"drums"        
• Track 1 is reserved for meta events!
• The other tracks contain vocals, vocal harmony, guitar 1, guitar 2, bass, and drums
• One row in the data frame represents one note
• The columns are the features that we know about the notes

## 4) A look at the first bar

In [33]:
# drums in first bar
# there are 192 ticks per quarter note
notes[1:20, [:onset_wholes, :onset_ticks]]

Out[33]:
| | onset_wholes | onset_ticks |
|---|---|---|
| 1 | 0//1 | 0 |
| 2 | 0//1 | 0 |
| 3 | 1//16 | 48 |
| 4 | 1//8 | 96 |
| 5 | 3//16 | 144 |
| 6 | 1//4 | 192 |
| 7 | 1//4 | 192 |
| 8 | 5//16 | 240 |
| 9 | 3//8 | 288 |
| 10 | 7//16 | 336 |
| 11 | 1//2 | 384 |
| 12 | 1//2 | 384 |
| 13 | 9//16 | 432 |
| 14 | 5//8 | 480 |
| 15 | 11//16 | 528 |
| 16 | 3//4 | 576 |
| 17 | 3//4 | 576 |
| 18 | 7//8 | 672 |
| 19 | 1//1 | 768 |
| 20 | 1//1 | 768 |
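With 192 ticks per quarter note, a whole note spans 4 × 192 = 768 ticks, so the two columns are related by a simple conversion (a sketch assuming that resolution):

```julia
const TICKS_PER_WHOLE = 4 * 192  # 192 ticks per quarter note

ticks_to_wholes(t) = t // TICKS_PER_WHOLE
wholes_to_ticks(w) = Int(w * TICKS_PER_WHOLE)

ticks_to_wholes(48)     # 1//16
wholes_to_ticks(5//16)  # 240
```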

## 5) Exploratory Analysis

1. Pitch histogram
2. Pitch class histogram
3. Pitch class histogram per instrument
4. Note duration histogram per instrument
5. Beat histogram (onsets per beat) per instrument

## 5.1) Plot a pitch histogram

In [34]:
# select the instruments other than the drums
not_percussive_notes = notes[notes[:track] .!= 7, :]

Out[34]:
| | onset_ticks | offset_ticks | onset_wholes | offset_wholes | onset_secs | offset_secs | pitch | velocity | track | channel | key_sharps | key_major |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 2976 | 3214 | 31//8 | 1607//384 | 6.838228 | 7.385102416666667 | 86 | 110 | 4 | 2 | 0 | true |
| 2 | 2976 | 3070 | 31//8 | 1535//384 | 6.838228 | 7.054220416666666 | 62 | 110 | 5 | 3 | 0 | true |
| 3 | 2976 | 3070 | 31//8 | 1535//384 | 6.838228 | 7.054220416666666 | 69 | 110 | 5 | 3 | 0 | true |
| 4 | 2976 | 3070 | 31//8 | 1535//384 | 6.838228 | 7.054220416666666 | 74 | 110 | 5 | 3 | 0 | true |
| 5 | 2976 | 3070 | 31//8 | 1535//384 | 6.838228 | 7.054220416666666 | 38 | 110 | 6 | 4 | 0 | true |
| 6 | 3072 | 3214 | 4//1 | 1607//384 | 7.058816 | 7.385102416666667 | 64 | 110 | 5 | 3 | 0 | true |
In [35]:
# convert midi notes to pitch numbers
pitches = [note.pitch for note in not_percussive_notes[:pitch]]
pitches[1:10]

Out[35]:
10-element Array{Int64,1}:
86
62
69
74
38
64
71
76
40
86
In [36]:
# plot pitch histogram
histogram(pitches, bins=collect(0:127))

Out[36]:
(plot: pitch histogram)
## 5.2) Plot a pitch class histogram

In [37]:
# plot pitch class histogram
pitch_classes = [mod(p, 12) for p in pitches]
xticks = (collect(0:11) .+ 0.5, collect(0:11))
histogram(pitch_classes, bins=12, xticks=xticks)

Out[37]:
(plot: pitch class histogram)
Looks like a C major scale!

In [38]:
# shift the histogram so that the most prominent note is in front
shifted_xticks = (collect(0:11) .+ 0.5, [mod(pc+7, 12) for pc in 0:11])
histogram(
[mod(pc-7, 12) for pc in pitch_classes],
bins=12,
xticks=shifted_xticks
)

Out[38]:
(plot: shifted pitch class histogram)
In [39]:
transform(p) = mod((p+4)*7, 12)
histogram(
transform.(pitch_classes),
bins   = 12,
xticks = (transform.(collect(0:11)) .+ 0.5, (collect(0:11))),
xlim   = (0,12)
)

Out[39]:
(plot: pitch class histogram in circle-of-fifths order)
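Why does mod((p + 4) * 7, 12) work? Multiplying a pitch class by 7 (a perfect fifth in semitones) modulo 12 reorders the chromatic scale into circle-of-fifths order, and adding 4 before multiplying only rotates the result so that C lands near the middle of the axis. Sorting the twelve pitch classes by the transform makes this visible (pc_names is an assumed spelling of the pitch classes, not from the library):

```julia
transform(p) = mod((p + 4) * 7, 12)

pc_names = ["C", "C#", "D", "Eb", "E", "F", "F#", "G", "Ab", "A", "Bb", "B"]

# sorting the twelve pitch classes by their transformed position
# yields circle-of-fifths order: Ab Eb Bb F C G D A E B F# C#
order = sort(collect(0:11), by = transform)
[pc_names[pc + 1] for pc in order]
```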

## 5.3) Plot a Pitch Class Histogram per Instrument

In [40]:
for track in 2:6
display(
histogram(
[mod(note.pitch, 12) for note in notes[notes[:track] .== track, :pitch]],
bins   = collect(0:12),
xticks = xticks,
title  = track_names[track],
xlim = (0,12)
)
)
end


## 5.4) Plot a Note Duration Histogram per Instrument

In [41]:
onsets  = notes[notes[:track] .== 4, :onset_wholes]
offsets = notes[notes[:track] .== 4, :offset_wholes]
durations = offsets - onsets

Out[41]:
1691-element DataArrays.DataArray{Rational{Int64},1}:
119//384
71//384
9//16
71//384
71//384
9//16
71//384
71//384
9//16
71//384
71//384
9//16
71//384
⋮
5//128
5//128
5//128
5//128
5//128
5//128
263//384
263//384
263//384
5//24
5//24
5//24 
In [42]:
# note-offs arrive slightly early; adding 1 to the reduced numerator
# snaps each duration back to its nominal value
[(d.num + 1) // d.den for d in durations]

Out[42]:
1691-element Array{Rational{Int64},1}:
5//16
3//16
5//8
3//16
3//16
5//8
3//16
3//16
5//8
3//16
3//16
5//8
3//16
⋮
3//64
3//64
3//64
3//64
3//64
3//64
11//16
11//16
11//16
1//4
1//4
1//4 
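What does (d.num + 1) // d.den do? In this file every note-off arrives slightly before the nominal end of its note, so an observed duration like 71//384 falls exactly one step short of the intended 3//16 = 72//384. Because Julia's Rational is always stored in reduced form, adding 1 to the numerator bumps the value up by exactly one unit of the reduced denominator, snapping each duration to its nominal value. A sketch on a few values from above, using numerator/denominator (equivalent to the d.num and d.den field accesses used in the cell):

```julia
# snap an observed duration up by one unit of its reduced denominator
snap(d) = (numerator(d) + 1) // denominator(d)

snap(71//384)   # 3//16 (dotted eighth)
snap(119//384)  # 5//16
snap(5//24)     # 1//4  (quarter note)
```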
In [43]:
for track in 2:6
onsets  = notes[notes[:track] .== track, :onset_wholes]
offsets = notes[notes[:track] .== track, :offset_wholes]
durations = offsets - onsets
display(histogram([(d.num + 1) // d.den for d in durations], title=track_names[track]))
end


## 5.5) Plot beat histogram (onsets per beat) per instrument

In [44]:
for track in 2:6
onsets  = notes[notes[:track] .== track, :onset_wholes]
display(
histogram(
[o - floor(o) for o in onsets],
bins  = [x*1/16 for x in 0:16],
title = track_names[track]
)
)
end
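The expression o - floor(o) discards the bar number (in 4/4, one bar is one whole note) and keeps only the metric position of each onset within its bar, which the histogram then bins into sixteenths. A quick check of the idea:

```julia
# metric position of an onset within its bar (onsets measured in whole notes)
bar_position(o) = o - floor(o)

bar_position(9//16)   # 9//16 (inside the first bar)
bar_position(25//16)  # 9//16 (same position, one bar later)
```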