Digital Musicology Tutorium Week 4: MIDI Data Explorations

In [23]:
# import packages
using DigitalMusicology
using DataFrames
using Plots

1) Convert Frequencies to Pitches and Vice Versa

In [24]:
# A4 is set to 440Hz
frequency_to_pitch(f) = 69 + 12 * log2(f / 440)

pitch_to_frequency(p) = 2^((p - 69) / 12) * 440
Out[24]:
pitch_to_frequency (generic function with 1 method)
In [25]:
# the frequency of C4
pitch_to_frequency(60)
Out[25]:
261.6255653005986
In [26]:
# the lowest MIDI note has a frequency of about 8Hz,
# the highest about 12.5kHz
# humans typically hear between 20Hz and 20kHz
pitch_to_frequency(0), pitch_to_frequency(127)
Out[26]:
(8.175798915643707, 12543.853951415975)
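
A quick sanity check: the two functions should invert each other, and one semitone step should scale the frequency by a factor of 2^(1/12).

```julia
frequency_to_pitch(f) = 69 + 12 * log2(f / 440)
pitch_to_frequency(p) = 2^((p - 69) / 12) * 440

# A4 (440Hz) is MIDI pitch 69
@assert frequency_to_pitch(440) == 69

# round trip recovers the pitch (up to floating point error)
@assert isapprox(frequency_to_pitch(pitch_to_frequency(60)), 60; atol=1e-9)

# one semitone corresponds to a frequency ratio of 2^(1/12)
@assert isapprox(pitch_to_frequency(61) / pitch_to_frequency(60), 2^(1/12))
```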

2) Example of this Week's Tutorial: Iron Maiden, "Run to the Hills"

song on youtube

find midi file here

The midi file can also be read by Garageband and MuseScore!

3) Read MIDI File as a DataFrame

  • A MIDI file is essentially a list of MIDI events / MIDI commands
  • There are note-on, note-off, and meta events, grouped by channels (representing voices), which are in turn grouped by tracks (representing instruments)
  • For note events, pitch and velocity are specified
  • Meta events e.g. specify the key of a piece or indicate tempo changes
  • To work with a MIDI file, we convert it into a list of notes
In [27]:
notes = midifilenotes("Run_To_The_Hills.mid")
head(notes)
Out[27]:
|   | onset_ticks | offset_ticks | onset_wholes | offset_wholes | onset_secs | offset_secs | pitch | velocity | track | channel | key_sharps | key_major |
|---|-------------|--------------|--------------|---------------|------------|-------------|-------|----------|-------|---------|------------|-----------|
| 1 | 0   | 46  | 0//1  | 23//384  | 0.0      | 0.10569841666666667 | 42 | 110 | 7 | 9 | 0 | true |
| 2 | 0   | 160 | 0//1  | 5//24    | 0.0      | 0.3676466666666667  | 36 | 110 | 7 | 9 | 0 | true |
| 3 | 48  | 94  | 1//16 | 47//384  | 0.110294 | 0.2159924166666667  | 42 | 110 | 7 | 9 | 0 | true |
| 4 | 96  | 142 | 1//8  | 71//384  | 0.220588 | 0.3262864166666667  | 42 | 110 | 7 | 9 | 0 | true |
| 5 | 144 | 190 | 3//16 | 95//384  | 0.330882 | 0.4365804166666667  | 42 | 110 | 7 | 9 | 0 | true |
| 6 | 192 | 238 | 1//4  | 119//384 | 0.441176 | 0.5468744166666667  | 42 | 110 | 7 | 9 | 0 | true |
  • key_sharps and key_major have default values
  • Be careful if you see the default setting (0,true) like here!
In [28]:
notes[:key_sharps] |> unique
Out[28]:
1-element DataArrays.DataArray{Int64,1}:
 0
In [29]:
?midifilenotes
search: midifilenotes

Out[29]:
midifilenotes(file; warnings=false, overlaps=:queue, orphans=:skip)

Reads a midi file and returns a DataFrame with one row per note. On- and offset times are given in ticks, whole notes, and seconds. The data frame has the following columns:

  • onset_ticks (Int)
  • offset_ticks (Int)
  • onset_wholes (Rational{Int})
  • offset_wholes (Rational{Int})
  • onset_secs (Rational{Int})
  • offset_secs (Rational{Int})
  • pitch (MidiPitch)
  • velocity (Int)
  • channel (Int)
  • track (Int)
  • key_sharps (Int)
  • key_major (Bool)

If warnings is true, warnings about encoding errors will be displayed. If two notes overlap on the same channel and track (e.g. two ons, then two offs for the same pitch) overlaps provides the strategy for interpreting the sequence of on and off events:

  • :queue matches ons and offs in a FIFO manner (first on to first off).
  • :stack matches ons and offs in a LIFO manner (first on to last off).

orphans determines what happens to on and off events without counterpart. Currently, its value is ignored and orphan events are always skipped.

In [30]:
# this is a small pipeline
notes[:track] |> unique |> sort
Out[30]:
6-element DataArrays.DataArray{Int64,1}:
 2
 3
 4
 5
 6
 7
In [31]:
# it is equivalent to
sort(unique(notes[:track]))
Out[31]:
6-element DataArrays.DataArray{Int64,1}:
 2
 3
 4
 5
 6
 7
In [32]:
track_names = ["meta", "vocals", "vocal harmony", "guitar 1", "guitar 2", "bass", "drums"]
Out[32]:
7-element Array{String,1}:
 "meta"         
 "vocals"       
 "vocal harmony"
 "guitar 1"     
 "guitar 2"     
 "bass"         
 "drums"        
  • Track 1 is reserved for meta events!
  • The other tracks are for vocals, vocal harmony, guitar 1, guitar 2, bass, drums
  • one row in the data frame represents one note
  • the columns are the features that we know about the notes

4) A look at the first bar

In [33]:
# drums in first bar
# there are 192 ticks per quarter note
notes[1:20, [:onset_wholes, :onset_ticks]]
Out[33]:
|    | onset_wholes | onset_ticks |
|----|--------------|-------------|
| 1  | 0//1   | 0   |
| 2  | 0//1   | 0   |
| 3  | 1//16  | 48  |
| 4  | 1//8   | 96  |
| 5  | 3//16  | 144 |
| 6  | 1//4   | 192 |
| 7  | 1//4   | 192 |
| 8  | 5//16  | 240 |
| 9  | 3//8   | 288 |
| 10 | 7//16  | 336 |
| 11 | 1//2   | 384 |
| 12 | 1//2   | 384 |
| 13 | 9//16  | 432 |
| 14 | 5//8   | 480 |
| 15 | 11//16 | 528 |
| 16 | 3//4   | 576 |
| 17 | 3//4   | 576 |
| 18 | 7//8   | 672 |
| 19 | 1//1   | 768 |
| 20 | 1//1   | 768 |
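
With 192 ticks per quarter note, a whole note spans 4 · 192 = 768 ticks. A small (hypothetical) helper makes the tick-to-wholes conversion in the table explicit; Julia's `Rational` type reduces the fractions automatically:

```julia
# 192 ticks per quarter note, so 768 ticks per whole note
const TICKS_PER_WHOLE = 4 * 192

# hypothetical helper: convert a tick count to whole-note units
ticks_to_wholes(t) = t // TICKS_PER_WHOLE

@assert ticks_to_wholes(48) == 1//16   # row 3 of the table above
@assert ticks_to_wholes(192) == 1//4   # a quarter note
@assert ticks_to_wholes(768) == 1//1   # a full bar in 4/4
```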

5) Exploratory Analysis

  1. Pitch histogram
  2. Pitch class histogram
  3. Pitch class histogram per instrument
  4. Note duration histogram per instrument
  5. Beat histogram (onsets per beat) per instrument

5.1) Plot a pitch histogram

In [34]:
# select the instruments other than the drums
not_percussive_notes = notes[notes[:track] .!= 7, :]
head(not_percussive_notes)
Out[34]:
|   | onset_ticks | offset_ticks | onset_wholes | offset_wholes | onset_secs | offset_secs | pitch | velocity | track | channel | key_sharps | key_major |
|---|-------------|--------------|--------------|---------------|------------|-------------|-------|----------|-------|---------|------------|-----------|
| 1 | 2976 | 3214 | 31//8 | 1607//384 | 6.838228 | 7.385102416666667 | 86 | 110 | 4 | 2 | 0 | true |
| 2 | 2976 | 3070 | 31//8 | 1535//384 | 6.838228 | 7.054220416666666 | 62 | 110 | 5 | 3 | 0 | true |
| 3 | 2976 | 3070 | 31//8 | 1535//384 | 6.838228 | 7.054220416666666 | 69 | 110 | 5 | 3 | 0 | true |
| 4 | 2976 | 3070 | 31//8 | 1535//384 | 6.838228 | 7.054220416666666 | 74 | 110 | 5 | 3 | 0 | true |
| 5 | 2976 | 3070 | 31//8 | 1535//384 | 6.838228 | 7.054220416666666 | 38 | 110 | 6 | 4 | 0 | true |
| 6 | 3072 | 3214 | 4//1  | 1607//384 | 7.058816 | 7.385102416666667 | 64 | 110 | 5 | 3 | 0 | true |
In [35]:
# convert midi notes to pitch numbers
pitches = [note.pitch for note in not_percussive_notes[:pitch]]
pitches[1:10]
Out[35]:
10-element Array{Int64,1}:
 86
 62
 69
 74
 38
 64
 71
 76
 40
 86
In [36]:
# plot pitch histogram
histogram(pitches, bins=collect(0:127))
Out[36]:
[Plot: pitch histogram, MIDI pitches 0 to 127 on the x-axis]

5.2) Plot a pitch class histogram

In [37]:
# plot pitch class histogram
pitch_classes = [mod(p, 12) for p in pitches]
xticks = (collect(0:11) .+ 0.5, collect(0:11))
histogram(pitch_classes, bins=12, xticks=xticks)
Out[37]:
[Plot: pitch class histogram, classes 0 to 11 on the x-axis]

Looks like a C major scale!

In [38]:
# shift the histogram so that the most prominent note is in front
shifted_xticks = (collect(0:11) .+ 0.5, [mod(pc+7, 12) for pc in 0:11])
histogram(
    [mod(pc-7, 12) for pc in pitch_classes], 
    bins=12, 
    xticks=shifted_xticks
)
Out[38]:
[Plot: shifted pitch class histogram, x-axis labeled 7 8 9 10 11 0 1 2 3 4 5 6]
In [39]:
transform(p) = mod((p+4)*7, 12)
histogram(
    transform.(pitch_classes), 
    bins   = 12, 
    xticks = (transform.(collect(0:11)) .+ 0.5, (collect(0:11))), 
    xlim   = (0,12)
)
Out[39]:
[Plot: pitch class histogram with the x-axis reordered along the circle of fifths]
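
Why multiplying by 7 works: 7 is its own inverse modulo 12 (7 · 7 = 49 ≡ 1), so `transform` turns chromatic steps into fifth steps and thereby reorders the x-axis along the circle of fifths; the offset +4 merely rotates the axis. A minimal check:

```julia
transform(p) = mod((p + 4) * 7, 12)

# stepping up a perfect fifth (7 semitones) moves exactly one bin to the right
@assert mod(transform(7) - transform(0), 12) == 1

# stepping up a semitone jumps 7 bins, i.e. chromatic neighbours are spread apart
@assert mod(transform(1) - transform(0), 12) == 7
```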

5.3) Plot a Pitch Class Histogram per Instrument

In [40]:
for track in 2:6
    display(
        histogram(
            [mod(note.pitch, 12) for note in notes[notes[:track] .== track, :pitch]], 
            bins   = collect(0:12),
            xticks = xticks,
            title  = track_names[track],
            xlim = (0,12)
        )
    )
end
[Plots: pitch class histograms per track (vocals, vocal harmony, guitar 1, guitar 2, bass)]

5.4) Plot a Note Duration Histogram per Instrument

In [41]:
onsets  = notes[notes[:track] .== 4, :onset_wholes]
offsets = notes[notes[:track] .== 4, :offset_wholes]
durations = offsets - onsets
Out[41]:
1691-element DataArrays.DataArray{Rational{Int64},1}:
 119//384
  71//384
   9//16 
  71//384
  71//384
   9//16 
  71//384
  71//384
   9//16 
  71//384
  71//384
   9//16 
  71//384
    ⋮    
   5//128
   5//128
   5//128
   5//128
   5//128
   5//128
 263//384
 263//384
 263//384
   5//24 
   5//24 
   5//24 
In [42]:
[(d.num + 1) // d.den for d in durations]
Out[42]:
1691-element Array{Rational{Int64},1}:
  5//16
  3//16
  5//8 
  3//16
  3//16
  5//8 
  3//16
  3//16
  5//8 
  3//16
  3//16
  5//8 
  3//16
   ⋮   
  3//64
  3//64
  3//64
  3//64
  3//64
  3//64
 11//16
 11//16
 11//16
  1//4 
  1//4 
  1//4 
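
The trick above relies on the offsets in this file falling just short of round note values: each reduced duration is exactly 1/denominator below the intended value, so incrementing the numerator snaps it to the nominal duration. A sketch with a hypothetical `fix` helper, checked against the durations listed above:

```julia
# hypothetical helper: snap a slightly-short duration to its nominal value
# (assumes the reduced fraction is exactly 1/denominator too small)
fix(d) = (numerator(d) + 1) // denominator(d)

@assert fix(119//384) == 5//16   # dotted eighth
@assert fix(9//16)    == 5//8
@assert fix(263//384) == 11//16
@assert fix(5//24)    == 1//4    # quarter note
```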
In [43]:
for track in 2:6
    onsets  = notes[notes[:track] .== track, :onset_wholes]
    offsets = notes[notes[:track] .== track, :offset_wholes]
    durations = offsets - onsets
    display(histogram([(d.num + 1) // d.den for d in durations], title=track_names[track]))
end
[Plots: note duration histograms per track (vocals, vocal harmony, guitar 1, guitar 2, bass)]

5.5) Plot beat histogram (onsets per beat) per instrument

In [44]:
for track in 2:6
    onsets  = notes[notes[:track] .== track, :onset_wholes]
    display(
        histogram(
            [o - floor(o) for o in onsets], 
            bins  = [x*1/16 for x in 0:16], 
            title = track_names[track]
        )
    )
end
[Plots: beat histograms per track (vocals, vocal harmony, guitar 1, guitar 2, bass)]
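
In the loop above, `o - floor(o)` folds every onset into a single whole note, i.e. one bar in 4/4, so the histograms show where within the bar each instrument tends to play. A minimal sketch of that folding:

```julia
# position of an onset within its bar (assuming one whole note per bar, 4/4)
beat_position(o) = o - floor(o)

@assert beat_position(9//16)  == 9//16  # already inside the first bar
@assert beat_position(19//16) == 3//16  # an onset in the second bar folds back
@assert beat_position(1//1)   == 0//1   # a downbeat lands on 0
```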