The 538 Riddler for March 22, 2019 asks us to simulate baseball using probabilities from a 19th century dice game called Our National Ball Game. The Riddler description of the rules said you can assume some standard baseball things but left many things unspecified, so I looked up the original rules of Our National Ball Game, which are shown below and which, it turns out, contradict some of the standard baseball things assumed by 538. I'll go with the rules as stated below.
RULES FOR PLAYING "OUR NATIONAL BALL GAME" | DICE ROLL OUTCOMES |
---|---|
Note: There is a great history of dice baseball here.
1,1
has probability 1/36, whereas 1,2
has probability 2/36, because it also represents 2,1
.(7/36)**3 == 0.00735
, and call that a strikeout event.1
, 2
, 3
, 4
: one-, two-, three-, and four-base (home run) hits. Runners advance same number of bases.B
: base on balls. Runners advance only if forced.D
: double play. Batter and runner nearest home are out; others advance one base.E
: error. Batter reaches first and all runners advance one base.F
, K
, O
: fly out, strikeout, foul out. Batter is out, runners do not advance.S
: called "out at first" in rules, but actually a sacrifice. Batter is out, runners advance one base.import matplotlib.pyplot as plt
import random
from statistics import mean, stdev
from collections import Counter
from itertools import islice
event_codes = {
'1': 'single', '2': 'double', '3': 'triple', '4': 'home run',
'B': 'base on balls', 'D': 'double play', 'E': 'error',
'F': 'fly out', 'K': 'strikeout', 'O': 'foul out', 'S': 'out at first'}
I'll define the function inning
to simulate a half inning and return the number of runs scored. Design choices for inning
:
runners = {1, 3}
means runners on first and third.runs
and outs
in an inning, and return the number of runs
when there are three outs
.runners = {1, 3}
and the event is '2'
(a double), then the steps are:0
, so now runners = {0, 1, 3}
.advance(r, e)
. In this case, runners = {2, 3, 5}
.scored
and increment runs
accordingly. In this case, runner 5
has scored, so we increment runs
by 1 and end up with runners = {2, 3}
.inning
to be easily testable: I want to say assert 2 = inning('1KO4F')
.inning
to be capable of simulating many independent random innings. So the interface is to accept an iterable of event codes. That could be string, or a generator, as provided by event_stream()
.inning
to be loggable: calling inning(events, verbose=True)
should produce printed output for each event.advance(r, e)
says that a runner advances e
bases on an e
base hit; one base on an error, sacrifice, or double play; and one base on a base on balls only if forced.r
is forced
if all the lower-numbered bases have runners.ONBG
is defined as a generator of random events with the probabilities from "Our National Ball Game".def inning(events, verbose=True) -> int:
"""Simulate a half inning based on events, and return number of runs scored."""
outs = runs = 0 # Inning starts with no outs and no runs,
runners = set() # and with nobody on base
def out(r) -> int: runners.remove(r); return 1
def forced(r) -> bool: return all(b in runners for b in range(r))
def advance(r, e) -> int:
return int(e if e in '1234' else (e in 'ESD' or (e == 'B' and forced(r))))
for e in events:
if verbose: show(outs, runs, runners, e)
runners.add(batter) # Batter steps up to the plate
if e == 'D' and len(runners) > 1: # Double play: batter and lead runner out
outs += out(batter) + out(max(runners))
elif e in 'DSKOF': # Batter is out
outs += out(batter)
if outs >= 3: # If inning is over: return runs scored
return runs
runners = {r + advance(r, e) for r in runners} # Runners advance
runs += len(runners & scored) # Tally runs
runners = runners - scored # Remove runners who scored
def event_stream(events, strikes=0):
"""A generator of random baseball events."""
while True:
yield 'K' if (random.random() < strikes ** 3) else random.choice(events)
def show(outs, runs, runners, event):
"""Print a representation of the current state of play."""
bases = ''.join(b if int(b) in runners else '-' for b in '321')
print(f'{outs} outs {runs} runs {bases} {event} ({event_codes[event]})')
ONBG = event_stream('2111111EEBBOOSSSSSSSFFFFFD334', 7/36) # Our National Ball Game
batter = 0 # The batter is not yet at first base
scored = {4, 5, 6, 7} # Runners in these positions have scored
Let's peek at some random innings:
inning(ONBG)
0 outs 0 runs --- 3 (triple) 0 outs 0 runs 3-- F (fly out) 1 outs 0 runs 3-- S (out at first) 2 outs 1 runs --- F (fly out)
1
inning(ONBG)
0 outs 0 runs --- F (fly out) 1 outs 0 runs --- S (out at first) 2 outs 0 runs --- S (out at first)
0
Let's also test some historic innings. I'll take some of the Red Sox innings from their 2004 playoff series against the Yankees.
# 7th inning in game 1: 5 runs (Homer by Varitek)
# (But not a perfect reproduction, because our simulation doesn't have passed balls.)
assert 5 == inning('K2O1214K')
0 outs 0 runs --- K (strikeout) 1 outs 0 runs --- 2 (double) 1 outs 0 runs -2- O (foul out) 2 outs 0 runs -2- 1 (single) 2 outs 0 runs 3-1 2 (double) 2 outs 1 runs 32- 1 (single) 2 outs 2 runs 3-1 4 (home run) 2 outs 5 runs --- K (strikeout)
# 4th inning in game 6: 4 runs (Homer by Bellhorn)
assert 4 == inning('SS2114F')
0 outs 0 runs --- S (out at first) 1 outs 0 runs --- S (out at first) 2 outs 0 runs --- 2 (double) 2 outs 0 runs -2- 1 (single) 2 outs 0 runs 3-1 1 (single) 2 outs 1 runs -21 4 (home run) 2 outs 4 runs --- F (fly out)
# 2nd inning in game 7: 4 runs (Grand Slam by Damon)
assert 4 == inning('S1BB4BFS')
0 outs 0 runs --- S (out at first) 1 outs 0 runs --- 1 (single) 1 outs 0 runs --1 B (base on balls) 1 outs 0 runs -21 B (base on balls) 1 outs 0 runs 321 4 (home run) 1 outs 4 runs --- B (base on balls) 1 outs 4 runs --1 F (fly out) 2 outs 4 runs --1 S (out at first)
That looks good to me.
Now, simulate a hundred thousand innings, and then sample from them to simulate a hundred thousand nine-inning games (for one team), and show histograms of the results, labelled with statistics:
def simulate(N=100000, inning=inning, events=ONBG) -> None:
innings = [inning(events=events, verbose=False) for _ in range(N)]
games = [sum(random.sample(innings, 9)) for _ in range(N)]
hist(innings, 'Runs/inning (for one team)')
hist(games, 'Runs/game (for one team)')
def hist(nums, title):
"""Plot a histogram and show some statistics."""
plt.hist(nums, ec='black', bins=max(nums)-min(nums), align='left')
plt.xlabel(title)
plt.title(f'μ: {mean(nums):.2f}, σ: {stdev(nums):.2f}, max: {max(nums)}')
plt.show()
%time simulate()
CPU times: user 2.52 s, sys: 16.6 ms, total: 2.54 s Wall time: 2.58 s
So, about 13 runs per game (per team). This shows that the dice game is not very realistic with respect to current-day baseball. It is true that games were higher-scoring 130 years ago, and perhaps a dice game is more fun when there is a lot of action.
Could I make the game reflect baseball as it is played today? To do so I would need:
inning
.inning
, depending on how the conversion goes.Baseball-reference.com has lots of stats, in particular MLB annual batting stats and fielding stats; I'll use the stats for the complete 2019 season. The batting stats have most of what we need, and the fielding stats give us double plays and errors.
I start by defining two utility functions that can be useful for any tabular data: cell_value
, which converts a table cell entry into an int
, float
, or str
as appropriate; and header_row_dict
, which creates a dict of {column_name: value}
entries. The function mlb_convert
then converts this format (a dict keyed by H/2B/3B/HR
etc.) into the event code format (a string of '1234...'
). As part of the conversion I'll add hit-by-pitch (HBP
) into the "base on balls" category, and I'll record all otherwise unaccounted-for outs under the "fly out" (F
) category (runners do not advance). With this understood, we won't need to change the function inning
at all. (It is true that mlb_convert
returns a very long string, equal in length to the number of plate appearances over the whole MLB season. But that takes up less space than storing one photo, so I'm not going to worry about it.)
def cell_value(entry, types=(int, float, str)):
"""Convert a cell entry into the first type that doesn't raise an error."""
for typ in types:
try:
return typ(entry)
except ValueError:
pass
def header_row_dict(header, row, sep=None, value=cell_value) -> dict:
"""Parse a header and table row into a dict of `{column_name: value(cell)}`."""
return dict(zip(header.split(sep), map(value, row.split(sep))))
def mlb_convert(stats: dict) -> str:
"""Given baseball stats return a string '11...FFF'."""
events = Counter({
'1': stats['H'] - stats['2B'] - stats['3B'] - stats['HR'],
'2': stats['2B'], '3': stats['3B'], '4': stats['HR'],
'E': stats['E'], 'B': stats['BB'] + stats['HBP'],
'K': stats['SO'], 'D': stats['DP'], 'S': stats['SH'] + stats['SF']})
events['F'] = stats['PA'] - sum(events.values()) # All unaccounted-for outs
return ''.join(events.elements()) # A str of events
Below I copy-and-paste the data I need from baseball-reference.com to create the dict mlb_stats
; convert it to the string mlb_string
; and use that to create the event generator mlb_stream
:
mlb_stats = header_row_dict(
"Year Tms #Bat BatAge R/G G PA AB R H 2B 3B HR RBI SB CS BB SO BA OBP SLG OPS TB GDP HBP SH SF IBB E DP",
"""2019 30 1284 27.9 4.84 4828 185377 165622 23346 41794 8485 783 6735 22358 2261 827 15806 42546
.252 .323 .435 .758 72050 3441 1968 774 1146 752 2882 3981""")
mlb_string = mlb_convert(mlb_stats)
mlb_stream = event_stream(mlb_string)
We can take a look:
mlb_stats
{'Year': 2019, 'Tms': 30, '#Bat': 1284, 'BatAge': 27.9, 'R/G': 4.84, 'G': 4828, 'PA': 185377, 'AB': 165622, 'R': 23346, 'H': 41794, '2B': 8485, '3B': 783, 'HR': 6735, 'RBI': 22358, 'SB': 2261, 'CS': 827, 'BB': 15806, 'SO': 42546, 'BA': 0.252, 'OBP': 0.323, 'SLG': 0.435, 'OPS': 0.758, 'TB': 72050, 'GDP': 3441, 'HBP': 1968, 'SH': 774, 'SF': 1146, 'IBB': 752, 'E': 2882, 'DP': 3981}
mlb_string[::1000] # Just look at every 1000th character
'111111111111111111111111112222222223444444EEEBBBBBBBBBBBBBBBBBBKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKDDDDSSFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF'
inning(mlb_stream)
0 outs 0 runs --- 2 (double) 0 outs 0 runs -2- F (fly out) 1 outs 0 runs -2- F (fly out) 2 outs 0 runs -2- F (fly out)
0
I can simulate:
%time simulate(events=mlb_stream)
CPU times: user 2.39 s, sys: 21.8 ms, total: 2.41 s Wall time: 2.46 s
That looks a lot more like real baseball. But MLB averaged 4.84 runs per team per game in 2019, and this is significantly lower. I think we can make some minor changes to the function inning
—some "standard baseball things"—to make the simulation more realistic. I'm thinking of two changes:
I'll make those two things the case for all double plays and singles.
def inning2(events, verbose=False) -> int:
"""Simulate a half inning based on events, and return number of runs scored."""
outs = runs = 0 # Inning starts with no outs and no runs,
runners = set() # and with nobody on base
def out(r) -> int: runners.remove(r); return 1
def forced(r) -> bool: return all(b in runners for b in range(r))
def advance(r, e) -> int:
return ((2 if r == 2 else int(e)) if e in '1234' else
(e in 'ESD' or (e == 'B' and forced(r))))
for e in events:
if verbose: show(outs, runs, runners, e)
runners.add(batter) # Batter steps up to the plate
if e == 'D' and 1 in runners: # Double play: batter and runner on first out
outs += out(batter) + out(1)
elif e in 'DSKOF': # Batter is out
outs += out(batter)
if outs >= 3: # If inning is over: return runs scored
return runs
runners = {r + advance(r, e) for r in runners} # Runners advance
runs += len(runners & scored) # Tally runs
runners = runners - scored # Remove runners who scored
We show the difference with two examples. First, a triple/walk/double-play sequence scores a run under inning2
but not inning
:
inning2('3BDK', True)
0 outs 0 runs --- 3 (triple) 0 outs 0 runs 3-- B (base on balls) 0 outs 0 runs 3-1 D (double play) 2 outs 1 runs --- K (strikeout)
1
inning('3BDK', True)
0 outs 0 runs --- 3 (triple) 0 outs 0 runs 3-- B (base on balls) 0 outs 0 runs 3-1 D (double play) 2 outs 0 runs -2- K (strikeout)
0
Second, a double/single sequence scores a run under inning2
but not inning
:
inning2('21FFF', True)
0 outs 0 runs --- 2 (double) 0 outs 0 runs -2- 1 (single) 0 outs 1 runs --1 F (fly out) 1 outs 1 runs --1 F (fly out) 2 outs 1 runs --1 F (fly out)
1
inning('21FFF', True)
0 outs 0 runs --- 2 (double) 0 outs 0 runs -2- 1 (single) 0 outs 0 runs 3-1 F (fly out) 1 outs 0 runs 3-1 F (fly out) 2 outs 0 runs 3-1 F (fly out)
0
We can simulate again and note any differences:
%time simulate(events=mlb_stream, inning=inning2)
CPU times: user 2.37 s, sys: 17.1 ms, total: 2.39 s Wall time: 2.41 s
There is a slight increase in the number of runs.
There are many problems with the code as it is. For example:
D
event code would come up that often, but perhaps only half the time there would be a runner and less than two outs, so we would only actually get a double play maybe 1% of the time.What can you do to make the simulation better?