Max Kapur | maxkapur.com
Surrogates.jl is an excellent Julia package that allows you to build surrogate models for black box functions. This approach interested me because many of the demand functions I use to model admissions markets are computationally very expensive.
For example, the model considered here assumes that schools admit students based on a convex combination of their characteristics—a linear assessment function of the form $$u_c(x) = \sum_{i=1}^n A_{ci} x_i$$ for each school $c$. The $A_{ci} \geq 0$, with $\sum A_{c.} = 1$, are the weights the school gives to the traits $x_i$. The traits are uniformly, independently distributed on $[0,1]$.
The matrix $A$ determines the probability of a student having a given consideration set $C^\# \in C!$, which is the $|C|$-dimensional volume of the following polyhedron:
$$\begin{align} A_{c.} x \geq p_c, \quad \forall c \in C^\# \\ A_{c.} x < p_c, \quad \forall c \notin C^\# \\ 0 \leq x \leq 1 \end{align} $$From $C^\#$, students choose thair favorite school according to a multinomial logit choice model. Letting $V\left(C^\#\right)$ denote the volume of the polyhedron, the demand for school $c$ is $$D_c(p) = \sum_{\substack{C^\# \in 2^{C}:\\ c \in C^\#}} \frac{\exp{\delta_c}}{\sum_{d\in C^\#} \exp{\delta_d}} V\left(C^\#\right)$$
Under these assumptions, the market equilibrium exists and is unique, but computing it is quite difficult in general because there are $2^{|C|}$ possible consideration sets (many with probability zero); hence the demand function is exponential in the number of schools.
The question here is if we can "learn" the form of $D$ from a small number of sample points. If so, and if the approximation is accurate enough, then the surrogate $D$ can be used in place of $D$ itself to search for the equilibrium by a gradient procedure.
In this notebook, I test this idea out and determine that the surrogate optimization approach is not worth it for the task at hand. Using the same number of calls to $D$ needed to form a workable approximation, we can run tâtonnement on $D$ itself and find a more accurate estimate of the equilibrium.
using Combinatorics
using Plots, Plots.PlotMeasures
using Polyhedra
using LinearAlgebra
using Surrogates
First I define some basic functions.
"""
Contains static information about a school-choice market.
"""
struct Market
qualities::Array{<:AbstractFloat, 1}
blends::Array{<:AbstractFloat, 2}
capacities::Array{<:AbstractFloat, 1}
end
"""
demand(market, cutoffs)
Return demand for each school given a set of cutoffs and ignoring capacity, using
multinomial logit choice model with one student profile and `t` test scores,
which represent orthogonal dimensions of student quality. Each school uses
a convex combination of the tests scores to evaluate students; the rows of `blends`
give these combinations. `qualities` are the MNL preferability parameters.
A simplification of `demands_pMNL_ttests()` from `DeferredAcceptance` package.
"""
function demand(market ::Market,
cutoffs ::AbstractArray{<:AbstractFloat, 1};
)::AbstractArray{<:AbstractFloat, 1}
qualities = market.qualities
blends = market.blends
(m, t) = size(blends)
demands = zeros(m)
γ = exp.(qualities)
bounds = vcat([HalfSpace(-col, 0.) for col in eachcol(I(t))],
[HalfSpace(1col, 1.) for col in eachcol(I(t))])
for C♯ in powerset(1:m) # Possible choice sets
if !isempty(C♯)
hspaces = HalfSpace{Float64,Array{Float64,1}}[]
# Probability of having this choice set is the volume of
# this m-dimensional polyhedron.
for c in C♯
if cutoffs[c] ≥ 1
@goto volume_is_zero
elseif cutoffs[c] > 0
push!(hspaces, HalfSpace(-blends[c, :], -cutoffs[c]))
end
end
for c in setdiff(1:m, C♯)
if cutoffs[c] ≤ 0
@goto volume_is_zero
elseif cutoffs[c] < 1
push!(hspaces, HalfSpace( blends[c, :], cutoffs[c]))
end
end
poly = polyhedron(hrep(vcat(bounds, hspaces)))
vol = volume(poly)
# @show vol, poly
mult = vol / sum(γ[C♯])
for c in C♯
demands[c] += mult * γ[c]
end
@label volume_is_zero
end
end
return demands
end
"""
quickeq(market, demand)
Very low-tech tatonnement procedure for finding equibilibrium.
"""
function quickeq(market::Market, demand::Function; maxit::Int=100)
p = rand(length(market.qualities))
for i in 1:maxit
p = max.(0, p + .5 * (demand(market, p) - market.capacities) / i^0.001)
# @show p
end
return p
end
normexcessdemand(market, cutoffs) = norm(demand(market, cutoffs) - market.capacities)
normexcessdemand (generic function with 1 method)
Here is a sample market with two schools and two dimensions of student quality. Test that the demand function is working with some random cutoffs. Looks like school 1 is getting too few students, and school 2 is getting too many—disequilibrium.
const savannah = Market([1.1, 0.4] , # Preferability of each school
[.2 .8; # School assessment functions
.6 .4] ,
[.4, .2] # Capacity of each school
)
demand(savannah, [.6, .2]) - savannah.capacities
2-element Vector{Float64}: -0.1494295854369378 0.4660962521036042
Here are the "real" market-clearing cutoffs. We can run 500 iterations of tâtonnement because this market is very small; this will not be the case for a larger market. Observe that the demand matches the capacity of the schools.
p̂ = quickeq(savannah, demand; maxit=500)
@show p̂
demand(savannah, p̂)
p̂ = [0.4935991474928613, 0.5495075370019941]
2-element Vector{Float64}: 0.4 0.1999999999999999
To show visually that the equilibrium is unique, plot the norm of the excess demand vector in cutoff space. Since the total capacity is less than one, both schools will be completely full at equilibrium, and the equilibrium coincides with the minimum of this function, when the norm is zero.
(The "ridges" in the graph are the artifact of a known bug in Polyhedra.jl, which is used to compute the volume of the polyhedra described above. Since these ridges do not occur at the equilibrium, they do not affect our computations.)
function plot_doer()
xs = ys = 0:0.01:1
pl = plot(xlim=(0, 1), ylim=(0, 1), camera=(60, 45), size=(570, 500), #background_color=:gray5,
xlabel="Cutoff at 1", ylabel="Cutoff at 2")
plot!(pl, xs, ys,
(x, y)->normexcessdemand(savannah, [x, y]),
st=:contourf, color=:summer, lc=:black, lw=.5, title="Norm of excess demand")
scatter!(pl, [p̂[1]], [p̂[2]], c=:black, label="Equilibrium", marker=:X)
return pl
end
norm_excess_demand = plot_doer()
Since the excess demand is nonzero everywhere except at the equilibrium, minimizing the norm of the excess demand is equivalent to finding the cutoffs. So, let's try to create a surrogate for the norm of the excess demand function and minimize it using Surrogates.jl.
function surrogatenormexcessdemand(market::Market; n_samples::Int=35)
m = length(market.qualities)
lb = zeros(m)
ub = ones(m)
ps = sample(n_samples, lb, ub, SobolSample())
zs = [normexcessdemand(market, collect(p)) for p in ps]
return RadialBasis(ps, zs, lb, ub)
end
surr = surrogatenormexcessdemand(savannah)
(::RadialBasis{Surrogates.var"#1#2", Int64, Vector{Tuple{Float64, Float64}}, Vector{Float64}, Vector{Float64}, Vector{Float64}, Vector{Float64}, Float64, Bool}) (generic function with 1 method)
We built the surrogate using 35 samples, and we will let surrogate_optimize()
collect 5 more, for a total cost of 40 calls to demand()
.
p_surropt40 = surrogate_optimize(p->normexcessdemand(savannah, collect(p)),
DYCORS(), # Fiddle with this; DYCORS hangs sometimes
[0, 0],
[1, 1],
surr,
SobolSample(),
num_new_samples=5)[1] |> collect
2-element Vector{Float64}: 0.49368421723188044 0.5495630287863199
demand(savannah, p_surropt40)
2-element Vector{Float64}: 0.3999239816657382 0.19996856833794543
Let's see how well tâtonnement does with 40 evaluations:
p_tat40 = quickeq(savannah, demand; maxit=40)
demand(savannah, p_tat40)
2-element Vector{Float64}: 0.40000000030304833 0.20000000032769302
The error is several orders of magnitude smaller when using plain tâtonnement and the same number of demand computations. Hence, we will not pursue this approach further.
Instead, let's try to approximate the demand function directly, which is vector-valued. Here is a plot of each of its components. In this view, it looks like the equilibrium is just at a "random" point in the middle, but if you inspect the colorbar, you can see that the demand for each school matches its capacity, meaning the school has no incentive to raise or lower its cutoff.
function plot_doer()
xs = ys = 0:0.02:1
pl = plot(xlim=(0, 1), ylim=(0, 1), camera=(60, 45), size=(1200, 500), #background_color=:gray5,
xlabel="Cutoff at 1", ylabel="Cutoff at 2", layout=2, margin=30px)
plot!(pl, xs, ys,
(x, y)->demand(savannah, [x, y])[1],
st=:contourf, color=:viridis, subplot=1, lc=:black, lw=.5, title="Demand at 1", legend=nothing)
plot!(pl, xs, ys,
(x, y)->demand(savannah, [x, y])[2],
st=:contourf, color=:viridis, subplot=2, lc=:black, lw=.5, title="Demand at 2")
for i in 1:2 scatter!(pl, [p̂[1]], [p̂[2]], c=:black, label="Equilibrium", marker=:X, subplot=i) end
return pl
end
savannah_excess_demand = plot_doer()
Build a surrogate. I will use the radial basis form, which is advertised as an efficient choice for vector-valued functions like this one.
function surrogatedemand(market::Market, demand::Function; n_samples::Int=40)
m = length(market.qualities)
lb = zeros(m)
ub = ones(m)
ps = sample(n_samples, lb, ub, SobolSample())
zs = [demand(market, collect(p)) for p in ps]
return RadialBasis(ps, zs, lb, ub)
end
surrogatedemand (generic function with 1 method)
surr = surrogatedemand(savannah, demand)
(::RadialBasis{Surrogates.var"#1#2", Int64, Vector{Tuple{Float64, Float64}}, Vector{Vector{Float64}}, Vector{Float64}, Vector{Float64}, Matrix{Float64}, Float64, Bool}) (generic function with 1 method)
Compare the surrogate with the actual function plotted above.
function plot_doer()
xs = ys = 0:0.01:1
pl = plot(xlim=(0, 1), ylim=(0, 1), camera=(60, 45), size=(1200, 500), #background_color=:gray5,
xlabel="Cutoff at 1", ylabel="Cutoff at 2", layout=2, margin=30px)
plot!(pl, xs, ys,
(x, y)->surr([x, y])[1],
st=:contourf, color=:winter, subplot=1, lc=:black, lw=.5, title="Demand at 1 (surrogate)", legend=nothing)
plot!(pl, xs, ys,
(x, y)->surr([x, y])[2],
st=:contourf, color=:winter, subplot=2, lc=:black, lw=.5, title="Demand at 2 (surrogate)")
for i in 1:2 scatter!(pl, [p̂[1]], [p̂[2]], c=:black, label="Equilibrium", marker=:X, subplot=i) end
return pl
end
savannah_surr_demand = plot_doer()
Let's try using the surrogate to search for the equilibrium.
# Make a version that accepts (and ignores) market so we can use the functions above
surr_(market, cut) = surr(cut)
surr_ (generic function with 1 method)
p_surrdem40 = quickeq(savannah, surr_)
2-element Vector{Float64}: 0.49635827665338983 0.5578961053743474
demand(savannah, p_surrdem40)
2-element Vector{Float64}: 0.3994736543013412 0.19190437740859856
This is even worse than the norm-approximation approach.
Let's see what happens if we use even fewer sample points.
surr = surrogatedemand(savannah, demand, n_samples=15)
p_surrdem15 = quickeq(savannah, surr_)
demand(savannah, p_surrdem15)
2-element Vector{Float64}: 0.38923162690009205 0.17763448480492638
p_tat15 = quickeq(savannah, demand, maxit=15)
demand(savannah, p_tat15)
2-element Vector{Float64}: 0.39995895845008295 0.19995562941348669
Again, tâtonnement produces an even better solution.
for p in [p_surropt40, p_surrdem40, p_tat40, p_surrdem15, p_tat15]
@show p, demand(savannah, p)
end
(p, demand(savannah, p)) = ([0.49368421723188044, 0.5495630287863199], [0.3999239816657382, 0.19996856833794543]) (p, demand(savannah, p)) = ([0.49635827665338983, 0.5578961053743474], [0.3994736543013412, 0.19190437740859856]) (p, demand(savannah, p)) = ([0.49359914709764924, 0.5495075365746425], [0.40000000030304833, 0.20000000032769302]) (p, demand(savannah, p)) = ([0.5102903939006787, 0.5751249738116424], [0.38923162690009205, 0.17763448480492638]) (p, demand(savannah, p)) = ([0.4936526659410213, 0.5495654005244548], [0.39995895845008295, 0.19995562941348669])
When there are more tests and more schools, the demand function behaves less regularly, so perhaps the surrogate approach will win out. Let's try a market with 10 schools and 3 dimensions of student quality, which is about where the demand function takes several seconds to compute.
function market_maker()
m = Market(rand(10), rand(10, 2), rand(10))
m.blends ./= sum(m.blends, dims=2)
m.capacities ./= (1.2 * sum(m.capacities))
return m
end
const calgary = market_maker()
Market([0.16333014909028987, 0.4015746274721341, 0.9725665288155467, 0.49047141379235715, 0.34656874905283885, 0.8947037747521398, 0.4191281034859835, 0.013600856977067588, 0.05059132030944369, 0.8501736850989479], [0.4729753110705832 0.5270246889294168; 0.6762106435791608 0.3237893564208392; … ; 0.6566526947280408 0.3433473052719592; 0.5409020866181818 0.45909791338181816], [0.07297423362589076, 0.09700670678807741, 0.13447551122187815, 0.05990632585326526, 0.08298426139429375, 0.11851810450728244, 0.022341699795939567, 0.04265566768614191, 0.09195985031282516, 0.11051097214773907])
surr = surrogatedemand(calgary, demand, n_samples=30)
p̄ = quickeq(calgary, surr_)
@show p̄
p̄ = [1.5936953719031515, 1.3583374783291244, 1.158068527095798, 1.7386316510269695, 1.1160638972934298, 1.2510581802640908, 1.6945671120725774, 1.085621010478172, 0.6241267709291565, 0.930492789351918]
10-element Vector{Float64}: 1.5936953719031515 1.3583374783291244 1.158068527095798 1.7386316510269695 1.1160638972934298 1.2510581802640908 1.6945671120725774 1.085621010478172 0.6241267709291565 0.930492789351918
Let's clamp these cutoffs to $[0,1]$ and check the demand.
p_surrdem30 = min.(1, max.(0, p̄))
demand(calgary, p_surrdem30) - calgary.capacities
10-element Vector{Float64}: -0.07297423362589076 -0.09700670678807741 -0.13447551122187815 -0.05990632585326526 -0.08298426139429375 -0.11851810450728244 -0.022341699795939567 -0.04265566768614191 0.212299660837585 -0.1038000447782328
What if we just ran 30 iterations of tâtonnement instead of using 30 samples to build an approximation?
p_tat30 = quickeq(calgary, demand, maxit=30)
demand(calgary, p_tat30) - calgary.capacities
10-element Vector{Float64}: -0.00292243479759792 -0.00295276260728973 -0.004481175469125215 -0.006384545779585499 -0.004582294180795865 -0.0029756614273587223 0.0009647166683316409 -0.001299175444104829 -0.003244550093890089 -0.001472431728600218
Again, plain old tâtonnement is clearly the superior method here.
The Surrogates.jl docs break the surrogate optimization method down into three steps:
- Sample selection
- Construction of the surrogate model
- Surrogate optimization
The market equilibrium problem is not formally an optimization problem; it is a root-finding problem, and it is known that under light conditions on the demand functions, a tâtonnement process that iterates by moving in the direction of the demand converges linearly toward the market-clearing solution.
In instances like those given here, where the total capacity of the schools is less than the number of students and therefore the excess demand is exactly zero at and only at the equilibrium, we can coerce the problem into an optimization form by minimizing the norm of the excess demand function. Using this trick, I tried the surrogate optimization approach, but it turned out not to be very cost effective. The number of sample points required simply to build a workable surrogate is much higher than the number needed to find a fairly precise equilibrium by tâtonnement, and the surrogate optimizer requires additional demand evaluations beyond that.
Sidestepping the optimization problem (point 3), I tried developing a surrogate for the excess demand vector itself, and applied the tâtonnement procedure to this surrogate instead of the original. This yielded even poorer results than the norm approach.
The superiority of the tâtonnement procedure was even clearer in a larger market.
Theoretically, the reason that direct evaluation is worth the cost in equilibrium problems is that the excess demand vector is an improvement direction. It provides not only information about the "status" of the market at the given prices, but points in the direction of more efficient prices.
In cases where the demand is truly computationally intractable, the best way to approximate the market-clearing cutoffs remains the one suggested by Azevedo and Leshno (2016): generate a large, discrete instance of the admissions market, run the student-proposing and school-proposing deferred acceptance algorithm, and take the average of the realized cutoffs.
Azevedo, Eduardo M. and Jacob D. Leshno. 2016. “A Supply and Demand Framework for Two-Sided Matching Markets.” Journal of Political Economy 124, no. 5 (Sept.): 1235–68.