This notebook is part of the supplementary material for:
Genewein T., Leibfried F., Grau-Moya J., Braun D.A. (2015) Bounded rationality, abstraction and hierarchical decision-making: an information-theoretic optimality principle, Frontiers in Robotics and AI.
More information on how to run the notebook can be found in the accompanying GitHub repository, where you can also find updated versions of the code and notebooks.
This notebook corresponds to Section 3 and reproduces Figures 5, 6, 7 and 8 of the paper.
Note that the notebook is not limited to the serial case as it computes the serial-case solutions through the general case by setting the inverse temperatures accordingly. The parallel or general case can thus be easily reproduced with different settings of the $\beta$-values.
Here, we design a two-stage serial information processing system that first processes a world-state $w$ into an internal percept $x$ and then uses this percept to compute an action $a$. We assume that the action is conditionally independent of the world-state given the percept. This simply means that there is no direct flow of information from $W$ to $A$ and corresponds to the following graphical model $$ W \rightarrow X \rightarrow A$$ that leads to $p(w,x,a)=p(w)p(x|w)p(a|x)$.
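As a quick sanity check of this factorization, the joint can be assembled from its three factors. The following NumPy sketch (Python for illustration only; the notebook's own code is Julia) uses toy cardinalities that are assumptions, not values from the paper:

```python
import numpy as np

# Toy sizes for W, X, A (assumptions for illustration only)
nw, nx, na = 3, 4, 2
rng = np.random.default_rng(0)

p_w = rng.random(nw); p_w /= p_w.sum()                    # prior p(w)
p_xgw = rng.random((nx, nw)); p_xgw /= p_xgw.sum(axis=0)  # p(x|w), columns sum to 1
p_agx = rng.random((na, nx)); p_agx /= p_agx.sum(axis=0)  # p(a|x), columns sum to 1

# Serial factorization p(w,x,a) = p(w) p(x|w) p(a|x):
# the action depends on w only through the percept x
p_wxa = p_w[:, None, None] * p_xgw.T[:, :, None] * p_agx.T[None, :, :]
```

Summing the joint over all three variables recovers 1, and marginalizing out $x$ yields the overall behavior $p(a|w)$ as the matrix product of the two channels.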
Each stage of the information processing system is characterized by an information-theoretic channel.
We want to maximize the utility function $U(w,a)$ while at the same time not exceeding certain limits on the information processing rate of each stage. This can be formalized with the following variational problem $$\underset{p(x|w),p(a|x)}{\operatorname{arg~max}}~\mathbf{E}_{p(w,x,a)}[U(w,a)] - \frac{1}{\beta_1} I(W;X) - \frac{1}{\beta_2} I(X;A)$$
The solution to the optimization problem is given by $$\begin{align} p^*(x|w)&=\frac{1}{Z(w)}p(x)\exp\left(\beta_1 \Delta F_{\text{ser}}(w,x)\right)\\ p(x)&=\sum_w p(w)p^*(x|w)\\ p^*(a|x)&=\frac{1}{Z(x)} p(a) \exp \left(\beta_2 \sum_w p(w|x) U(w,a) \right)\\ p(a)&=\sum_{w,x} p(w)p^*(x|w)p^*(a|x), \end{align}$$ where $Z(w)$ and $Z(x)$ denote the corresponding normalization constants or partition sums. The conditional probability $p(w|x)$ is given by Bayes' rule $p(w|x) = \frac{p(w)p^*(x|w)}{p(x)}$ and $$\Delta F_{\text{ser}}(w,x):=\mathbf{E}_{p^*(a|x)}[U(w,a)] - \frac{1}{\beta_2} D_{\mathrm{KL}}(p^*(a|x)||p(a))$$ is the free energy difference of the action stage.
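Since these four equations depend on each other, they are solved by iterating them to convergence, Blahut-Arimoto style (the notebook does this via `threevarBAiterations`). The following NumPy sketch (Python for illustration; the notebook itself is Julia) shows one possible such fixed-point iteration for the serial case. The random initialization, iteration count, and the choice $|X|=|W|$ are assumptions of this sketch, not part of the paper's specification:

```python
import numpy as np

def serial_fixed_point(U, p_w, beta1, beta2, iters=300):
    """Iterate the four self-consistent equations of the serial case.

    U[a, w] is the utility, p_w the prior; returns p(x|w), p(a|x), p(x).
    The percept alphabet size |X| = |W| is an assumption of this sketch."""
    na, nw = U.shape
    nx = nw
    rng = np.random.default_rng(1)
    p_xgw = rng.random((nx, nw)); p_xgw /= p_xgw.sum(axis=0)  # random init
    p_agx = np.full((na, nx), 1.0 / na)                       # uniform init
    for _ in range(iters):
        p_x = p_xgw @ p_w                        # p(x) = Σ_w p(w) p*(x|w)
        p_wgx = (p_xgw * p_w).T / p_x            # Bayes' rule, shape (nw, nx)
        p_a = p_agx @ p_x                        # p(a) = Σ_x p(x) p*(a|x)
        # action stage: p*(a|x) ∝ p(a) exp(β2 Σ_w p(w|x) U(w,a))
        p_agx = p_a[:, None] * np.exp(beta2 * (U @ p_wgx))
        p_agx /= p_agx.sum(axis=0)
        # free energy difference ΔF_ser(w,x) = E_{p(a|x)}[U(w,a)] - KL/β2
        kl = np.sum(p_agx * np.log(p_agx / p_a[:, None]), axis=0)
        dF = p_agx.T @ U - kl[:, None] / beta2   # shape (nx, nw)
        # perception stage: p*(x|w) ∝ p(x) exp(β1 ΔF_ser(w,x))
        p_xgw = p_x[:, None] * np.exp(beta1 * dF)
        p_xgw /= p_xgw.sum(axis=0)
    return p_xgw, p_agx, p_xgw @ p_w

# Toy matching problem: U(w,a) rewards choosing a = w
U = np.eye(2)
p_xgw, p_agx, p_x = serial_fixed_point(U, np.array([0.5, 0.5]), beta1=5.0, beta2=5.0)
```

Note that the perception stage explicitly uses the action-stage quantities through $\Delta F_{\text{ser}}$, which is exactly the coupling between perception and action discussed below.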
Interestingly, the solutions above can be used for constructing bounded-optimal perception-action systems. Perception-action systems need to perform inference over a (hidden) world-state $w$ and then use the gathered information $x$ to drive a decision-maker. Classically, inference and decision-making are separated: $$\begin{align} &\text{Bayesian inference:} & p(w|x)&=\frac{p(w)p(x|w)}{\sum_w p(w)p(x|w)} \\ &\text{Bounded rational decision:} & p^*(a|x)&=\frac{1}{Z(x)}p(a)e^{\beta_2 U(x,a)} \end{align}$$ where the last equation describes a bounded rational decision-maker as in Section 2 of the paper. The problem is that in such a case there is no principled way to specify the likelihood model $p(x|w)$. Often it is chosen to represent the world-state as faithfully as possible.
Now compare the two equations for Bayesian inference and bounded rational decision-making with the four equations that we obtained as the solution to our variational problem. Bayesian inference and bounded rational decision-making drop out naturally; additionally, the bounded-optimal likelihood model $p^*(x|w)$ is also well-defined. Interestingly, it does not try to represent $w$ as faithfully as possible but rather tries to optimize the downstream free-energy trade-off of the action-part of the system. This leads to a tight coupling between action and perception, which will be illustrated in the example below.
Below, we compare the bounded-optimal perception-action system against a system that uses a handcrafted likelihood $p_\lambda(x|w)$. The model has a tunable precision-parameter $\lambda$ that affects the information processing rate of the perceptual channel $I(X;W)$ - this parameter corresponds to the level of perceptual noise and influences the likelihood of "confusing" one particular $w$ with similar ones.
The observation or percept $x$ is a discretized noisy version of $w$ with precision $\lambda$:
$$x|w,\lambda \sim \text{round}(\mathcal{N}(w,1/\lambda))$$
With this construction it is already obvious that neither the action-part of the system nor the utility function has any influence on the perceptual part of the system. We will highlight this further in the example below.
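For concreteness, the rounding construction can be written out as a discretized Gaussian: each percept bin $x$ collects the probability mass of $\mathcal{N}(w, 1/\lambda)$ that rounds to it. The sketch below (Python/NumPy for illustration; the notebook's own implementation is the Julia function `pogw_handcrafted`) assumes a unit-spaced grid with the edge bins absorbing the tails; these binning details are assumptions of the sketch:

```python
import numpy as np
from math import erf, sqrt

def pxgw_rounded_gaussian(x_values, w_values, lam):
    """p_λ(x|w) for x = round(N(w, 1/λ)) on a unit-spaced grid of x-values."""
    Phi = lambda z: 0.5 * (1.0 + erf(z / sqrt(2.0)))   # standard normal CDF
    sd = 1.0 / sqrt(lam)                               # noise standard deviation
    P = np.zeros((len(x_values), len(w_values)))
    for j, w in enumerate(w_values):
        for i, x in enumerate(x_values):
            lo = -np.inf if i == 0 else x - 0.5        # edge bins absorb the tails
            hi = np.inf if i == len(x_values) - 1 else x + 0.5
            P[i, j] = Phi((hi - w) / sd) - Phi((lo - w) / sd)
    return P   # each column (fixed w) sums to 1
```

Large $\lambda$ concentrates almost all mass on $x=\text{round}(w)$, while small $\lambda$ spreads the mass across neighbouring bins; this is exactly the precision/rate knob described above.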
#only run this once
include("RateDistortionDecisionMaking.jl")
using RateDistortionDecisionMaking, Distances, DataFrames, Colors, Gadfly, Distributions, Interact, Reactive
#make the default plot size a bit larger
set_default_plot_size(15cm, 12cm)
To illustrate perception-action systems we design the following example. The agent is an animal and it encounters nine other kinds of animals that come in three different size-groups - each group has three members.
The agent has sensors to observe $x$ which is a noisy version of $w$. The rate $I(W;X)$ governs how precise or noisy the percept is (for the hand-crafted model the precision is directly controlled by $\lambda$). Medium precision mostly leads to within-group confusion, low precision $\lambda$ leads to strong across-group confusion.
The agent can hunt the animals of the small and medium-sized groups for food but it must flee from the large animals as it could fall prey to them. Alternatively, you can change the utility function to a mating scenario, where the agent must try to find an animal of a specific size within the medium-sized group to mate with (see the description below the plot of the utility function for more details).
... change the utility function by commenting/uncommenting in the code-block below. If all other parameters are kept the same, we can observe how the bounded optimal percept $p^*(x|w)$ is coupled to the utility function.
#set up predator-prey example
include("PredatorPreyExample.jl")
###### Predator-prey scenario #############
#same utility as used in the main paper
w_values, w_strings, a_values, a_strings, p_w, U = setup_predator_prey_example()
###### Mating scenario ####################
#alternative utility - mating scenario
#w_values, w_strings, a_values, a_strings, p_w, U = setup_predator_prey_example(mating_utility=true)
numa = length(a_strings)
a_vec = collect(1:numa)
numw = length(w_strings)
w_vec = collect(1:numw)
#pre-compute utility
U_pre, Umax = setuputilityarrays(a_values,w_values,U)
#visualize utility
plt_utility = visualizeMatrix(U_pre, w_values, a_values, w_strings, a_strings, xlabel="animal size w",
ylabel="Action a", legendlabel="U(a,w)")
Compared to the generic sneak-up and ambush patterns, the specific patterns require more computational resources, that is, they require a larger rate on the action-channel; in other words, they require good motor skills and good motor hardware (which is more costly than a cheap motor system).

If you have poor motor hardware, you will never be able to use the specific hunting patterns, so it does not make sense to waste resources on an expensive perceptual system that allows you to precisely distinguish individual animals - a system that allows you to tell the groups apart suffices. Conversely, if your perceptual system does not allow you to distinguish between the individual animals, it would be wasteful to have an expensive motor system that allows you to accurately execute very specific hunting patterns.
You are a medium-sized animal (w=6) and try to find a mate - potential mates are w=8.
Below, the hand-crafted perceptual model can be inspected by changing the precision parameter $\lambda$. Note that this will also directly influence the rate $I(W;X)$.
#set up hand-crafted likelihood model and p(x|w)
x_values = collect(1:13)
x_vec = collect(1:length(x_values))
x_strings = map((x)->string(x), x_values)
#slider for selecting λ
λ_vals = 0.1:0.1:5
slider_l = slider(λ_vals,label="Perceptual precision λ")
#use lift to connect the actual plotting-code to the slider
plt_pxgw_sl = lift(λ_vis->begin
p_xgw_hc = pogw_handcrafted(x_values, w_values, λ_vis)
visualizeBAconditional(p_xgw_hc, x_vec, w_vec, x_strings, w_strings,
wlabel="animal size w", alabel="observed size x", legendlabel="p(x|w,λ)")
end, slider_l)
display(slider_l)
display(plt_pxgw_sl)
...drag the slider above to change the perceptual precision and see how observations get more and more "blurry" with decreasing precision. Note that low perceptual precision incurs a low mutual information $I(X;W)$, whereas high precision incurs a large mutual information between the actual size and the observed size.
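The rate mentioned here is the mutual information $I(X;W)=\sum_{w,x}p(w)\,p(x|w)\log_2\frac{p(x|w)}{p(x)}$. A minimal NumPy sketch (Python for illustration; the notebook computes this internally in Julia) of this quantity:

```python
import numpy as np

def mutual_information_bits(p_w, p_xgw):
    """I(X;W) in bits for a channel p_xgw[x, w] and prior p_w."""
    p_x = p_xgw @ p_w                  # marginal p(x)
    joint = p_xgw * p_w                # joint p(x, w)
    mask = joint > 0                   # skip zero-probability terms (0 log 0 = 0)
    p_x_full = np.broadcast_to(p_x[:, None], joint.shape)
    return float(np.sum(joint[mask] * np.log2(p_xgw[mask] / p_x_full[mask])))
```

A noiseless channel over $n$ equiprobable world-states attains the maximum of $\log_2 n$ bits, while a channel whose output ignores $w$ attains 0 bits - the "blurry" limit of the slider.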
If you see the slider but not the plot and a "Javascript error adding output!" message, try moving the slider once; the plot should then appear. If this does not work (i.e. the kernel is busy with this cell but nothing ever happens), try checking out the most current version of Interact with Pkg.checkout("Interact")
in a Julia console. You can undo this and go back to the latest released version with Pkg.free("Interact")
When changing the parameters make sure to re-run the corresponding cells below - in case of doubt simply re-run all cells below.
$\lambda=1.65$, $\beta_1 = 8$, $\beta_2 = 10$ (same $I(X;W)$ for both models - almost unbounded actor)
$\lambda=1.65$, $\beta_1 = 8$, $\beta_2 = 1$ (same EU - bounded actor, optimal perception changes, given the limits on the action channel)
$\lambda=0.4$, $\beta_1 = 1$, $\beta_2 = 10$ (same $I(X;W)$ for both models - bounded actor, optimal action changes, given the limits on the perceptual channel)
$\lambda=10$, $\beta_1 = 10$, $\beta_2 = 5$, mating-utility (optimal perception changes under a different utility function - compare against case above with identical parameters but different utility function)
Change the temperature on the action-channel (with very large resources on the observation channel) and see how the perception is affected by the action.
Change the temperature on the perception-channel (with very large resources on the action channel) and see how the action is affected by perception.
Change the utility function (mating-utility) and see how the percept is affected by the new utility.
#precision of the hand-crafted perceptual model
λ = 1.65 #0.4 #1.65
#the "price" for I(X;W) in the handcrafted case will be β1 (as given below)
#to make the two cases comparable. If you want I(X;W) to be the same in both
#cases, set β1 and run the sequential case and then tune λ until the mutual
#information terms are equal.
#
#Note that β2 will also be used for the bounded rational decision-maker that
#uses the hand-crafted perception. This should allow for easy comparison of
#the action-channels in both cases (hand-crafted vs. sequential case)
#inverse temperatures for sequential case
#β1: perceptual channel -> price for I(X;W)
#β2: action channel -> price for I(A;X)
#β3=0 sequential case (otherwise the general three-variable case is specified)
β1 = 8 #1
β2 = 10
β3 = 0;
Note that the simulation implements the solutions of the general case. By setting $\beta_3=0$, the serial case is recovered (which corresponds to the setting in the paper). However, any of the other cases (parallel or general) can be simulated by simply setting the inverse temperatures accordingly (in the code-cell above).
The code cells below simulate a bounded-rational decision-maker using the hand-crafted perceptual model and visualize the solution.
#compute hand-crafted perception, then feed it into bounded rational decision-maker
#with inv. temperature β2 and compute optimal p(a|x) using hand-crafted p(x|w)
p_xgw_hc = pogw_handcrafted(x_values, w_values, λ)
numx = length(x_vec)
numa = length(a_vec)
#compute expected utility under posterior over w given x
#
#posterior p(w|x)
p_wgx = zeros(numw, numx)
for k in 1:numx
p_wgx[:,k] = p_xgw_hc[k,:]' .* p_w
p_wgx[:,k] += eps() #add small epsilon to prevent numerical problems for large λ
p_wgx[:,k] /= sum(p_wgx[:,k])
end
#U(a,x)=∑_w p(w|x)U(a,w)
U_ax = zeros(numa, numx)
for k in 1:numx
U_ax[:,k] = U_pre * p_wgx[:,k]
end
#p(x)=∑_w p(w)p(x|w)
p_x_hc = p_xgw_hc * p_w
#make sure that p(x) has non-zero entries (otherwise numerical problems arise in I(X;W))
p_x_hc += eps()
p_x_hc /= sum(p_x_hc)
#solve the decision-making part with one-step Blahut-Arimoto
ε = 0.0001 #convergence criterion for BAiterations
maxiter = 10000 #maximum number of BA iterations
#initialize p(a) uniformly
num_acts = length(a_vec)
pa_init = rand(num_acts)
pa_init /= sum(pa_init)
#BA iterations
p_agx_hc, p_a_hc, perf_df_hc = BAiterations(pa_init, β2, U_ax, p_x_hc, ε, maxiter,
compute_performance=true, performance_as_dataframe=true)
; #suppress output
#visualize solution p(x|w), p(a|x), p(a|w)
#p(x|w)
plt_pxgw_hc = visualizeBAconditional(p_xgw_hc, x_vec, w_vec, x_strings, w_strings,
wlabel="animal size w", alabel="observed size x", legendlabel="p(x|w,λ)")
#p(a|x)
plt_pa_hc, plt_pagx_hc = visualizeBAsolution(p_a_hc, p_agx_hc, a_vec, x_vec, a_strings, x_strings,
wlabel="observed size x", alabel="action a",
legendlabel_marginal="p(a)", legendlabel_conditional="p(a|x,λ)",
suppress_vis=true)
#compute p(a|w)
#p(a|w) = ∑_x p(x|w)p(a|x)
p_agw_hc = p_agx_hc * p_xgw_hc
plt_pagw_hc = visualizeBAconditional(p_agw_hc, a_vec, w_vec, a_strings, w_strings,
wlabel="animal size w", alabel="action a", legendlabel="p(a|w,λ)")
#visualize p(x|w)
display(plt_pxgw_hc)
#visualize p(a|x)
display(plt_pagx_hc)
#visualize p(a|w)
display(plt_pagw_hc)
The plots above show $p_\lambda(x|w)$, $p_\lambda(a|x)$ and the overall behavior $p_\lambda(a|w)$ respectively.
Below we compute the information terms of the hand-crafted solution and visualize them.
Note that in the legends below $O$ is used instead of $X$.
#compute p(a|x,w) = p(a|x) ∀w
p_agxw_hc = zeros(numa, numx, numw)
for j in 1:numw
p_agxw_hc[:,:,j] = p_agx_hc
end
#compute performance measures of solution
res_hc = analyze_three_var_BAsolution(p_w, p_x_hc, p_a_hc, p_xgw_hc, p_agx_hc, p_agxw_hc, p_agw_hc, U_pre, β1, β2, β3)
perf_df_hc = performancemeasures2DataFrame(res_hc...)
#visualize
p_MI, p_composed, p_perf = plot_three_var_performancemeasures(perf_df_hc, maximum(U_pre), β1, β2, β3)
display(hstack(p_MI, p_composed))
display(p_perf)
The code below simulates and visualizes a bounded optimal perception action system.
ε = 0.0001 #convergence criterion for BAiterations
maxiter = 10000 #maximum number of BA iterations
#This function performs Blahut-Arimoto iterations for the three-variable general case
px, pa, pxgw, pagx, pagxw, pagw, performance_df = threevarBAiterations(numx, β1, β2, β3, U_pre, p_w, ε, maxiter,
compute_performance=true, performance_as_dataframe=true,
init_pogw_sparse = true, init_pogw_uniformly = false,
init_pagow_uniformly = true);
#call the routine that creates plots for all the probability distributions
plt_px, plt_pa, plt_pxgw, plt_pagx, plt_pagw, dpdown, plt_pagow_vis = visualize_three_var_BAsolution(px, pa,
pxgw, pagx, pagxw, pagw,
x_vec, a_vec, w_vec,
x_strings, a_strings, w_strings,
olabel_string="x", alabel_string="a", wlabel_string="w")
#visualize p(x|w)
display(plt_pxgw)
#visualize p(a|x)
display(plt_pagx)
#visualize p(a|w)
display(plt_pagw)
The plots above show the bounded optimal solutions $p^*(x|w)$, $p^*(a|x)$ and the overall behavior $p^*(a|w)$ respectively.
Below we compute the information terms of the bounded-optimal solution and visualize them.
Note that in the legends below $O$ is used instead of $X$.
p_MI, p_composed, p_perf = plot_three_var_performancemeasures(performance_df, maximum(U_pre), β1, β2, β3)
display(hstack(p_MI, p_composed))
display(p_perf)
Rearrange the plots and compare the most interesting quantities for the paper.
#compare both solutions
#compare performance measures, EU, J, I(A;W)
util_all = [perf_df_hc[end,:E_U], performance_df[end,:E_U],
perf_df_hc[end,:Objective_value], performance_df[end,:Objective_value]]
color_label = ["E[U]_hc", "E[U]", "J_hc", "J"]
colors = BAdiscretecolorscale(4).f(4)
colscale = Scale.color_discrete_manual(colors[1],colors[1],colors[end],colors[end])
p_perf_comp = plot(x=color_label, y=util_all, color=color_label, Geom.bar(),
Guide.ylabel("[utils]"), Guide.xlabel(""), Guide.colorkey(""),
Guide.title("λ=$λ vs. β1=$β1, β2=$β2, β3=$β3"),
colscale, Scale.y_continuous(minvalue=0, maxvalue=maximum(U_pre)),
BAtheme(key_position = :none))
MI_comp = [perf_df_hc[end,:I_ow], performance_df[end,:I_ow],
perf_df_hc[end,:I_ao], performance_df[end,:I_ao]]
color_label = ["I(X;W)_hc", "I(X;W)", "I(A;X)_hc", "I(A;X)"]
colors = Scale.color_discrete_hue().f(4)
colscale = Scale.color_discrete_manual(colors[1],colors[1],colors[2],colors[2])
p_perf_comp_MI = plot(x=color_label, y=MI_comp, color=color_label, Geom.bar(),
Guide.ylabel("[bits]"), Guide.xlabel(""), Guide.colorkey(""),
Guide.title("λ=$λ vs. β1=$β1, β2=$β2, β3=$β3"),
colscale, Scale.y_continuous(minvalue=0), BAtheme(key_position = :none))
display(hstack(p_perf_comp,p_perf_comp_MI))
#compare final behavior p(a|w) of both cases
display(plt_pagw_hc)
display(plt_pagw)
#compare perceptual model p(x|w) of both cases
display(plt_pxgw_hc)
display(plt_pxgw)
#generate plots for comparison
#display(plt_utility)
#draw(SVG("Figures/SequentialComparison_Utility.svg", 10cm, 8cm), plt_utility) #uncomment to store figure
#plot_final = vstack(hstack(plt_pxgw_hc, plt_pagw_hc),hstack(plt_pxgw, plt_pagw))
#plot_bars = (vstack(p_perf_comp,p_perf_comp_MI))
#display(plot_final)
#display(plot_bars)
#draw(SVG("Figures/SequentialComparison_3.svg", 18cm, 12cm), plot_final) #uncomment to store figure#
#draw(SVG("Figures/SequentialComparisonBars_3.svg", 8cm, 12cm), plot_bars) #uncomment to store figure