Using bus engine replacement model as an analogy, in this exercise you’ll continue developing the inventory model from the first lecture on dynamic programming (video 27), and extend it to the infinite horizon case with stochastic demand.
Recall the inventory management problem in discrete time and finite horizon $ t=0,\dots,T $
The notation is:
The sales in period $ t $ are given by $ s_t = \min\{x_t,d_t\} $.
Inventory to be stored till next period is given by $ k_t = \max\{x_t-d_t,0\} + q_t = x_{t+1} $.
The profit in period $ t $ is given by
$$ \begin{eqnarray} \pi_t & = & p \cdot \text{sales}_t - r \cdot x_{t+1} - c \cdot (\text{order made in period }t) \\ & = & s_t p - k_t r - c \mathbb{1}\{q_t>0\} \end{eqnarray} $$Assuming all $ q_t \ge 0 $, let $ \sigma = \{q_t\}_{t=1,\dots,T} $ denote a feasible inventory policy.
If $ d_t $ is stochastic the policy becomes a function of the period $ t $ inventory $ x_t $.
The expected profit maximizing problem is given by
$$ {\max}_{\sigma} \mathbb{E}\Big[ \sum_{t=0}^{T} \beta^t \pi_t \Big], $$where $ \beta $ is discount factor.
The expectation in the Bellman equation is taken over the distribution of the next period demand $ d_{t+1} $, which according to the problem is independent of any other variables (idiosyncratic), thus the conditioning on $ (x_t,d_t,q_t) $ can be meaningfully dropped. Expectation can be written as an integral over the distribution of demand $ F(d) $.
$$ V(x_t,d_t) = \max_{q_t \ge 0} \Big\{ s_t p - k_t r - c \mathbb{1}\{q_t>0\} + \beta \int V\big( k_t, d \big) \partial F(d) \Big\} $$Because we’ll be solving the problem in infinite horizon, the time subscripts can be dropped, and we can just have current period variables $ x,d,q,s,k $, and next period variables denoted by prime, i.e. $ x' $
The Bellman equation is then
$$ \begin{eqnarray} V(x,d) &=& \max_{q \ge 0} \Big\{ \pi + \beta \mathbb{E}\Big[ V\big(x', d' \big) \Big| x,d,q \Big] \Big\} \\ &=& \max_{q \ge 0} \Big\{ s\cdot p - k\cdot r - c \mathbb{1}\{q>0\} + \beta \mathbb{E}\Big[ V\big( k, d' \big) \Big] \Big\} \end{eqnarray} $$$$ \begin{eqnarray} s &=& \min\{x,d\} \\ k &=& \max\{x-d,0\} + q \end{eqnarray} $$Note that similar to the bus engine replacement model, the inventory model features random variable which distribution does not depend on the previous period variables (it is idiosyncratic).
In this case it is possible to reduce the dimensionality of the fixed point problem by rewriting the Bellman operator in expected value function terms.
$$ EV(x') = \mathbb{E}\Big[ V\big(x', d' \big) \Big| x,d,q \Big] = \mathbb{E}\Big[ V\big(x', d' \big) \Big], $$where the expectation is taken over the distribution of the next period demand $ d' $. The conditioning on $ x,d,q $ can be dropped exactly because $ d' $ is idiosyncratic.
We can then write the Bellman equation as
$$ V(x,d) = \max_{q \ge 0} \Big\{ s\cdot p - k\cdot r - c \mathbb{1}\{q>0\} + \beta EV(k) \Big\} $$$$ V(x,d) = \max_{q \ge 0} \Big\{ p \min\{x,d\} - r ( \max\{x-d,0\} + q ) - c \mathbb{1}\{q>0\} + \beta EV(\max\{x-d,0\} + q) \Big\} $$Taking the expectation with respect to $ d $ on both sides, we get
$$ EV(x) = \mathbb{E}\Big[ \max_{q \ge 0} \Big\{ p \min\{x,d\} - r ( \max\{x-d,0\} + q ) - c \mathbb{1}\{q>0\} + \beta EV(\max\{x-d,0\} + q) \Big\} \Big] $$By assumption the inventory is discrete, and so it is natural to assume that the demand is also represented as a discrete random variable. Then the expectation can be written as a sum weighted with the corresponding probabilities $ pr(d) $, as
$$ EV(x) = \sum_{d} \Big[ \max_{q \ge 0} \Big\{ p \min\{x,d\} - r ( \max\{x-d,0\} + q ) - c \mathbb{1}\{q>0\} + \beta EV(\max\{x-d,0\} + q) \Big\} \Big] pr(d) $$This is functional equation in $ EV $ which is also a contraction mapping!
Assume that $ d(t) $ is stochastic and follows geometric distribution with the support $ k \in \{0,1,2,\dots\} $ and corresponding probabilities $ P(k)=(1-p)^k p $, where $ p $ is a fixed parameter.
# your code here