Stabilising an Inverted Pendulum on a Cart

Examples - Mechanics

Last edited: April 20th 2019


Introduction

The inverted pendulum is a classic controllability example. Most likely, you have performed this kind of control yourself by balancing a broom or stick on your palm. In that case the pendulum was a physical pendulum and your arm the external driving force. In this notebook we will try to balance an inverted mathematical pendulum on a cart subjected by an external force acting on the cart's center of mass.

Figure 1: MIT Professor Alan V. Oppenheim in his lecture The Inverted Pendulum from the course Signals and Systems. Source: Oppenheim's lecture notes [1].

The figure above shows a common experimental setup for controlling an inverted pendulum. The pendulum is moved by pulling a suspended wire between the cart and an electrical motor. In this particular setup, the controller will balance the pendulum independently of the mass of the pendulum "bead", explaining the confidence when pouring more liquid into the glass.

Initially, we will model our pendulum system using Lagrangian mechanics and later control the pendulum using proportional control and then construct an algorithm to optimise the regulation based on quadratic weighing factors (linear quadratic regulator [LQR])

In [6]:
import numpy as np
import progressbar
import matplotlib.pyplot as plt
from matplotlib import animation
from IPython.display import display, Image
from scipy import signal
import control
%matplotlib inline

Theory Part I

Figure 2 below shows a schematic of the system used in this example. The inverted pendulum is fixed in one end on the top of a cart with a mass $M$. The rotation around the fixed point is assumed frictionless. The bead in the end of the pendulum has a mass $m$, while the stiff rod connecting the bead to the cart is massless. This makes the inverted pendulum a mathematical pendulum. The cart is able to move relative to the surface, i.e. only in the $x$-direction. Wheels are only included for illustration. The forces acting on the bead is the conservative gravitational force $\vec{G}$ in the negative $y$-direction, while the force acting on the cart is the external applied force $\vec{u}$ in the $x$-direction. The force $\vec{F}$ is an approximation of the friction on the entire system, also acting in the $x$-direction. The length of the massless rod is assigned $l$, and the angular displacement of the pendulum relative to $\vec{y}$ is given by $\theta$.

schematic</a>

Figure 2: A schematic on the inverted pendulum. The bead angle $\theta$ is zero at the unstable (upright) equilibrium. The gravitational force, $\vec{G}$, acting on the bead with mass $m$, the friction on the system $\vec{F}$ and external force $\vec{u}$ is illustrated.

In [7]:
figSizeX = 12
figSizeY = 9

# System parameters
m = 0.1          # mass of bead  [kg]
M = 1            # mass of cart  [kg]
l = 0.2          # length of rod [m]
g = 9.81         # gravitational acceleration [m s^-2]
mu = 10          # friction coefficient [kg s^-1]

# Initial conditions of generalised coordinates
x0_0 = 0         # cart position [m]
x1_0 = 0         # rod angle [radians]
x2_0 = 0         # cart velocity [m s^-1]
x3_0 = 0         # rod angular velocity [radians s^-1]
u_0 = 0          # initial cart force  [kg m s^-2]

The system parameters above can be altered to get different motion and response of the system. As example, a more massive bead would require more external force to be balanced, and a longer rod would reduce the pendulum swing period.

Equations of motion

It is possible to derive the equations of motion using Newtonian forces, but the Lagrangian formulation is more elegant. A full derivation of the equations from the Lagrangian $$ L = T - U $$ is found in appendix A.

The system sketched in figure 2 has two holonomic constraints: The rod length $l$ is fixed, and the $y$-coordinate of the cart is always zero. This results in two generalised coordinates: The bead angle $\theta$, the cart position $x$ and their time derivatives $\dot{\theta}$ and $\dot{x}$. The corresponding equations of motion evaluates to

\begin{equation} \ddot{x} = \lambda(\theta)\big[mgl^2\dot{\theta}^2\sin\theta - mgl \sin\theta \cos\theta - l\mu \dot{x} + ul \big] \label{cartacc} \end{equation}\begin{equation} \ddot{\theta} = \lambda(\theta)\big[(m + M)g \sin\theta - mgl\dot{\theta}^2 \sin\theta \cos\theta + \mu\dot{x}\cos\theta - u\cos\theta \big] \label{angelacc} \end{equation}

where

\begin{equation} \lambda(\theta) = \frac{1}{l(M + m\sin^2\theta)} \label{lambda} \text{.} \end{equation}

The friction coefficient $\mu$ originates from the friction force $\vec{F}$, which is set to be proportional to the cart velocity, $F = -\mu\dot{x}$.

By renaming the generalized coordinates we get the state vector:

\begin{equation} \textbf{x} = \begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} x \\ \theta \\ \dot{x} \\ \dot{\theta} \end{bmatrix} \text{.} \end{equation}

Then, by realising that $\dot{x_0} = x_2$ and $\dot{x_1} = x_3$, we get the state equation:

\begin{equation} \dot{\textbf{x}} = \frac{d}{dt} \begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} x_2 \\ x_3 \\ \lambda(x_1)\big[ml^2{x_3}^2\sin x_1 - mgl \sin x_1 \cos x_1 - l\mu x_2 + ul \big] \\ \lambda(x_1)\big[(m + M)g \sin x_1 - ml{x_3}^2 \sin x_1 \cos x_1 + \mu x_2\cos x_1 - u\cos x_1 \big] \end{bmatrix} \text{.} \label{stateEquation} \end{equation}

The last two equations in the state vector are non-linear. Which means that the motion of the system is very hard to regulate. Consequently, we will have to linearise them. This is done in part II of the notebook.

Implementation - Part I

Simulating the system

To simulate the inverted pendulum on a cart we have to solve the state equation iteratively. In this notebook we will use the numerical method Runge-Kutta fourth order (RK4) to integrate the system of non-linear differential equations, \eqref{systemEquation}. If you have never heard of this method, or want a recap, check out this notebook for a practical introduction.

In addition to defining the RK4 step function, we define a function that returns the right hand side of \eqref{stateEquation} given a vector $\textbf{x}$ and the system parameters, and a function to calculate the systems potential and kinetic energy.

In [20]:
def RHS(z, RHSargs): 
        """Given a specified state vector and system parameters,
        z and RHSargs respectively, this function returns the 
        state vector element's time derivative, i.e. the right 
        hand side of the state equations.
        Parameters:
            z        4x1 array
            RHSargs  6x1 array
        Returns:
            dxdt     4x1 array
        """
        m, M, l, g, mu, u = RHSargs
        x0, x1, x2, x3 = z
        dx0dt = x2
        dx1dt = x3
        lambdax1 = 1/(l*(M + m*np.sin(x1)**2))
        
        dx2dt = lambdax1*(m*l*l*x3**2*np.sin(x1) - m*g*l*np.sin(x1)*np.cos(x1)
                            - l*mu*x2  + u*l)
        dx3dt = lambdax1*((m + M)*g*np.sin(x1) - m*l*x3*x3*np.sin(x1)*np.cos(x1)
                            + mu*x2*np.cos(x1) - u*np.cos(x1))
        
        dxdt = np.array([dx0dt, dx1dt, dx2dt, dx3dt])
                               
        return dxdt
                               

def RK4Step(z, dt, RHS, RHSargs):
        """ Performs one step of the 4th order Runge-Kutta method
        z is the current state vector, dt is the numerical time
        step, RHS is a function for the right hand side of the
        system of differential equations, and RHSargs is the input
        parameters for that function. RK4step returns the next
        state vector, i.e. the values of z after a time dt.
        Parameters:
            z         4x1 array
            dt        float
            RHS       function
            RHSargs   6x1 array
        Returns:
            The state vactor, z, after a time, dt.
        """
        k1 = np.array(RHS(z, RHSargs))
        k2 = np.array(RHS(z + k1*dt/2, RHSargs))
        k3 = np.array(RHS(z + k2*dt/2, RHSargs))
        k4 = np.array(RHS(z + k3*dt, RHSargs))
        
        return z + dt/6*(k1 + 2*k2 + 2*k3 + k4)
In [21]:
def addEnergy(z):
    """ Given the the state vector, z, calculates and returns the kinetic
    and potential energy of the system in an arrray. See appendix B for the
    energy term formulas.
    """
    x0, x1, x2, x3 = z
    U = m*g*l*np.cos(x1)
    T = 0.5*x2**2*(m + M) + 0.5*m*x3**2*l**2 + m*x2*x3*l*np.cos(x1)
    return np.array([T, U])

Now, we can integrate the state equation after defining a step size, $\Delta t$, and the integration duration $t_{max}$. The matrix Z is initialised to contain the state vector at any time $t$. While the initial angle of the pendulum is zero, it will never fall down because the x-component to the gravitational force is excactly zero. So, to make the situation more realistic, a random noise is added as external force, $u$, in every iteration. The noise could correspond to e.g. a wind gust (isolated on the cart).

In [26]:
duration = 10                # Integration duration [sec]
timeStep = 1e-2              # Time step 
n = int(duration/timeStep)   # Number of datapoints

# Initialising matrices
RHSargs = np.array([m, M, l, g, mu, u_0]) # System parameters
Z = np.zeros((4, n))                      # Z = [[x0_0, ... x0_n], [x1_0, ... x1_n], [x2,...,], [x3,...,]
# Inserting initial state vector into Z
Z[:, 0] = np.array([x0_0, x1_0, x2_0, x3_0])
# Initialising matrix containing the noise and potential force on the cart
extForce = np.zeros((2, n))
extForce[0, 0] = u_0

# Initialising matrix containing the kinetic and potential energy of the system
energy = np.zeros((2, n))    # energy = [[T_0 ... T_n], [U_0, ... U_n]]
energy[:, 0] = addEnergy([x0_0, x1_0, x2_0, x3_0])

bar = progressbar.ProgressBar()
for i in bar(range(n - 1)):
    # Compute and insert the next state vector using RK4
    Z[:, i + 1] = RK4Step(Z[:, i], timeStep, RHS, RHSargs)
    # Generating a random noise value
    noise = np.random.uniform(-0.001, 0.001)
    extForce[1, i +1] = noise
    # Updating the system parameter u
    RHSargs[5] = noise
    # Add the energy terms
    energy[:, i + 1] = addEnergy(Z[:, i + 1])
100% (999 of 999) |######################| Elapsed Time: 0:00:00 Time:  0:00:00

Next, we make a phase-space plot of the generalised coordinates in addition to the evolution to the state vector. The definition of the plotting function is found in Appendix B.

In [27]:
plotData(Z, extForce, energy, appliedForce=False)

The phase-space plot of $\theta$ reveals a movement similar to a simple pendulum with friction. It spirals in towards zero angular velocity and an angle $\theta = \pi$. This point would correspond to the pendulum hanging downward at rest, which is the intuitive result if no force tries to maintain the inverted form. The two lowermost plots shows how the total energy of the system reduces with time. This is due to the friction force ($\vec{F}$ in figure 2) working in the opposite direction of the cart's velocity. The bigger the friction coefficient $\mu$, the higher the energy loss. As seen from the time evolution plots, the pendulum stays inverted some time before noise pushes it over. The phase-space path for $q_2(x, \dot{x})$ however has a not so harmonic motion. It seems to experience a high acceleration for a short time two times each periodic motion. Let's make an animation to see better whats going on. The definition of the animation function is found in Appendix C. The animation below was generated using the code:

generateAnimationWithPhaseSpace(Z, 'falling_phsp.gif')

falling

As seen from the animation, the cart slides back and forth due to the pendulum.

Now, in our state equation \eqref{stateEquation} (equations of motion) we have the variable $u$ which let us exert a external force on the cart. In the next part of the notebook we will utilise this variable to try to stabilise the pendulum. But first, the state equation has to be linearised.

Theory Part II

Linearising around equilibrium

The non-linear equations of motion can be linearised to the form

\begin{equation} \dot{\textbf{x}} = \mathcal{A}\textbf{x} + \mathcal{B}u \end{equation}

around an equilibrium. The inverted pendulum in our example has two equilibrium points. The stable equilibrium, corresponding to a downward hanging pendulum

\begin{equation} \begin{cases} x_0 &= \mathrm{free} \\ x_1 &= \pi \\ x_2 &= 0 \\ x_3 &= 0 \end{cases} . \end{equation}

And the unstable equilibrium, corresponding to the pendulum being carefully balanced upright. It's unstable in the sense that any perturbation from its point, for example noise in the form of a small wind gust, will cause it to fall down.

\begin{equation} \begin{cases} x_0 &= \mathrm{free} \\ x_1 &= 0 \\ x_2 &= 0 \\ x_3 &= 0 \end{cases} . \end{equation}

The latter is the equilibrium we are interested in.

The matrices $\mathcal{A}$ and $\mathcal{B}$ is given by the Jacobian Matrix,

\begin{equation} \mathcal{J} = \big [ \frac{\partial \textbf{f}}{\partial x_1} \cdots \frac{\partial \textbf{f}}{\partial x_n} \big ] = \begin{bmatrix} \frac{\partial f_0}{\partial x_0} & \cdots & \frac{\partial f_0}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial f_n}{\partial x_0} & \cdots & \frac{\partial f_n}{\partial x_n} \end{bmatrix} \end{equation}

differentiated with respect to the variables $x$ and $u$, respectively [2]. The derivatives are then evaluated at an equilibrium point. Here $f_i$ is the components of the state equation $\eqref{stateEquation}$

\begin{equation} \dot{\textbf{x}} = \frac{d}{dt} \begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} f_0 \\ f_1 \\ f_2 \\ f_3 \end{bmatrix} = \begin{bmatrix} x_2 \\ x_3 \\ \lambda(x_1)\big[ml^2{x_3}^2\sin x_1 - mgl \sin x_1 \cos x_1 - l\mu x_2 + ul \big] \\ \lambda(x_1)\big[(m + M)g \sin x_1 - ml{x_3}^2 \sin x_1 \cos x_1 + \mu x_2\cos x_1 - u\cos x_1 \big] \end{bmatrix} \text{.} \end{equation}

When calculating the Jacobian matrix at the unstable equilibrium, every coordinate except $x_0$ goes to zero after differentiating. This gives

\begin{equation} \mathcal{A} = \begin{bmatrix} \frac{\partial f_0}{\partial x_0} \big |_{0} & \frac{\partial f_0}{\partial x_1} \big |_{0} & \frac{\partial f_0}{\partial x_2} \big |_{0} & \frac{\partial f_0}{\partial x_3} \big |_{0} \\ \frac{\partial f_1}{\partial x_0} \big |_{0} & \frac{\partial f_1}{\partial x_1} \big |_{0} & \frac{\partial f_1}{\partial x_2} \big |_{0} & \frac{\partial f_1}{\partial x_3} \big |_{0} \\ \frac{\partial f_2}{\partial x_0} \big |_{0} & \frac{\partial f_2}{\partial x_1} \big |_{0} & \frac{\partial f_2}{\partial x_2} \big |_{0} & \frac{\partial f_3}{\partial x_3} \big |_{0} \\ \frac{\partial f_3}{\partial x_0} \big |_{0} & \frac{\partial f_3}{\partial x_1} \big |_{0} & \frac{\partial f_3}{\partial x_2} \big |_{0} & \frac{\partial f_3}{\partial x_3} \big |_{0} \end{bmatrix} = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & \frac{-mg}{M} & \frac{-\mu}{M} & 0 \\ 0 & \frac{g(m + M)}{lM} & \frac{\mu}{ML} & 0 \end{bmatrix} \end{equation}

and

\begin{equation} \mathcal{B} = \begin{bmatrix} \frac{\partial f_0}{\partial u} \big |_{0} \\ \frac{\partial f_1}{\partial u} \big |_{0} \\ \frac{\partial f_2}{\partial u} \big |_{0} \\ \frac{\partial f_3}{\partial u} \big |_{0} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ \frac{1}{M} \\ \frac{-1}{Ml} \end{bmatrix} \text{.} \end{equation}

One could perform a test to show that the system is controllable with the matrices $\mathcal{A}$ and $\mathcal{B}$, but in our case we will assume it is.

Now, by defining the controller

$$ u = -\mathcal{K}\textbf{x}, $$

we get the linearised form

\begin{equation} \dot{\textbf{x}} = (\mathcal{A} - \mathcal{BK})\textbf{x}, \label{controller} \end{equation}

where $\mathcal{K}$ is the gain matrix. It is a proportional gain as the control variable (the external force $u$) is proportional to the state vector $\text{bf}$. This allows us to choose a $\mathcal{K}$ such that the eigenvalues of the matrix in the parenthesis are all stable, in essence negative (which indicate negative feedback) [1, 3].

Implementation Part II

Proportional gian controller

In this part we do a similar implementation as above, in addition to calculate the gain matrix. Then, by adding the control force in each integration step, a feedback loop is created. To retrieve a $\mathcal{K}$ matrix such that the eigenvalues of $(\mathcal{A} - \mathcal{BK})$ becomes the wanted eigenvalues poles, scipy's signal.place_poles package is utilised. This package simply returns the gain matrix given the poles, $\mathcal{A}$ and $\mathcal{B}$.

In [33]:
# Redefining the initial rod angle in such a way 
# that it starts out by falling
x1_0 = 0.2

# Defining the setpoints. I.e. the values of the 
# state vector needed for stabilasion
# x0 can be choosed freely, we choose -0.2
setpoints = [-0.2,   
                0,
                0,
                0]

# Defining the control matrices A and B
A = np.array([[0, 0, 1, 0],
            [0, 0, 0, 1],
            [0, -m*g/M, -mu/M, 0],
            [0, (m+M)*g/(M*l), mu/(M*l), 0]])
                    
B = np.array([[0], 
            [0], 
            [1/M],
            [-1/(M*l)]])

# Specifying wanted eigenvalues of the matrix (A - B*K)
# Subscript PG stands for Proportional Gain
poles_PG = [-1.3, -1.4, -1.5, -1.6]
# Get the full state feedback bunch object
fsf = signal.place_poles(A, B, poles_PG)
# Extract the gain matrix
K_PG = fsf.gain_matrix

def controller(z, K):
    """ Given the current values of the state vactor, z
    and the gain matrix K, it returns the current control
    input, u.
    """
    offset = z - setpoints
    u = -np.dot(K, offset)
    return u[0]


def integrateControlled(K, pushForce, pushTime):
    """ This is the integration process implemented earlier,
    but with controll input added to the force. The possibility
    for a 'push' on the pendulum is also added.
    """
    # Re-initialising matrices
    RHSargs = np.array([m, M, l, g, mu, u_0])
    # Z = [(x0_0, x0_1, ..., x0_n), (x1_0, x1_1, ..., x1_n), (x2, ...,), (x3, ...,)]
    Z = np.zeros((4, n))
    Z[:, 0] = np.array([x0_0, x1_0, x2_0, x3_0])
    # Array containing the external forces at each time step
    extForce = np.zeros((2, n))
    extForce[0, 0] = u_0

    # Initialising matrix containing the kinetic and potential energy of the system
    energy = np.zeros((2, n))    # energy = [[T_0 ... T_n], [U_0, ... U_n]]
    energy[:, 0] = addEnergy([x0_0, x1_0, x2_0, x3_0])

    bar = progressbar.ProgressBar()
    for i in bar(range(n-1)):
        Z[:, i + 1] = RK4Step(Z[:, i], timeStep, RHS, RHSargs)
        
        # "Push"
        #  Non-continuously increasing/reducing the angular velocity
        if i == pushTime and i>0:
            Z[3, i + 1] += pushForce
            print("Pushed!")
        
        noise = np.random.uniform(-0.01, 0.01)
        extForce[1, i +1] = noise
        # Calculate the control variable
        force = controller(Z[:, i + 1], K)
        # Store the force, u
        extForce[0, i + 1] = force
        # Adding the total force and updating the parameters
        f = noise + force
        RHSargs[5] = f
        # Adding energy terms
        energy[:, i + 1] = addEnergy(Z[:, i + 1])
        
    return Z, extForce, energy

Z_PG, extForce_PG, energy_PG = integrateControlled(K_PG, 0, 0)
100% (999 of 999) |######################| Elapsed Time: 0:00:00 Time:  0:00:00
In [34]:
plotData(Z_PG, extForce_PG, energy_PG, appliedForce=True)