In this essay, we explore using a Genetic Algorithm to evolve a robot control network. A chromosome will be composed of a series of genes, each representing a weight in a neural network. A population of chromosomes is generated randomly, and individual genes can be mutated by a small amount.

Note: if running this notebook on Google Colab, you will need to install the required Python packages each time you start a session. You do not need to do this in Binder. Otherwise, on your own computer, you only need to install the packages once.
To install the required Python packages:
%pip install aitk --upgrade --quiet
Note: you may need to restart the kernel to use updated packages.
For this demonstration, we will need:

- GeneticAlgorithm, defined in aitk.algorithms.ga
- World, Scribbler, and RangeSensor from aitk.robots
- SimpleNetwork from aitk.networks

We import everything we need in this cell:
from aitk.algorithms.ga import GeneticAlgorithm
from aitk.robots import World, Scribbler, RangeSensor
from aitk.networks import SimpleNetwork
from aitk.robots.utils import distance
import random
The Genetic Algorithm (GA) is an approach in the category of machine learning systems called Evolutionary Algorithms (EA). For this essay, we will use the GA as follows:

- We define a gene to be a floating-point number that represents a weight or bias of a Neural Network.
- We define a chromosome as a sequence of genes. The chromosome (sometimes called a genotype) will represent all of the weights and biases of a network.
- The population will be composed of a set of random chromosomes.

The length of the chromosome is determined by the size of the Neural Network. Therefore, we will call self.build_model() (defined below) and see how many weights it has.
We'll use a ring of 16 laser range sensors as the robot's senses. We create a small simulated world (100 x 100) in which the robot can move around.
The basic idea is that we will have a list of weights (the chromosome) that we load into the neural network. The sensors will be read, and propagated through the neural network. The output of the neural network will be interpreted as movement controls. We'll let the neural network "drive" the robot around the world starting from a few poses.
class NNGA(GeneticAlgorithm):
    """
    An example of using the GeneticAlgorithm class to
    evolve the weights of a neural network to control
    a simulated robot.
    """
    def __init__(self, popsize, sensors=16):
        self.show = False
        self.sensors = sensors
        self.world = World(100, 100)
        self.robot = Scribbler()
        # We add a ring of RangeSensors, 8 cm from
        # the robot center, starting and ending
        # at the back of the robot,
        # with 75 cm max range and sensor width 0,
        # which makes each one a laser range finder
        self.robot.add_device_ring(
            RangeSensor, 8, -180, 180,
            self.sensors, max=75, width=0)
        self.world.add_robot(self.robot)
        self.network = self.build_model()
        # Starting poses to test from:
        self.poses = [
            (20, 20, 0),
            (20, 20, 45),
            (20, 20, 90),
            (20, 20, 180),
            (20, 20, -45),
        ]
        # Length of a chromosome:
        length = len(self.network.get_weights(flat=True))
        super().__init__(length, popsize)
As mentioned, a chromosome will be a sequence of floating-point genes, each between -2 and +2. When we mutate such a gene, we will increase or decrease it by a small amount. To define these two aspects of a gene, we define the following methods:
class NNGA(NNGA):
    def make_random_gene(self):
        """
        Generate a random weight for the neural network.
        """
        # range of (-2, +2]
        return 2 - random.random() * 4

    def mutate_gene(self, gene):
        """
        Given a gene, mutate it a little.
        """
        # range of (-0.5, +0.5]
        return gene + (0.5 - random.random())
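We can sanity-check the ranges these two methods produce with a standalone sketch (plain Python, no simulator needed; the function bodies simply mirror the methods above):

```python
import random

def make_random_gene():
    # random.random() is in [0, 1), so the result is in (-2, +2]
    return 2 - random.random() * 4

def mutate_gene(gene):
    # shifts the gene by an amount in (-0.5, +0.5]
    return gene + (0.5 - random.random())

genes = [make_random_gene() for _ in range(10_000)]
assert all(-2 < g <= 2 for g in genes)

mutants = [mutate_gene(g) for g in genes]
# small tolerance for floating-point rounding
assert all(abs(m - g) <= 0.5 + 1e-9 for m, g in zip(mutants, genes))
```

Note that mutation can push a gene outside the initial (-2, +2] range; repeated mutations let the GA explore weights beyond the starting interval.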
To use a GA we need to define a fitness function. A fitness function tests a chromosome in the simulated world, and returns a value that represents how "apt" (or fit) it is relative to the other chromosomes in the population at this generation.
Finding a good fitness function is difficult. We don't want to be too prescriptive, or the system will evolve exactly what we have defined it to do. On the other hand, we don't want to be too open, as that can leave the system to evolve solutions that don't really "solve a problem."
For example, let's say that we want the robot to move around this world and not bump into walls. We could define a fitness that is the total number of steps where the robot did not hit a wall (e.g., robot.stalled is not True).
However, you would very quickly evolve a robot that doesn't move. No movement, no crashing into walls! Well, ok, then. How about we define a fitness function that rewards the robot when it keeps moving. The evolutionary system could "solve that problem" by simply spinning the robot in circles. Foiled again!
You could then define a fitness function that rewards the robot when it has forward movement, thereby circumventing the spinning-in-circles solution. However, the system could then just drive the robot in small circles, and we have inadvertently failed to reward a robot for moving backwards (which should be fine).
This kind of search, where the human tries to find a fitness function that drives the robot toward interesting control patterns without settling on simplistic solutions, is quite common. However, if you can train yourself to think more like nature, you can end up with some very clever robots.
To that end, we will define the fitness function to be the distance between where the robot started and where it ended up after a given number of seconds, summed over all of the starting poses. However, if the robot ends up stalled against a wall, that pose contributes a score of zero.
This is a fairly easy fitness function to compute and should bypass some of the degenerate solutions described above.
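Before the real implementation, here is a rough, simulator-free sketch of that scoring rule. The runs below are invented for illustration (in the actual code, the distance function comes from aitk.robots.utils):

```python
from math import hypot

def sketch_fitness(runs):
    # runs: list of (start_xy, end_xy, stalled) tuples;
    # sum the start-to-end Euclidean distance, but a run
    # that ends stalled against a wall contributes zero
    score = 0.0
    for (x0, y0), (x1, y1), stalled in runs:
        if not stalled:
            score += hypot(x1 - x0, y1 - y0)
    return score

runs = [
    ((20, 20), (50, 60), False),  # ended 50 units away
    ((20, 20), (80, 20), False),  # ended 60 units away
    ((20, 20), (90, 90), True),   # stalled: contributes 0
]
print(sketch_fitness(runs))  # 110.0
```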
class NNGA(NNGA):
    def fitness(self, chromosome, index=None, poses=None, seconds=None,
                interrupt=True, real_time=False, show=False,
                show_progress=False, quiet=True):
        """
        Fitness is based on the distance the robot travels
        from starting pose to ending pose.

        Args:
            poses: default None, meaning all of the starting
                poses; otherwise a list of poses (x, y, a).
            seconds: default None, meaning min(generation / 10, 10)
            interrupt: default True
            real_time: default False
            show: default False
            show_progress: default False
            quiet: default True
        """
        if seconds is None:
            seconds = min(self.generation / 10, 10)
        self.show = show
        self.network.set_weights(chromosome)
        score = 0
        poses = poses if poses is not None else self.poses
        for pose in poses:
            self.robot.set_pose(*pose, show=self.show)
            self.world.seconds(
                seconds, self.controller,
                interrupt=interrupt,
                real_time=real_time, show=show,
                show_progress=show_progress, quiet=quiet)
            if not self.robot.stalled:
                end_pose = self.robot.get_pose()
                score += distance(pose[0], pose[1],
                                  end_pose[0], end_pose[1])
        return score
We can now add some methods for the specific problem:
class NNGA(NNGA):
    def build_model(self):
        """
        We build a simple network.
        """
        return SimpleNetwork(
            self.sensors,
            4,
            2,
            activation="sigmoid"
        )

    def controller(self, world):
        """
        The controller for the robot.
        """
        sensors = self.get_sensors()
        output = self.get_move(sensors)
        self.robot.move(output[0], output[1])

    def get_sensors(self):
        """
        Return the current readings from all of
        the robot's range sensors.
        """
        return [sensor.get_reading()
                for sensor in self.robot]

    def get_move(self, sensors):
        """
        Given the sensor readings (a single input
        pattern), return the two motor commands.
        """
        # Propagate takes a single pattern:
        output = self.network.propagate(
            sensors, show=self.show)
        # Scale the output to [-1, 1]
        return 1 - output * 2
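In the notebook, output is a NumPy array, so 1 - output * 2 applies elementwise. The same affine map in plain Python shows how sigmoid activations in [0, 1] become motor commands in [-1, +1], with 0.5 mapping to 0 (no movement):

```python
# Sigmoid activations span [0, 1]; the map 1 - o * 2
# sends them to [-1, +1] (note the sign flip: larger
# activations give more negative commands, which is
# harmless here since the weights are evolved)
outputs = [0.0, 0.25, 0.5, 0.75, 1.0]
scaled = [1 - o * 2 for o in outputs]
print(scaled)  # [1.0, 0.5, 0.0, -0.5, -1.0]
```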
Now, we are ready to create a GA. We'll define the population size to be 30, which isn't too small, but big enough to evolve a solution in a few minutes.
ga = NNGA(popsize=30)
Random seed set to: 5594309 Genetic algorithm Chromosome length: 78 Population size: 30
We can get a snapshot of the world with:
ga.world.display()
But we can do better than just a snapshot. We can actually dynamically "watch" the world update.
We'll create two watch windows: one for the world, and one for the network. These will update dynamically when show=True.
ga.world.watch()
Likewise, we can get a snapshot of the network architecture:
ga.network.display()
But, again we can do even better by creating a dynamically updating view:
ga.network.watch()
Let's test everything to make sure it is working.
Let's see what a chromosome looks like:
len(ga.make_random_chromosome())
78
Therefore, a chromosome will have a length of 78 genes. Each gene is an individual weight or bias in the neural network.
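The count of 78 follows directly from the network shape (16 inputs, 4 hidden units, 2 outputs), assuming SimpleNetwork uses fully connected layers with one bias per hidden and output unit:

```python
inputs, hidden, outputs = 16, 4, 2

# weights into the hidden layer, plus hidden biases,
# plus weights into the output layer, plus output biases
n_params = (inputs * hidden + hidden) + (hidden * outputs + outputs)
print(n_params)  # 78
```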
Let's see what the sensor readings look like:
sensors = ga.get_sensors()
sensors
[0.9499536896755313, 1.0, 0.4552718584361414, 0.311920559388813, 0.2801289369189201, 0.3216590893285552, 0.2616706002237469, 0.18539009021787406, 0.17573611964604702, 0.22165291555883745, 0.3874202432787764, 0.8561860810940988, 0.8668159170532664, 1.0, 1.0, 0.9499536896755313]
And let's see what movements the sensors give when propagated through the network and scaled:
ga.get_move(sensors)
array([0.18124181, 0.00042856], dtype=float32)
You should see the activations in the network display change. The robot won't actually move in the simulator until we update it.
Ok, let's put all of this together and move the robot around the simulated world. We can test out a random chromosome to see what this does.
We pass in: a random chromosome, a single starting pose, the number of seconds to run, and flags to show the world and run in real time.
Let's go!
ga.fitness(
    ga.make_random_chromosome(),
    poses=[(50, 50, 0)],
    seconds=5,
    show=True,
    real_time=True)
9.92026006976214
Interesting! You can try that cell over and over again to see many possible behaviors of these random weights.
Take a moment to consider some questions you might want to answer at this point.
To run 10 generations (plus one to test the initial population) for a population of size 30, in the 5 test poses for 3 seconds each, will take 1 hour, 22 minutes, and 30 seconds of simulated time.
Often, 10 simulated generations can be performed in a few minutes of actual clock time on a typical computer. One can get about a 50x speed-up over real time when not watching the network, and about a 10x speed-up when watching the network.
You can run without watching by setting show=False.
%%time
ga.reset()
ga.world.reset()
bestFound = ga.evolve(
    generations=10,
    crossover_rate=0.0,
    mutation_rate=0.6,
    elite_percent=0.05,
    seconds=3,
    show=False,
)
Using random seed: 5594309 Maximum number of generations: 10 Crossover rate: 0.0 Mutation rate: 0.6 Elite percentage 0.05 Elite count: 1 Generation 0 Best fitness 125.27 Generation 1 Best fitness 125.27 Generation 2 Best fitness 125.27 Generation 3 Best fitness 151.33 Generation 4 Best fitness 152.44 Generation 5 Best fitness 154.87 Generation 6 Best fitness 169.32 Generation 7 Best fitness 169.32 Generation 8 Best fitness 169.32 Generation 9 Best fitness 170.97 Generation 10 Best fitness 171.21 Max generations reached CPU times: user 1min 3s, sys: 293 ms, total: 1min 3s Wall time: 1min 4s
Here's a summary of how the best and average fitness for the population changed over time:
ga.plot_stats("NNGA")
Let's see the results by trying the best ever over the poses, for 5 seconds each:
ga.fitness(
    ga.bestEver,
    real_time=True,
    show=True,
    seconds=5)
175.6739394939202
Evolution may have created lots of behaviors that crash into walls.
What is the most interesting behavior that you observe after 10 generations?
Let's run a few more generations.
If you wish, you can increase the maximum generations, change any of the parameters, and continue to evolve:
bestFound = ga.evolve(
    generations=20,
    crossover_rate=0.6,
    mutation_rate=0.001,
    elite_percent=0.1,
    seconds=3,
    show=False,
)
Maximum number of generations: 20 Crossover rate: 0.6 Mutation rate: 0.001 Elite percentage 0.1 Elite count: 3 Generation 11 Best fitness 171.21
ga.plot_stats("NNGA")
ga.fitness(
    ga.bestEver,
    real_time=True,
    show=True,
    seconds=5,
)
Was the best from the second 10 generations significantly different from the best from the first 10 generations?
Did the fitness change dramatically during the second 10 generations?
Do you think there will be a point where the fitness won't get any higher?
There is one additional aspect of the fitness measure: if you don't explicitly specify the number of seconds to use, it is computed from the generation number. This lets the GA focus on initial movements first, and on longer control sequences as the generations increase. You can test this effect by running without explicitly setting seconds.
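The default schedule, copied from the fitness method defined earlier, can be seen in isolation:

```python
def seconds_for(generation):
    # Mirrors the default in fitness(): the trial length
    # ramps up with the generation number, capped at
    # 10 simulated seconds
    return min(generation / 10, 10)

for g in (0, 10, 50, 100, 200):
    print(g, "->", seconds_for(g))
```

So early generations are judged on very short trials (generation 0 runs for 0 seconds, generation 10 for 1 second), and from generation 100 onward every trial lasts the full 10 seconds.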
Some ideas to try: