1 Tutorial: Continuous GA
In this tutorial, we will implement a continuous GA based on the binary GA that we implemented in the previous works. A lot of the binary GA code can be reused as is or just slightly modified, and also some code related to the binary encoding can simply be removed. Therefore, this tutorial will necessarily contain much of the same information as the tutorial on the binary GA.
To keep things simple, we will convert the binary GA that was able to learn strings into a continuous GA for doing the same thing. After having completed the tutorial, you should try to modify it so that it can be used to solve and optimise test functions such as those given in the appendix of ?, which is available on Fronter.
1.1 A continuous GA for string learning
The purpose of this GA is to discover, or learn, a target string. To learn the target string, the GA will contain a population of candidate solutions (chromosomes) that represent strings. Each chromosome is a guess and can be evaluated and assigned a cost, where the cost should relate to the difference between the string it represents and the target string.
The GA can then evolve the population over several generations until one or more of the individuals in the population become the target string. Such individuals will have a cost of zero.
Note that this example is adapted from (and is slightly easier than) the song learning example in Chapter 4 of Haupt & Haupt, where a GA is used to learn the song “Mary had a little lamb.”
We begin by creating a module ContinuousGA.hs
for our code:
1module ContinuousGA where
1.2 GA settings
Before we proceed, it is useful to add some GA parameter settings to the top of our file so that they can easily be found and tweaked when we will test the GA later. We adopt the terminology used in class and in Haupt & Haupt and define the following parameters with some default settings:
1numPop = 40 :: Int -- Population size (number of chromosomes)
2xRate = 0.5 :: Double -- Selection rate
3mutRate = 0.1 :: Double -- Mutation rate
4numElite = 1 :: Int -- Number of elite chromosomes
5itMax = 1000 :: Int -- Max number of iterations
1.3 Alphabet
The alphabet
is the set of possible symbols, or characters, that can constitute a string. We store the
alphabet
as a string with 48 characters
1alphabet’ = ~abcdefghijklmnopqrstuvwxyz0123456789.,;:?!_+-*/ ~ :: String
Please note that the final character ~~ in alphabet’
is the space character and must not be
removed. Because we are using a continuous (real-valued) GA, we do not have to worry
about binary encoding and padding the alphabet to make it have a length that is a power of
2.
1.4 Target strings
For testing and debugging purposes, we define a set of test strings:
1s1, s2, s3, s4, s5, s6 :: String
2s1 = ~h~
3s2 = ~abc~
4s3 = ~hello world!~
5s4 = ~abc123.,;~
6s5 = ~descartes: cogito ergo sum~
7s6 = ~2+2 is: 4, 2*2 is: 4; why is _/not/_ 2.2*2.2 equal to 4.4?!~
8target = s2
1.5 Genes, chromosomes, and the population
In the following, we will assume that a gene is a continuous real-valued variable normalised to the range and encodes a single character, and that a chromosome is a list of genes that encodes a string.
The number of characters in the target string is sometimes called the number of (encoding) variables,
numVar
:
1numVar = length target :: Int -- Number of characters (genes) in target
This number tells us how many genes are required in each chromosome.
Next, we need to know the number of chromosomes numKeep
to keep between generations. This
number is a fraction (the selection rate xRate
) of the population size numPop
and should be rounded
up to the nearest even number to simplify the mating procedure:
1numKeep | numKeep’’ > numPop = numPop - numPop ‘mod‘ 2
2 | numKeep’’ < 2 = 2
3 | otherwise = numKeep’’
4 where numKeep’’ = numKeep’ + numKeep’ ‘mod‘ 2
5 numKeep’ = ceiling $ xRate * fromIntegral numPop :: Int
For example, for numPop = 40
and xRate = 0.5
, we get numKeep = 20
. Some cases are introduced
for safety to ensure numKeep
is always greater than or equal to 2, and always smaller than or equal to
numPop
.
We also need to determine the number of chromosomes to mutate, numMut
, which is a fraction (the
mutation rate mutRate
) of the population size, and must be rounded up to nearest integer:
1numMut = ceiling $ mutRate * fromIntegral numPop :: Int
For numPop = 40
and mutRate = 0.1
, we get numMut = 4
.
Finally, we define some type aliases for readability and debugging purposes:
1type Gene = Double
2type Chromosome = [Gene]
3type Population = [Chromosome]
First of all, note that we possibly could have used an integer type such as Int
for Gene
, however, we
want our GA to be general purpose and able to handle other problems where the variables will not be
characters that can be represented by integers but real-valued numbers, e.g., to optimise the test
functions in the appendix in ?.
Exercise: Experiment with different alphabets and different values for numPop
, xRate
, and mutRate
and verify that numGene
, numKeep
, and numMut
are calculated correctly by testing in the
ghci
.
1.6 Normalisation and denormalisation
The plan is to normalise the genes values to ,
however, the characters in the alphabet can be represented as integers that corresponds to their
position (index) in the alphabet. Hence, we need a normalise
function to convert variable
, which is an integer
in the range , to a
real-valued gene
in the range ,
and likewise, a denormalise
function to convert back from a gene to a variable value.
We begin by defining the constraints on variable values and gene values:
1-- Variable constraints
2pLo = 0.0 :: Double
3pHi = fromIntegral $ length alphabet - 1 :: Double
4
5-- Gene constraints
6gLo = 0.0 :: Double
7gHi = 1.0 :: Double
To normalise a variable in the range to a range , we can use the following formula:
(1) |
which simplifies to
(2) |
for and .
The normalise
function becomes
1-- Normalise a variable to range [gLo, gHi] with safety checks
2normalise :: Double -> Double
3normalise p = p’
4 where p’ | normalise’ p < gLo = gLo
5 | normalise’ p > gHi = gHi
6 | otherwise = normalise’ p
7
8-- Normalise a variable to range [gLo, gHi], unsafe
9normalise’ :: Double -> Double
10normalise’ p = pnorm * (gHi - gLo) + gLo
11 where pnorm = pLo + (p - pLo) / (pHi - pLo)
where we add some cases for safety to ensure that the normalised variable
stays within
the limits of .
The denormalise
function becomes
1denormalise :: Double -> Double
2denormalise pNorm = pNorm * (pHi - pLo) + pLo
1.7 Encoding functions
We need a function encodeString
to encode a string as a Chromosome
. This means that if we have
another function encodeChar
that can encode a single character as a Gene
, we can just map that
function over a string, which is a list of characters, in order to obtain the encoded Chromosome
. In
addition, it will likely prove useful to have a function encodeStringList
that can encode a list of
strings as a Population
.
1.7.1 The encodeChar
function
Let us begin with defining the encodechar
function:
1-- Encode a single character as a gene
2encodeChar :: Char -> Gene
3encodeChar c = normalise $ fromIntegral $ stripMaybe index
4 where index = elemIndex c alphabet
5 stripMaybe (Just index) = index
6 stripMaybe (Nothing) = error $ ~elemIndex returned Nothing. ~ ++
7 ~A character is not in the alphabet!~
Given a character c
, the function should return a normalised gene that corresponds to the character’s
index in alphabet
.
A convenient function to determine the index of an element in a list is elemIndex
, which is part of the
Data.List
library:
1import Data.List (elemIndex)
Its type signature is given by elemIndex :: Eq a => a -> [a] -> Maybe Int
Therefore, we will need to “strip” the Just
from the return value. which is done by the helper function
stripMaybe
.
We can test the encodeChar
function in the interpreter:
1*ContinuousGA> encodeChar ’a’
20.0
3*ContinuousGA> encodeChar ’b’
42.127659574468085e-2
5*ContinuousGA> encodeChar ’/’
60.9787234042553191
1.7.2 The encodeString
function
The encodeString
function is simple enough:
1encodeString :: String -> Chromosome
2encodeString = map encodeChar
We just map encodechar
over a list of characters (a string) to obtain a list of encoded genes, or a
Chromosome
.
We can test the function in ghci
:
1*ContinuousGA> encodeString ~abc~
2[0.0,2.127659574468085e-2,4.25531914893617e-2]
3*ContinuousGA> encodeString ~*/ ~
4[0.9574468085106383,0.9787234042553191,1.0]
1.7.3 The encodeStringList
function
The encodeStringList
function is just as simple:
1encodeStringList :: [String] -> Population
2encodeStringList = map encodeString
The output in ghci
for a list of two test strings yields
1*ContinuousGA> encodeStringList [~ab~,~*/~]
2[[0.0,2.127659574468085e-2],[0.9574468085106383,0.9787234042553191]]
Exercise: What happens if one of the encoding functions above encounter a character not in
alphabet
?
1.8 Decoding functions
In addition to encoding functions, we also need decoding functions able to convert back from
normalised genes to characters in alphabet
.
1.8.1 The decodeGene
function
we define a decodeGene
function that converts a Gene
to an index (a decimal number)
and then accesses the character at the index position in alphabet
using the !!
function:
1decodeGene :: Gene -> Char
2decodeGene g = alphabet!!idx
3 where idx = round $ denormalise g
Example usage in ghci
:
1*ContinuousGA> decodeGene 0.0
2’a’
3*ContinuousGA> decodeGene 0.01
4’a’
5*ContinuousGA> decodeGene 0.02
6’b’
7*ContinuousGA> decodeGene 0.98
8’/’
9*ContinuousGA> decodeGene 0.99
10’ ’
11*ContinuousGA> decodeGene 1.0
12’ ’
Exercise: What happens if a gene is outside the range ?
1.8.2 The decodeChromosome
function
To decode a Chromosome
, we just map the decodeGene
function over it to obtain the corresponding
string of characters:
1decodeChromosome :: Chromosome -> String
2decodeChromosome = map decodeGene
Here is an example in ghci
:
1*ContinuousGA> let band = encodeString ~ac/dc~
2*ContinuousGA> band
3[0.0,4.25531914893617e-2,0.9787234042553191,6.382978723404255e-2,4.25531914893617e-2]
4*ContinuousGA> decodeChromosome band
5~ac/dc~
1.8.3 The decodePopulation
function
It is also convenient to have a function able to decode an entire Population
of chrosomoses to a list of
decoded strings. For this, we just map the decodeChromosome
function over the population list of
chromosomes:
1decodePopulation :: Population -> [String]
2decodePopulation = map decodeChromosome
Example functionality in ghci
:
1*ContinuousGA> let pop = encodeStringList [~ac/dc~,~heavy~,~rock!~]
2*ContinuousGA> pop
3[[0.0,4.25531914893617e-2, ...],
4 [0.14893617021276595, ...],
5 [0.3617021276595745, ...]]
6*ContinuousGA> decodePopulation pop
7[~ac/dc~,~heavy~,~rock!~]
1.9 Randomness functions
Randomness is vital for a GA, e.g., to create a random initial population, a random crossover
point, a random mutation, and so on. We will use two functions from the System.Random
library to implement the functions we need for randomness, namely StdGen
and randomR
:
1import System.Random (StdGen, randomR)
It will be convenient to have functions for generating random genes, chromosomes, and populations;
and also for generating a random index (e.g., a crossover point) in a list.
Note that the reason we also return a StdGen
in many or most of the functions dealing with
randomness is so that we can repeatedly apply them, e.g., in a genetic evolution. In such cases, we
need to pass on a StdGen
for the next function calls.
1.9.1 The randGene
function
The randGene
function generates a random number of type Gene
in the range
:
1randGene :: StdGen -> (Gene, StdGen)
2randGene g = (value, g’)
3 where (value, g’) = randomR (gLo, gHi) g
1.9.2 The randGenes
function
The randGenes
function uses the randGene
function to generate a list of random genes with length n
:
1-- Create a list with n random genes
2randGenes :: Int -> StdGen -> ([Gene], StdGen)
3randGenes 0 g = ([], g)
4randGenes n g =
5 let (value, g’) = randGene g
6 (restOfList, g’’) = randGenes (n-1) g’
7 in (value:restOfList, g’’)
For convenience, we also implement a version of this function, randGene’
, that does not return
StdGen
:
1-- Create a list with n random genes, do not return StdGen
2randGenes’ :: Int -> StdGen -> [Gene]
3randGenes’ n g = take n $ randomRs (gLo, gHi) g
1.9.3 The randChrom
function
The randChrom
and randChrom’
functions are identical to randGenes
and randGenes’
, respectively,
except that they return a Chromosome
instead of a list of genes [Gene]
:
1-- Random chromosome with n genes
2randChrom :: Int -> StdGen -> (Chromosome, StdGen)
3randChrom n g = randGenes n g
4
5-- Random chromosome with n genes, do not return StdGen
6randChrom’ :: Int -> StdGen -> Chromosome
7randChrom’ n g = randGenes’ n g
1.9.4 The randChroms
function
The randChroms
function uses the randChrom
function to generate a list of cn
random chromosomes,
each with length gn
:
1-- Create a list with cn random chromosomes with gn genes
2randChroms :: Int -> Int -> StdGen -> (Population, StdGen)
3randChroms 0 _ g = ([], g)
4randChroms cn gn g =
5 let (value, g’) = randChrom gn g
6 (restOfList, g’’) = randChroms (cn-1) gn g’
7 in (value:restOfList, g’’)
Again, we can also define a randChrom’
function that does the same thing as randChrom
but does not
return at StdGen
:
1randChroms’ :: Int -> Int -> StdGen -> Population
2randChroms’ cn gn g = chunksOf gn values
3 where values = randGenes’ (cn * gn) g
This function requires the use of the function chunksOf
in the Data.List.Split
library:
1import Data.List.Split (chunksOf)
That is, we first generate a large list of cn
gn
genes, and then we split that list into a list of sublists, each of length gn
.
1.9.5 The randPop
function
The randPop
function is just a special case of the randChroms
function that generates a list of cn =
numPop
random chromosomes, each with gn = numVar
genes:
1-- Random population with numPop chromosomes with numVar genes
2randPop :: StdGen -> (Population, StdGen)
3randPop = randChroms numPop numVar
And again, we can also define a randPop’
function that does not return a StdGen
if we want to:
1-- Random population with numPop chromosomes with numVar genes,
2-- do not return StdGen
3randPop’ :: StdGen -> Population
4randPop’ = randChroms’ numPop numVar
1.9.6 The randIndex
function
Finally, the randIndex
function returns a random index of a list xs
of length n
in the closed interval
:
1-- Random index in a chromosome
2randIndex :: StdGen -> [a] -> (Int, StdGen)
3randIndex g xs = (ind, g’)
4 where (ind, g’) = randomR (0, length xs - 1) g
1.10 Cost functions
Before we proceed with implementing the core of the GA, namely operations such as selection, mating, mutation, and evolution, we need functions to evaluate the cost of chromosomes.
1.10.1 The elemCost
function
We begin with the elemCost
function, which has the type signature elemCost :: (Eq a) => a -> a -> Int
The function compares two elements or items of comparable generic type a
belonging to the
typeclass Eq
and returns a cost of zero if they are equal or a cost of one if they are unequal:
1-- Compare two inputs and return zero if equal, 1 otherwise
2elemCost a b | a == b = 0
3 | otherwise = 1
We can then use elemCost
together with zipWith
to find the accumulated cost of two lists of
elements, e.g., a gene or a string.
Some examples of usage in ghci
include:
1*ContinuousGA> elemCost 5 13
21
3*ContinuousGA> elemCost 1 1
40
5*ContinuousGA> elemCost ’a’ ’b’
61
7*ContinuousGA> elemCost ’a’ ’c’
81
9*ContinuousGA> elemCost ’a’ ’a’
100
1.10.2 The stringCost
function
The stringCost
function is implemented by zipping together two lists of characters (two strings) with
the elemCost
function and summing the result:
1-- Sum the number of unequal chars, char by char, of two strings
2stringCost :: String -> String -> Int
3stringCost s1 s2 = sum $ zipWith elemCost s1 s2
Example usage in ghci
:
1*ContinuousGA> let s1 = ~ac/dc~
2*ContinuousGA> let s2 = ~ac*dc~
3*ContinuousGA> stringCost s1 s2
41
1.10.3 The chromCost
function
The chromCost
function uses the stringCost
function to compare a target string with the string
decoded from a chromosome:
1-- Sum the number of incorrect characters in a chromosome
2chromCost :: String -> Chromosome -> Int
3chromCost target c = stringCost target (decodeChromosome c)
Example usage in ghci
:
1*ContinuousGA> let c = encodeString s2
2*ContinuousGA> chromCost s1 c
31
1.10.4 The chromCostPair
function
As we shall see later, it is convenient to store the evaluated cost of a chromosome together with the chromosome itself in a tuple. For example, if we have a population of such tuples, we can sort it in increasing order of cost (ranking), which is necessary for selection, where we typically want to select better chromosomes before worse ones.
We therefore define a new type
1type ChromCost = (Int, Chromosome)
Next, we define a chromCostPair
function to construct such tuples:
1chromCostPair :: String -> Chromosome -> ChromCost
2chromCostPair target c = (chromCost target c, c)
Note that we use the target string representation as an input to the function instead of its
chromosome representation that we used when we defined the function for the binary
GA.
Using the variables defined above, we can test the function in ghci
:
1*ContinuousGA> chromCostPair s1 c
2(1,[0.0,4.25531914893617e-2,0.9574468085106383,6.382978723404255e-2,4.25531914893617e-2])
The first part of the tuple is the cost of chromosome c
, that is, the number of characters in the
decoded string that c
represents that differ from the target string s1
. The second part of the tuple is the
chromosome c
itself.
1.11 Population functions
We now turn our attention to some functions dealing with an entire population of chromosomes. The
functions below are used to evaluate the cost of each chromosome in a population (evalPop
); to sort a
population in increasing order of cost (sortPop
); to select chromosomes apart from elite
chromosomes to keep for next generation and to use for mating (selection
); get a list of chromosomes
to be used as parents for mating (getParents
); and to convert a list of (cost, chromosome) pairs,
[ChromCost]
, to type Population
.
1.11.1 The evalPop
function
The evalPop
function evaluates all of the chromosomes in a population by comparing the decoded
chromosomes (strings) with a target string and returns list of ChromCost
. This is accomplished by
mapping the partial function chromCostPair target
over the population. This is an example of partial
function application, where we call a function with too few parameters and get back a
partially applied function, that is, a function that takes as many parameters as we left out.
1evalPop :: String -> Population -> [ChromCost]
2evalPop target = map (chromCostPair target)
Example usage in ghci
:
1*ContinuousGA> let target = ~ac/dc~
2*ContinuousGA> let s1 = ~ac*dc~
3*ContinuousGA> let s2 = ~hello~
4*ContinuousGA> let s3 = ~a1234~
5*ContinuousGA> let s4 = ~ac/dc~
6*ContinuousGA> let pop = encodeStringList [s1,s2,s3,s4]
7*ContinuousGA> let popEvaluated = evalPop target pop
8*ContinuousGA> popEvaluated
9[(1,[0.0,4.25531914893617e-2, ...]),
10 (5,[0.14893617021276595,8.51063829787234e-2, ...]),
11 (4,[0.0,0.574468085106383,0.5957446808510638, ...]),
12 (0,[0.0,4.25531914893617e-2,0.9787234042553191, ...])]
1.11.2 The sortPop
function
To sort a population, we can use the sort
function from the Data.List
library. We therefore add sort
to
our import statement:
1import Data.List (elemIndex, sort)
The sortPop
function is then implemented simply as
1sortPop :: [ChromCost] -> [ChromCost]
2sortPop pop = sort pop
Note that this works because when given a list of tuples (ChromCost
is a (cost, chromosome) pair, or
tuple), sort
sorts the list by the first element of the tuples, namely the cost.
Of course, we could have used sort
directly, however, implementing sortPop
ensures correct types so
it adds some safety to our code.
Example usage in ghci
using the evaluated population above:
1*ContinuousGA> sortPop popEvaluated
2[(0,[0.0,4.25531914893617e-2, ... ]),
3 (1,[0.0,4.25531914893617e-2,0.9574468085106383, ...]),
4 (4,[0.0,0.574468085106383,0.5957446808510638, ...]),
5 (5,[0.14893617021276595,8.51063829787234e-2, ...])]
1.11.3 The selection
function
In addition to numElite
elite chromosomes, a number of other chromosomes must be kept in the
population for the next generation and also serve the role as parents for mating. There are several
ways to select such chromosomes, including roulette wheel selection, tournament selection, random
selection, etc. Here, we just take a fraction xRate
(called the selection rate) of a sorted
(ranked) population. The total number of elite chromosomes plus selected chromosomes
should equal numKeep
. A possible implementation of a selection
function is given below:
1selection :: [ChromCost] -> [ChromCost]
2selection pop = take (numKeep - numElite) $ drop numElite pop
1.11.4 The getParents
function
The getParents
function returns the chromosomes to keep in the population for the next generation
and to serve as parents. These chromosomes consists of the numElite
best chromosomes in the
population plus some chromosomes that are selected using the selection
function above.
1getParents :: [ChromCost] -> [ChromCost]
2getParents pop = (take numElite pop) ++ selection pop
1.11.5 The toPopulation
function
For some of the genetic operations dealing with the population, it may be convenient not to have to
handle a list of (cost, chromosome) pairs but just a list of chromosomes. We therefore define
a function toPopulation
to perform a conversion of a list of ChromCost
to Population
:
1toPopulation :: [ChromCost] -> Population
2toPopulation [] = []
3toPopulation ((cost,chrom) : pop) = chrom : toPopulation pop
1.11.6 Testing the population functions
We can test the population functions in ghci
. First, we set the GA parameters in the module that will
affect the functions:
1numPop = 8 :: Int -- Population size (number of chromosomes)
2xRate = 0.5 :: Double -- Selection rate
3numElite = 2 :: Int -- Number of elite chromosomes
That is, we will play with a population of numPop == 8
chromosomes, keep numKeep == 4
chromosomes, of which numElite == 2
shall be elite chromosomes.
In ghci
, we do the following:
1*ContinuousGA> let stringList = [~cde~,~klm~,~123~,~*/
2~,~bbb~,~abb~,~aab~,~aah~]
3*ContinuousGA> let targetString = ~aah~
4*ContinuousGA> let pop = encodeStringList stringList
5*ContinuousGA> let sortedPop = sortPop $ evalPop targetString pop
6*ContinuousGA> sortedPop
7[(0,[0.0,0.0,0.14893617021276595]),
8 (1,[0.0,0.0,2.127659574468085e-2]),
9 (2,[0.0,2.127659574468085e-2,2.127659574468085e-2]),
10 (3,[2.127659574468085e-2,2.127659574468085e-2,2.127659574468085e-2]),
11 (3,[4.25531914893617e-2,6.382978723404255e-2,8.51063829787234e-2]),
12 (3,[0.2127659574468085,0.23404255319148937,0.2553191489361702]),
13 (3,[0.574468085106383,0.5957446808510638,0.6170212765957447]),
14 (3,[0.9574468085106383,0.9787234042553191,1.0])]
15*ContinuousGA> let selectedChroms = selection sortedPop
16*ContinuousGA> selectedChroms
17[(2,[0.0,2.127659574468085e-2,2.127659574468085e-2]),
18 (3,[2.127659574468085e-2,2.127659574468085e-2,2.127659574468085e-2])]
That is, the selectedChroms
are the third and fourth chromosome in the sorted population, because
the selection
function ignores the first numElite == 2
chromosomes and takes the next two
chromosomes so that the total is equal to numKeep == 4
.
Next, we can use the getParents
function to complete the selection process, leaving us with the top
four chromosomes:
1*ContinuousGA> let parents = getParents sortedPop
2*ContinuousGA> parents
3[(0,[0.0,0.0,0.14893617021276595]),
4 (1,[0.0,0.0,2.127659574468085e-2]),
5 (2,[0.0,2.127659574468085e-2,2.127659574468085e-2]),
6 (3,[2.127659574468085e-2,2.127659574468085e-2,2.127659574468085e-2])]
If we are curious which strings these chromosomes correspond to, we can use the function
toPopulation
to convert from a [ChromCost
] list to a Population
, and then decode the population:
1*ContinuousGA> let parentsAsPop = toPopulation parents
2*ContinuousGA> decodePopulation parentsAsPop
3[~aah~,~aab~,~abb~,~bbb~]
1.12 Mating functions
We are finally ready to begin implementing the core components of the GA, namely mating,
mutation, and evolution. We begin with two function required for mating, the single point crossover
function and the matePairwise
function.
1.12.1 The crossover
function
The function crossover
is used for mating with single point crossover:
1crossover :: Int -> Chromosome -> Chromosome -> Population
2crossover cp ma pa = [take cp ma ++ drop cp pa, take cp pa ++ drop cp ma]
Given a crossover point cp
, a mother chromosome ma
, and a father chromosome pa
, it returns two
offspring, where one consists of the genes in ma
and pa
before and after cp
, respectively, and the
other consists of the remaining genes from ma
and pa
.
Example usage in ghci
:
1*ContinuousGA> let s1 = ~abcd~
2*ContinuousGA> let s2 = ~efgh~
3*ContinuousGA> let c1 = encodeString s1
4*ContinuousGA> let c2 = encodeString s2
5*ContinuousGA> c1
6[0.0,2.127659574468085e-2,4.25531914893617e-2,6.382978723404255e-2]
7*ContinuousGA> c2
8[8.51063829787234e-2,0.10638297872340426,0.1276595744680851,0.14893617021276595]
9*ContinuousGA> crossover 0 c1 c2
10[[8.51063829787234e-2,0.10638297872340426,0.1276595744680851,0.14893617021276595],
11 [0.0,2.127659574468085e-2,4.25531914893617e-2,6.382978723404255e-2]]
12*ContinuousGA> crossover 1 c1 c2
13[[0.0,0.10638297872340426,0.1276595744680851,0.14893617021276595],
14 [8.51063829787234e-2,2.127659574468085e-2,4.25531914893617e-2,6.382978723404255e-2]]
15*ContinuousGA> crossover 2 c1 c2
16[[0.0,2.127659574468085e-2,0.1276595744680851,0.14893617021276595],
17 [8.51063829787234e-2,0.10638297872340426,4.25531914893617e-2,6.382978723404255e-2]]
18*ContinuousGA> crossover 3 c1 c2
19[[0.0,2.127659574468085e-2,4.25531914893617e-2,0.14893617021276595],
20 [8.51063829787234e-2,0.10638297872340426,0.1276595744680851,6.382978723404255e-2]]
21*ContinuousGA> crossover 4 c1 c2
22[[0.0,2.127659574468085e-2,4.25531914893617e-2,6.382978723404255e-2],
23 [8.51063829787234e-2,0.10638297872340426,0.1276595744680851,0.14893617021276595]]
We observe that for cp == 0
, we just clone the parents, as also happens for cp
greater than 4.
Otherwise, for cp
equal to 1, 2, or 3, the offspring is created by a crossover point after gene number 1,
2, or 3, respectively.
1.12.2 The matePairwise
function
There are many ways to choose which chromosomes should mate to create offspring,
e.g., we could randomly draw a mother and a father chromosome from a subpopulation
consisting of the numKeep
best chromosomes in a population. Here, we simply use pairwise
mating in a recursive function called matePairwise
, where chromosomes 1 and 2 mate,
chromosomes 3 and 4 mate, and so forth. The resulting offspring is returned as a Population
together with a StdGen
, both in a tuple. The function requires several standard PRNGs. For
this, we add the split
function that comes with the System.Random
library in our import:
1import System.Random (StdGen, randomR, split, mkStdGen)
The implementation of matePairwise
is given below:
1matePairwise :: StdGen -> Population -> (Population, StdGen)
2matePairwise g [] = ([], g)
3matePairwise g [ma] = ([ma], g)
4matePairwise g (ma:pa:cs) = (offspring ++ fst (matePairwise g’ cs), g’’)
5 where (g’, g’’) = split g
6 (g’’’, _) = split g’’
7 cp = fst $ randIndex g’’’ ma
8 offspring = crossover cp ma pa
The two base cases say that an empty population should just return the empty list, and if the
population only have a single chromosome, we should just clone it. The general case
generates a random crossover point cp
using the randIndex
function and then calls the
single point crossover
function with cp
and the first two chromosomes in the population
to create two offspring. It then recursively repeats the process on the remainder of the
population.
Example usage in ghci
:
1*ContinuousGA> let stringList =
2[~abcd~,~efgh~,~ijkl~,~mnop~,~qrst~,~uvwx~,~yz12~,~3456~]
3*ContinuousGA> let pop = encodeStringList stringList
4*ContinuousGA> pop
5[[0.0,2.127659574468085e-2,4.25531914893617e-2,6.382978723404255e-2],
6 [8.51063829787234e-2,0.10638297872340426,0.1276595744680851,0.14893617021276595],
7 [0.1702127659574468,0.19148936170212766,0.2127659574468085,0.23404255319148937],
8 [0.2553191489361702,0.2765957446808511,0.2978723404255319,0.3191489361702128],
9 [0.3404255319148936,0.3617021276595745,0.3829787234042553,0.40425531914893614],
10 [0.425531914893617,0.44680851063829785,0.46808510638297873,0.48936170212765956],
11 [0.5106382978723404,0.5319148936170213,0.574468085106383,0.5957446808510638],
12 [0.6170212765957447,0.6382978723404256,0.6595744680851063,0.6808510638297872]]
13*ContinuousGA> let (newPop, g’) = matePairwise (mkStdGen 99) pop
14*ContinuousGA> newPop
15[[0.0,2.127659574468085e-2,4.25531914893617e-2,0.14893617021276595],
16 [8.51063829787234e-2,0.10638297872340426,0.1276595744680851,6.382978723404255e-2],
17 [0.2553191489361702,0.2765957446808511,0.2978723404255319,0.3191489361702128],
18 [0.1702127659574468,0.19148936170212766,0.2127659574468085,0.23404255319148937],
19 [0.3404255319148936,0.44680851063829785,0.46808510638297873,0.48936170212765956],
20 [0.425531914893617,0.3617021276595745,0.3829787234042553,0.40425531914893614],
21 [0.5106382978723404,0.5319148936170213,0.6595744680851063,0.6808510638297872],
22 [0.6170212765957447,0.6382978723404256,0.574468085106383,0.5957446808510638]]
23*ContinuousGA> decodePopulation newPop
24[~abch~,~efgd~,~mnop~,~ijkl~,~qvwx~,~urst~,~yz56~,~3412~
We observe that the crossover point for chromosomes 1 and 2 in pop
was after the third
gene; for chromosomes 3 and 4 it was after the fourth gene (cloning); for chromosomes 5
and 6 it was after the first gene; and for chromosomes 7 and 8 it was after the second
gene.
Note that if we do not want to allow cloning, we could limit the randIndex
function to always return a
random index smaller than the lenght of the chromosome.
1.13 Mutation functions
Mutation involves changing a gene value to a new random value limited to
and
. The
fraction (mutation rate) of chromosomes in the population that should mutate is called mutRate
, and
corresponds to the integer numMut
. For example, for a population size of numPop == 100
and a
mutation rate of mutRate == 0.1
, the number of chromosomes to mutate would be numMut ==
10
.
The necessary mutation functions are given below.
1.13.1 The replaceAtIndex
function
If we have a gene variable, or a list of genes (a chromosome), we cannot just change (overwrite) it as
we likely would do in an imperative language, since data variables in a purely functional language
like Haskell are immutable (they cannot change once defined). Instead, we must copy the data we
need and put it together in a new data structure or variable. We therefore implement a helper function
that can replace an item in a list at a particular index, namely the replaceAtIndex
function below:
1replaceAtIndex :: Int -> a -> [a] -> [a]
2replaceAtIndex n item ls = as ++ (item:bs) where (as, (b:bs)) = splitAt n ls
This function uses the splitAt
function available in Prelude (so no need to import it) to split the list ls
at the position n
into two sublists as
and (b:bs)
, where as
are the first n
elements in ls
, and (b:bs)
are
the remainding elements. The element b
is then replaced with item
and the pieces are put together
again using the ++
function.
Example usage in ghci
:
1*ContinuousGA> replaceAtIndex 5 99 [0,1,2,3,4,5,6,7,8,9]
2[0,1,2,3,4,99,6,7,8,9]
1.13.2 The mutateChrom
function
The mutateChrom
function mutates a randomly selected gene among its list of genes:
1mutateChrom :: StdGen -> Chromosome -> (Chromosome, StdGen)
2mutateChrom g chrom = (mutChrom, g2’)
3 where (g1, g2) = split g
4 (nGene, g1’) = randIndex g1 chrom
5 (mutGene, g2’) = randGene g2
6 mutChrom = replaceAtIndex nGene mutGene chrom
An index nGene
in the chromosome chrom
is picked randomly. A new mutated gene mutGene
at this index is then mutated using the randGene
function. Finally, a new chromosome
mutChrom
that is identical to chrom
except that the gene at index nGene
has been mutated is
returned.
Example usage in ghci
:
1*ContinuousGA> let s = ~hello world!~
2*ContinuousGA> let c = encodeString s
3*ContinuousGA> let (cMut,g’) = mutateChrom (mkStdGen 99) c
4*ContinuousGA> decodeChromosome cMut
5~hello worldl~
6*ContinuousGA> let (cMut,g’) = mutateChrom (mkStdGen 98) c
7*ContinuousGA> decodeChromosome cMut
8~hello/world!~
Using a PRNG obtained from mkdStdGen 99
, the last gene is mutated, that is, a randomly generated
gene corresponding to the character ’l’ replaces the letter ’!’, whereas using a PRNG obtained from
mkdStdGen 98
, the sixth gene is mutated, that is, a randomly generated gene corresponding to the
character ’/’ replaces the space character ’ ’.
1.13.3 The mutateChromInPop
function
The function mutateChromInPop
mutates a chromosome at index n
in population pop
:
1mutateChromInPop :: StdGen -> Int -> Population -> (Population, StdGen)
2mutateChromInPop g n pop = (replaceAtIndex n mutChrom pop, g’)
3 where (g’, g’’) = split g
4 (mutChrom, _) = mutateChrom g’’ (pop!!n)
1.13.4 The mutIndices
and mutatePop
functions
Finally, we need a function called mutatePop
that given a list of indices, mutates all its chromosomes
at those indices.
To randomly generate a list of indices, we create a function mutIndices
:
1mutIndices :: Population -> StdGen -> [Int]
2mutIndices pop g = take numMut $ randomRs (numElite, length pop - 1) g
The function makes use of the randomRs
function from the System.Random
library, hence we add it
to our import statement:
1import System.Random (StdGen, randomR, randomRs, split, mkStdGen)
The randomRs
function produces an infinite list of random indices limited to lower and upper
bounds given by it first argument. Here, the lower bound is numElite
, because we do not
want to mutate any of the elite chromosomes, and the upper bound is the index of the last
chromosome in the population. Finally, we take only the first numMut
indices from the infinite
list.
Now that we have a a function to generate a list of random indices for the chromosomes that shall be
mutated, we can implement a recursive mutatePop
function:
1mutatePop :: StdGen -> [Int] -> Population -> (Population, StdGen)
2mutatePop g _ [] = ([], g)
3mutatePop g [] pop = (pop, g)
4mutatePop g (n:ns) pop = mutatePop g’ ns pop’
5 where (pop’, g’) = mutateChromInPop g
The first base case says to do nothing if the population is empty or the list of indices is empty. Given
a list (n:ns)
of indices, the recursive case uses the mutateChromInPop
function defined above to
mutate the chromosome at index n
in the population before it recursively continues with
the remaining ns
indices. A list of random indices can be provided by the mutIndices
function.
Example usage in ghci
:
1*ContinuousGA> decodePopulation pop
2[~abcd~,~efgh~,~ijkl~,~mnop~,~qrst~,~uvwx~,~yz12~,~3456~]
3*ContinuousGA> numMut
44
5*ContinuousGA> let mutIdx = mutIndices pop (mkStdGen 99)
6*ContinuousGA> mutIdx
7[7,5,6,3]
8*ContinuousGA> let (mutPop, g’) = mutatePop (mkStdGen 99) mutIdx pop
9*ContinuousGA> decodePopulation mutPop
10[~abcd~,~efgh~,~ijkl~,~miop~,~qrst~,~u5wx~,~yz1c~,~o456~]
Because numMut == 4
, four random chromosomes in pop
are mutated. The indices of these four
chromosomes is determined by calling the mutIndices
function, which returns mutIdx
= [7,5,6,3]
. By comparing the mutated population mutPop
with the original population
pop
, we observe that for the chromosome at index 3, the second gene was mutated; for
the chromosome at index 5, the second gene was mutated; for the chromosome at index
6, the fourth gene was mutated; and for the chromosome at index 7, the first gene was
mutated.
1.14 Evolution functions
Our GA is almost finished but the most important step is left: evolution. We need two functions,
evolvePopOnce
and evolvePop
, to complete the GA.
1.14.1 The evolvePopOnce
function
The evolvePopOnce
function evolves a population from one generation to the next:
1evolvePopOnce :: StdGen -> Population -> (Population, StdGen)
2evolvePopOnce g pop = (newPopMutated, g4)
3 where (g’, g’’) = split g
4 ePop = evalPop target pop
5 sPop = sortPop ePop
6 parents = toPopulation $ getParents sPop
7 (offspring, g3) = matePairwise g’ parents
8 newPop = parents ++ offspring
9 mutIdx = mutIndices newPop g3
10 (newPopMutated, g4) = mutatePop g’’ mutIdx newPop
Its input is a population pop
and the output is a new population newPopMutated
that has been
constructed through genetic operations. Each of the chromosomes in pop
is evaluated with respect to
the target string target
(hardcoded at the top of the module) and sorted according to their associated
cost. Parents to be kept for the next generation and for generating offspring are selected using
the getParents
function, and to convert from a list of (cost, chromosome) pairs, we use
the toParents
function. The result is assigned to parents
. The function matePairwise
then
create offspring
using single point crossover on the chromosomes in parents
. The parents
and the offspring
collectively become the new population newPop
. Finally, a number
numMut
of the chromosomes in newPop
are mutated and the results is the next generation
newPopMutated
.
Here is an example usage in ghci
for a target=s2
string, where s2="abc"
, and the following GA
settings:
1numPop = 8 :: Int -- Population size (number of chromosomes)
2xRate = 0.5 :: Double -- Selection rate
3mutRate = 0.5 :: Double -- Mutation rate
4numElite = 2 :: Int -- Number of elite chromosomes
1*ContinuousGA> let stringList =
2[~xxx~,~def~,~ghj~,~klm~,~nop~,~qrs~,~tuv~,~wxy~]
3*ContinuousGA> let pop = encodeStringList stringList
4*ContinuousGA> let (popEvolvedOnce,g’) = evolvePopOnce (mkStdGen 99) pop
5*ContinuousGA> decodePopulation popEvolvedOnce
6[~def~,~ghj~,~wum~,~nwp~,~9hj~,~gef~,~kop~,~nlm~]
1.14.2 The evolvePop
function
Finally, we create the evolvePop
function. This function evolves a population n
times recursively,
making use of the evolvePopOnce
function just described.
1evolvePop :: StdGen -> Int -> Population -> (Population, StdGen)
2evolvePop g 0 pop = (newpop, g)
3 where newpop = toPopulation $ sortPop $ evalPop target pop
4evolvePop g n pop = evolvePop g’ (n-1) newPop
5 where (newPop, g’) = evolvePopOnce g pop
Example usage in ghci
:
1*ContinuousGA> let (popEvolvedNTimes,g’) = evolvePop (mkStdGen 99) 10 pop
2*ContinuousGA> decodePopulation popEvolvedNTimes
3[~dcf~,~dcf~,~dcf~,~dcf~,~dc1~,~dc1~,~x_1~,~5cf~]
4*ContinuousGA> let (popEvolvedNTimes,g’) = evolvePop (mkStdGen 99) 20 pop
5*ContinuousGA> decodePopulation popEvolvedNTimes
6[~dcf~,~dcf~,~dcf~,~dcf~,~dcf~,~d6z~,~d+f~,~ucf~]
7*ContinuousGA> let (popEvolvedNTimes,g’) = evolvePop (mkStdGen 99) 100 pop
8*ContinuousGA> decodePopulation popEvolvedNTimes
9[~cbc~,~cbc~,~cbc~,~cbc~,~cbc~,~cbc~,~cb:~,~g2c~]
10*ContinuousGA> let (popEvolvedNTimes,g’) = evolvePop (mkStdGen 99) 200 pop
11*ContinuousGA> decodePopulation popEvolvedNTimes
12[~abc~,~abc~,~abc~,~abc~,~ab.~,~apc~,~9bc~,~.bc~]
We observe that neither 10, 20, nor 100 generations were sufficient to find (learn) the target string,
whereas for 200 generations, the population evolved and the best chromosome was identical to the
target, decoded as "abc"
. The binary GA studied in the previous tutorial, only needed 20 generations
for the same GA settings.
1.15 Final remarks
Congratulations! You should now have a working GA contained in your ContinuousGA
module.
Even if you are generous with comments and line shifts (as I am), your module should be less than
300 lines.
What remains is to test the GA. We will investigate this in the next section.