Functional Programming and Intelligent Algorithms

# Tutorial: Binary GA

## Week 12: The binary GA

### 2 Tutorial: Binary GA

To get you started, we provide a step-by-step tutorial to the problem of implementing a binary GA for string learning.

The purpose of this GA is to discover, or learn, a target string. To learn the target string, the GA will contain a population of candidate solutions (chromosomes) that represent strings. Each chromosome is a guess and can be evaluated and assigned a cost, where the cost should relate to the difference between the string it represents and the target string.

The GA can then evolve the population over several generations until one or more of the individuals in the population become the target string. Such individuals will have a cost of zero.

Note that this example is adapted from (and is slightly easier than) the song learning example in Chapter 4 of Haupt & Haupt, where a GA is used to learn the song “Mary had a little lamb.”

We begin by creating a module BinaryGA.hs for our code:

1module BinaryGA where 

#### 2.1 GA settings

Before we proceed, it is useful to add some GA parameter settings to the top of our file so that they can easily be found and tweaked when we will test the GA later. We adopt the terminology used in class and in Haupt & Haupt and define the following parameters with some default settings:

1numPop = 40   :: Int    -- Population size (number of chromosomes) 2xRate = 0.5   :: Double -- Selection rate 3mutRate = 0.1 :: Double -- Mutation rate 4numElite = 1  :: Int    -- Number of elite chromosomes 5itMax = 1000  :: Int    -- Max number of iterations 

#### 2.2 Alphabet

The alphabet is the set of possible symbols, or characters, that can constitute a string. We store the alphabet as a string with 48 characters

1alphabet’ = ~abcdefghijklmnopqrstuvwxyz0123456789.,;:?!_+-*/ ~ :: String Please note that the final character ~~ in alphabet’ is the space character and must not be removed. We observe that to encode 48 characters, we need 6 bits, since 5 bits would only encode ${2}^{5}=32$ characters, whereas 6 bits can encode ${2}^{6}=64$ characters.

To avoid complications in the operations of our GA, we pad our alphabet with 16 more characters, e.g., we just add the letter ’a’ 16 times:

1alphabet = alphabet’ ++ ~aaaaaaaaaaaaaaaa~ :: String -- need 64 characters 

#### 2.3 Target strings

For testing and debugging purposes, we define a set of test strings:

1s1, s2, s3, s4, s5, s6 :: String 2s1 = ~h~ 3s2 = ~abc~ 4s3 = ~hello world!~ 5s4 = ~abc123.,;~ 6s5 = ~descartes: cogito ergo sum~ 7s6 = ~2+2 is: 4, 2*2 is: 4; why is _/not/_ 2.2*2.2 equal to 4.4?!~ 8target = s2 

#### 2.4 Bits, genes, chromosomes, and the population

In the following, we will assume that a gene consists of a list of bits and encodes a single character, and that a chromosome is a list of genes that encodes a string.

For generality and to make our code compatible with other alphabets than the one given here, we should explicitly calculate the number of bits, numGene, required for each gene to encode all the characters in the alphabet. For this, we can used the log2 function that comes with the Numeric.SpecFunctions library:

1import Numeric.SpecFunctions (log2) -- may require cabal install math-functions Note that you may have to run cabal install math-functions to install this library.

Unfortunately, log2 rounds downwards (floor) whereas we require rounding upwards (ceiling)! Simply adding 1 to the result seems like an intuitive solution but would fail when the length of alphabet is exactly a power of 2 (that is, 2, 4, 8, 16, 32, 64, 128, etc.). For those cases, log2 would return the correct answer and adding 1 would be one too many.

One possible solution is to use the fact that $⌈{log}_{2}n⌉=⌊{log}_{2}\left(2n-1\right)⌋$. We can then determine the required number of genes in our chromosomes like this:

1numGene = log2 $2*(length alphabet) - 1 :: Int If we load our module into ghci, we can verify that for the 64-character alphabet above, numGene = 6 bits. Moreover, the number of characters in the target string is sometimes called the number of (encoding) variables, numVar: 1numVar = length target :: Int -- Number of characters (genes) in target This number tells us how many genes are required in each chromosome. If we are interested in the total number of bits in a chromosome, numBits, we simply multiply the number of bits in each gene numGene with the total number of variables numVar: 1numBits = numGene * numVar :: Int -- Number of bits required in chromosomes  Next, we need to know the number of chromosomes numKeep to keep between generations. This number is a fraction (the selection rate xRate) of the population size numPop and should be rounded up to the nearest even number to simplify the mating procedure: 1numKeep | numKeep’’ > numPop = numPop - numPop ‘mod‘ 2 2 | numKeep’’ < 2 = 2 3 | otherwise = numKeep’’ 4 where numKeep’’ = numKeep’ + numKeep’ ‘mod‘ 2 5 numKeep’ = ceiling$ xRate * fromIntegral numPop :: Int For example, for numPop = 40 and xRate = 0.5, we get numKeep = 20. Some cases are introduced for safety to ensure numKeep is always greater than or equal to 2, and always smaller than or equal to numPop.

We also need to determine the number of chromosomes to mutate, numMut, which is a fraction (the mutation rate mutRate) of the population size, and must be rounded up to nearest integer:

1numMut = ceiling $mutRate * fromIntegral numPop :: Int For numPop = 40 and mutRate = 0.1, we get numMut = 4. Finally, we define some type aliases for readability and debugging purposes: 1type Bit = Int 2type Gene = [Bit] 3type Chromosome = [Gene] 4type Population = [Chromosome] First of all, note that we possibly could have used a binary type (e.g., true or false such as the type Bool, or a library such as BitString) for the Bit and Gene, however, this may have complicated other parts of our code. Exercise: Experiment with different alphabets and different values for numPop, xRate, and mutRate and verify that numGene, numKeep, and numMut are calculated correctly by testing in the ghci. #### 2.5 Encoding functions We need a function encodeString to encode a string as a Chromosome. This means that if we have another function encodeChar that can encode a single character as a Gene, we can just map that function over a string, which is a list of characters, in order to obtain the encoded Chromosome. In addition, it will likely prove useful to have a function encodeStringList that can encode a list of strings as a Population. ##### 2.5.1 Encoding/decoding scheme First, we need to decide on an encoding/decoding scheme to convert between characters in alphabet and binary and vice versa. One possible scheme is given in the Table 1.  binary decimal character 000000 0 a 000001 1 b 000010 2 c ⋮ ⋮ ⋮ 101110 46 / 101111 47 ␣ 110000 48 a 110001 49 a ⋮ ⋮ ⋮ 111111 63 a Table 1: Encoding/decoding scheme. ##### 2.5.2 The encodeChar function Let us begin with defining the encodechar function: 1encodeChar :: Char -> Gene 2encodeChar = map digitToInt . encodeChar’  Given a character c, the function should return a list of numGene = 6 bits, that is, a gene. In the encoding/decoding scheme in Table 1, each character in alphabet is encoded as the binary number of length numGene that corresponds to its index (position in alphabet). Therefore, the letter ’a’ is at index 0 and is encoded as 000000, the letter ’b’ is at index 1 and is encoded as 000001, and so forth. A convenient function to determine the index of an element in a list is elemIndex, which is part of the Data.List library: 1import Data.List (elemIndex)  Its type signature is given by elemIndex :: Eq a => a -> [a] -> Maybe Int. Therefore, we will need to “strip” the Just from the return value but for now, let us just assume that we have obtained an integer that corresponds to the index of a character in alphabet. This integer must then be converted to a 6-bit binary number. For this, we can use the printf function from the Text.Printf library: 1import Text.Printf (printf)  The hackage documentation for printf shows that we can use the flags %06b to format a decimal number as binary number of length 6 and padded with zeros if necessary to obtain the correct length. This binary number will be returned as a string. Putting it together, we then have a function encodeChar’ that can encode a character in alphabet as a binary string of length numGene = 6: 1encodeChar’ :: Char -> String 2encodeChar’ c = printf ~%06b~$ stripMaybe index 3    where index = elemIndex c alphabet 4          stripMaybe (Just index) = index 5          stripMaybe (Nothing) = error $~elemIndex returned Nothing. ~ ++ 6 ~A character is not in the alphabet!~ The function defines a helper function stripMaybe to remove the Just from the index. We can test this function in the interpreter: 1*BinaryGA> encodeChar’ ’a’ 2~000000~ 3*BinaryGA> encodeChar’ ’b’ 4~000001~ 5*BinaryGA> encodeChar’ ’/’ 6~101110~  The final thing to do is to convert binary strings like these to our Gene type. For this, we can use the digitToInt function readily available from the Data.Char library: 1import Data.Char (digitToInt) The digitToInt function converts a single digit Char to the corresponding Int. Thus, our encodeChar function becomes 1encodeChar :: Char -> Gene 2encodeChar = map digitToInt . encodeChar’ Again, we should test the function in ghci: 1*BinaryGA> encodeChar ’a’ 2[0,0,0,0,0,0] 3*BinaryGA> encodeChar ’b’ 4[0,0,0,0,0,1] 5*BinaryGA> encodeChar ’/’ 6[1,0,1,1,1,0]  ##### 2.5.3 The encodeString function It was hard work to define the encodeChar function above! Luckily, the hard work pays off when defining the encodeString function: 1encodeString :: String -> Chromosome 2encodeString = map encodeChar That’s it! We just map encodechar over a list of characters (a string) to obtain a list of encoded genes, or a Chromosome. We can test the function in ghci: 1*BinaryGA> encodeString ~abc~ 2[[0,0,0,0,0,0],[0,0,0,0,0,1],[0,0,0,0,1,0]] 3*BinaryGA> encodeString ~*/ ~ 4[[1,0,1,1,0,1],[1,0,1,1,1,0],[1,0,1,1,1,1]]  ##### 2.5.4 The encodeStringList function The encodeStringList function is just as simple: 1encodeStringList :: [String] -> Population 2encodeStringList = map encodeString The output in ghci for a list of two test strings yields 1*BinaryGA> encodeStringList [~ab~,~*/~] 2[[[0,0,0,0,0,0],[0,0,0,0,0,1]],[[1,0,1,1,0,1],[1,0,1,1,1,0]]]  Exercise: What happens if one of the encoding functions above encounter a character not in alphabet? #### 2.6 Decoding functions In addition to encoding functions, we also need decoding functions able to convert back from binary to characters in alphabet. ##### 2.6.1 The bitsToInt function Given a list of bits, we can first convert this to a decimal that corresponds to the index in alphabet, and then access the element in alphabet at that particular index. Thus, we first create a straightforward recursive bitsToInt function: 1bitsToInt :: [Int] -> Int 2bitsToInt [] = 0 3bitsToInt (b:bs) = 2^n*b + bitsToInt bs where n = length (b:bs) - 1  Example usage in ghci: 1*BinaryGA> bitsToInt [1,0,1] 25 3*BinaryGA> bitsToInt [0,0,0,1,0,1] 45  Exercise: Make sure you understand how bitsToInt works. If necessary, step through the recursive function steps on a piece of paper for some example binary numbers. What is the result of bitsToInt [1,2,3,4]? ##### 2.6.2 The decodeGene function Next, we define a decodeGene function that converts a Gene to an index (a decimal number) using (bitsToInt g) and then accesses the character at the index position in alphabet using the !! function: 1decodeGene :: Gene -> Char 2decodeGene g = alphabet!!(bitsToInt g)  Example usage in ghci: 1*BinaryGA> decodeGene [0,0,0,0,0,0] 2’a’ 3*BinaryGA> decodeGene [0,0,0,0,1,0] 4’c’ 5*BinaryGA> decodeGene [0,0,1,0,1,0] 6’k’ 7*BinaryGA> decodeGene [1,0,1,1,1,0] 8’/’  ##### 2.6.3 The decodeChromosome function To decode a Chromosome, we just map the decodeGene function over it to obtain the corresponding string of characters: 1decodeChromosome :: Chromosome -> String 2decodeChromosome c = map decodeGene c  Here is an example in ghci: 1*BinaryGA> let band = encodeString ~ac/dc~ 2*BinaryGA> band 3[[0,0,0,0,0,0],[0,0,0,0,1,0],[1,0,1,1,1,0],[0,0,0,0,1,1],[0,0,0,0,1,0]] 4*BinaryGA> decodeChromosome band 5~ac/dc~  ##### 2.6.4 The decodePopulation function It is also convenient to have a function able to decode an entire Population of chrosomoses to a list of decoded strings. For this, we just map the decodeChromosome function over the population list of chromosomes: 1decodePopulation :: Population -> [String] 2decodePopulation pop = map decodeChromosome pop  Example functionality in ghci: 1*BinaryGA> let pop = encodeStringList [~ac/dc~,~heavy~,~rock!~] 2*BinaryGA> pop 3[[[0,0,0,0,0,0],[0,0,0,0,1,0],[1,0,1,1,1,0],[0,0,0,0,1,1],[0,0,0,0,1,0]], 4 [[0,0,0,1,1,1],[0,0,0,1,0,0],[0,0,0,0,0,0],[0,1,0,1,0,1],[0,1,1,0,0,0]], 5 [[0,1,0,0,0,1],[0,0,1,1,1,0],[0,0,0,0,1,0],[0,0,1,0,1,0],[1,0,1,0,0,1]]] 6*BinaryGA> decodePopulation pop 7[~ac/dc~,~heavy~,~rock!~]  #### 2.7 Randomness functions Randomness is vital for a GA, e.g., to create a random initial population, a random crossover point, a random mutation, and so on. We will use two functions from the System.Random library to implement the functions we need for randomness, namely StdGen and randomR: 1import System.Random (StdGen, randomR) It will be convenient to have functions for generating lists of random bits in order to construct random genes, chromosomes, and populations; and also for generating a random index (e.g., a crossover point) in a list. Note that the reason we also return a StdGen in many or most of the functions dealing with randomness is so that we can repeatedly apply them, e.g., in a genetic evolution. In such cases, we need to pass on a StdGen for the next function calls. ##### 2.7.1 The randBits function The randBits function generates a list of random bits of length n: 1randBits :: Int -> StdGen -> ([Bit], StdGen) 2randBits 0 g = ([], g) 3randBits n g = 4 let (value, g’) = randomR (0,1) g 5 (restOfList, g’’) = randBits (n-1) g’ 6 in (value:restOfList, g’’) It uses the randomR function to generate a random integer value in the closed interval $\left[0,1\right]$ and also returns a new PRNG g’. The new PRNG is then used in a recursive call to randBits that fills up a list of random Bit until is has length n. ##### 2.7.2 The randGene function The randGene function is just a special case of the randBits function that generates a list of random bits with the same length as the required for the genes, or n == numGene: 1randGene :: StdGen -> (Gene, StdGen) 2randGene g = randBits numGene g  ##### 2.7.3 The randGenes function The randGenes function uses the randGene function to generate a list of random genes with length n: 1randGenes :: Int -> StdGen -> ([Gene], StdGen) 2randGenes 0 g = ([], g) 3randGenes n g = 4 let (value, g’) = randGene g 5 (restOfList, g’’) = randGenes (n-1) g’ 6 in (value:restOfList, g’’)  ##### 2.7.4 The randChrom function The randChrom function is just a special case of the randGenes function that generates a list of random genes with the required number of genes for a chromosome, or n == numVar: 1randChrom :: StdGen -> (Chromosome, StdGen) 2randChrom g = randGenes numVar g  ##### 2.7.5 The randChroms function The randChroms function uses the randChrom function to generate a list of random chromosomes with length n: 1randChroms :: Int -> StdGen -> (Population, StdGen) 2randChroms 0 g = ([], g) 3randChroms n g = 4 let (value, g’) = randChrom g 5 (restOfList, g’’) = randChroms (n-1) g’ 6 in (value:restOfList, g’’)  ##### 2.7.6 The randPop function The randPop function is just a special case of the randChroms function that generates a list of random chromosomes with the required number of chromosomes for a population, or n == numPop: 1randPop :: StdGen -> (Population, StdGen) 2randPop g = randChroms numPop g  ##### 2.7.7 The randIndex function The randIndex function returns a random index of a list of length n in the closed interval $\left[0,n-1\right]$: 1randIndex :: StdGen -> [a] -> (Int, StdGen) 2randIndex g xs = (ind, g’) 3 where (ind, g’) = randomR (0, length xs - 1) g  #### 2.8 Cost functions Before we proceed with implementing the core of the GA, namely operations such as selection, mating, mutation, and evolution, we need functions to evaluate the cost of chromosomes. ##### 2.8.1 The elemCost function We begin with the elemCost function, which has the type signature elemCost :: (Eq a) => a -> a -> Int The function compares two elements or items of generic type a and returns a cost of zero if they are equal or a cost of one if they are unequal: 1elemCost a b | a == b = 0 2 | otherwise = 1 We can then use elemCost together with zipWith to find the accumulated cost of two lists of elements, e.g., a gene or a string. Some examples of usage in ghci include: 1*BinaryGA> elemCost 5 13 21 3*BinaryGA> elemCost 5 5 40 5*BinaryGA> elemCost ’a’ ’b’ 61 7*BinaryGA> elemCost ’a’ ’c’ 81 9*BinaryGA> elemCost ’a’ ’a’ 100  ##### 2.8.2 The geneCost function The geneCost function is implemented by zipping together to lists of bits (genes) with the elemCost function and summing the result: 1geneCost :: Gene -> Gene -> Int 2geneCost g1 g2 = sum$ zipWith (elemCost) g1 g2 

Example usage in ghci:

1*BinaryGA> let g1 = encodeChar ’a’ 2*BinaryGA> g1 3[0,0,0,0,0,0] 4*BinaryGA> let g2 = encodeChar ’h’ 5*BinaryGA> g2 6[0,0,0,1,1,1] 7*BinaryGA> geneCost g1 g2 83 

##### 2.8.3 The stringCost and chromCost functions

The stringCost function compares two strings, character by character, and sums the number of unequal characters. Strictly speaking, it is not needed for the GA to work, but firstly, it demonstrates how we can use the generic elemCost function on different types of lists, and secondly, it may prove convenient, for example in a debugging phase.

The stringCost function is given by

1stringCost :: String -> String -> Int 2stringCost s1 s2 = sum $zipWith (elemCost) s1 s2  The chromCost function compares two chromosomes, gene by gene and bit by bit, and sums the total number of unequal bits. Again, we make use of zipWith, this time zipping with our ready-made geneCost function: 1chromCost :: Chromosome -> Chromosome -> Int 2chromCost c1 c2 = sum$ zipWith (geneCost) c1 c2 

Example usage in ghci:

1*BinaryGA> let s1 = ~ac/dc~ 2*BinaryGA> let s2 = ~ac*ac~ 3*BinaryGA> let c1 = encodeString s1 4*BinaryGA> let c2 = encodeString s2 5*BinaryGA> c1 6[[0,0,0,0,0,0],[0,0,0,0,1,0],[1,0,1,1,1,0],[0,0,0,0,1,1],[0,0,0,0,1,0]] 7*BinaryGA> c2 8[[0,0,0,0,0,0],[0,0,0,0,1,0],[1,0,1,1,0,1],[0,0,0,0,0,0],[0,0,0,0,1,0]] 9*BinaryGA> stringCost s1 s2 102 11*BinaryGA> chromCost c1 c2 124 

##### 2.8.4 The chromCostPair function

As we shall see later, it is convenient to store the evaluated cost of a chromosome together with the chromosome itself in a tuple. For example, if we have a population of such tuples, we can sort it in increasing order of cost (ranking), which is necessary for selection, where we typically want to select better chromosomes before worse ones.

We therefore define a new type

1type ChromCost = (Int, Chromosome) Next, we define a chromCostPair function to construct such tuples: 1chromCostPair :: Chromosome -> Chromosome -> ChromCost 2chromCostPair target c = (chromCost target c, c) 

Using the variables defined above, we can test the function in ghci:

1*BinaryGA> chromCostPair c1 c2 2(4,[[0,0,0,0,0,0],[0,0,0,0,1,0],[1,0,1,1,0,1],[0,0,0,0,0,0],[0,0,0,0,1,0]]) The first part of the tuple is the cost of c2, that is, the number of bits that differ from the target chromosome c1. The second part of the tuple is c2 itself.

#### 2.9 Population functions

We now turn our attention to some functions dealing with an entire population of chromosomes. The functions below are used to evaluate the cost of each chromosome in a population (evalPop); to sort a population in increasing order of cost (sortPop); to select chromosomes apart from elite chromosomes to keep for next generation and to use for mating (selection); get a list of chromosomes to be used as parents for mating (getParents); and to convert a list of (cost, chromosome) pairs, [ChromCost], to type Population.

##### 2.9.1 The evalPop function

The evalPop function compares all of the chromosomes in a population with a target chromosome (corresponding to the target string) and returns list of ChromCost. This is accomplished by mapping the partial function chromCostPair target over the population. This is an example of partial function application, where we call a function with too few parameters and get back a partially applied function, that is, a function that takes as many parameters as we left out.

1evalPop :: Chromosome -> Population -> [ChromCost] 2evalPop target pop = map (chromCostPair target) pop 

Example usage in ghci:

1*BinaryGA> let targetString = ~ac/dc~ 2*BinaryGA> let s1 = ~ac*ac~ 3*BinaryGA> let s2 = ~hello~ 4*BinaryGA> let s3 = ~a1234~ 5*BinaryGA> let s4 = ~ac/ac~ 6*BinaryGA> let target = encodeString targetString 7*BinaryGA> let pop = encodeStringList [s1,s2,s3,s4] 8*BinaryGA> let popEvaluated = evalPop target pop 9*BinaryGA> popEvaluated 10[(4,[[0,0,0,0,0,0],[0,0,0,0,1,0],[1,0,1,1,0,1],[0,0,0,0,0,0],[0,0,0,0,1,0]]), 11 (11,[[0,0,0,1,1,1],[0,0,0,1,0,0],[0,0,1,0,1,1],[0,0,1,0,1,1],[0,0,1,1,1,0]]), 12 (13,[[0,0,0,0,0,0],[0,1,1,0,1,1],[0,1,1,1,0,0],[0,1,1,1,0,1],[0,1,1,1,1,0]]), 13 (2,[[0,0,0,0,0,0],[0,0,0,0,1,0],[1,0,1,1,1,0],[0,0,0,0,0,0],[0,0,0,0,1,0]])] 

##### 2.9.2 The sortPop function

To sort a population, we can use the sort function from the Data.List library. We therefore add sort to our import statement:

1import Data.List (elemIndex, sort) The sortPop function is then implemented simply as 1sortPop :: [ChromCost] -> [ChromCost] 2sortPop pop = sort pop Note that this works because when given a list of tuples (ChromCost is a (cost, chromosome) pair, or tuple), sort sorts the list by the first element of the tuples, namely the cost.

Example usage in ghci using the evaluated population above:

1*BinaryGA> sortPop popEvaluated 2[(2,[[0,0,0,0,0,0],[0,0,0,0,1,0],[1,0,1,1,1,0],[0,0,0,0,0,0],[0,0,0,0,1,0]]), 3 (4,[[0,0,0,0,0,0],[0,0,0,0,1,0],[1,0,1,1,0,1],[0,0,0,0,0,0],[0,0,0,0,1,0]]), 4 (11,[[0,0,0,1,1,1],[0,0,0,1,0,0],[0,0,1,0,1,1],[0,0,1,0,1,1],[0,0,1,1,1,0]]), 5 (13,[[0,0,0,0,0,0],[0,1,1,0,1,1],[0,1,1,1,0,0],[0,1,1,1,0,1],[0,1,1,1,1,0]])] 

##### 2.9.3 The selection function

In addition to numElite elite chromosomes, a number of other chromosomes must be kept in the population for the next generation and also serve the role as parents for mating. There are several ways to select such chromosomes, including roulette wheel selection, tournament selection, random selection, etc. Here, we just take a fraction xRate (called the selection rate) of a sorted (ranked) population. The total number of elite chromosomes plus selected chromosomes should equal numKeep. A possible implementation of a selection function is given below:

#### 2.10 Mating functions

We are finally ready to begin implementing the core components of the GA, namely mating, mutation, and evolution. We begin with two function required for mating, the single point crossover function and the matePairwise function.

##### 2.10.1 The crossover function

The function crossover is used for mating with single point crossover:

1crossover :: Int -> Chromosome -> Chromosome -> Population 2crossover cp ma pa = [take cp ma ++ drop cp pa, take cp pa ++ drop cp ma] Given a crossover point cp, a mother chromosome ma, and a father chromosome pa, it returns two offspring, where one consists of the genes in ma and pa before and after cp, respectively, and the other consists of the remaining genes from ma and pa.

Note that this implementation is probably not ideal, because it only deals with whole chunks of genes instead of bits. A better solution would be to allow for the crossover point cp to exist not only between genes but inside a gene too.

Example usage in ghci:

1*BinaryGA> let c1 = [[0,0,0,0,0,0,0],[1,1,1,1,1,1],[0,0,1,1,0,0],[0,1,0,1,0,1]] 2*BinaryGA> let c2 = [[0,0,0,0,0,0,1],[0,0,0,0,1,0],[0,0,0,1,0,0],[0,0,1,0,0,0]] 3*BinaryGA> crossover 0 c1 c2 4[[[0,0,0,0,0,0,1],[0,0,0,0,1,0],[0,0,0,1,0,0],[0,0,1,0,0,0]], 5 [[0,0,0,0,0,0,0],[1,1,1,1,1,1],[0,0,1,1,0,0],[0,1,0,1,0,1]]] 6*BinaryGA> crossover 1 c1 c2 7[[[0,0,0,0,0,0,0],[0,0,0,0,1,0],[0,0,0,1,0,0],[0,0,1,0,0,0]], 8 [[0,0,0,0,0,0,1],[1,1,1,1,1,1],[0,0,1,1,0,0],[0,1,0,1,0,1]]] 9*BinaryGA> crossover 2 c1 c2 10[[[0,0,0,0,0,0,0],[1,1,1,1,1,1],[0,0,0,1,0,0],[0,0,1,0,0,0]], 11 [[0,0,0,0,0,0,1],[0,0,0,0,1,0],[0,0,1,1,0,0],[0,1,0,1,0,1]]] 12*BinaryGA> crossover 3 c1 c2 13[[[0,0,0,0,0,0,0],[1,1,1,1,1,1],[0,0,1,1,0,0],[0,0,1,0,0,0]], 14 [[0,0,0,0,0,0,1],[0,0,0,0,1,0],[0,0,0,1,0,0],[0,1,0,1,0,1]]] 15*BinaryGA> crossover 4 c1 c2 16[[[0,0,0,0,0,0,0],[1,1,1,1,1,1],[0,0,1,1,0,0],[0,1,0,1,0,1]], 17 [[0,0,0,0,0,0,1],[0,0,0,0,1,0],[0,0,0,1,0,0],[0,0,1,0,0,0]]] We observe that for cp == 0, we just clone the parents, as also happens for cp greater than 4. Otherwise, for cp equal to 1, 2, or 3, the offspring is created by a crossover point after gene number 1, 2, or 3, respectively.

##### 2.10.2 The matePairwise function

There are many ways to choose which chromosomes should mate to create offspring, e.g., we could randomly draw a mother and a father chromosome from a subpopulation consisting of the numKeep best chromosomes in a population. Here, we simply use pairwise mating in a recursive function called matePairwise, where chromosomes 1 and 2 mate, chromosomes 3 and 4 mate, and so forth. The resulting offspring is returned as a Population together with a StdGen, both in a tuple. The function requires several standard PRNGs. For this, we add the split function that comes with the System.Random library in our import:

1import System.Random (StdGen, randomR, split, mkStdGen) The implementation of matePairwise is given below: 1matePairwise :: StdGen -> Population -> (Population, StdGen) 2matePairwise g [] = ([], g) 3matePairwise g [ma] = ([ma], g) 4matePairwise g (ma:pa:cs) = (offspring ++ fst (matePairwise g’ cs), g’’) 5    where (g’, g’’) = split g 6          (g’’’, _) = split g’’ 7          cp = fst $randIndex g’’’ ma 8 offspring = crossover cp ma pa The two base cases say that an empty population should just return the empty list, and if the population only have a single chromosome, we should just clone it. The general case generates a random crossover point cp using the randIndex function and then calls the single point crossover function with cp and the first two chromosomes in the population to create two offspring. It then recursively repeats the process on the remainder of the population. Example usage in ghci: 1*BinaryGA> let stringList = [~abc~,~def~,~ghj~,~klm~,~nop~,~qrs~,~tuv~,~wxy~] 2*BinaryGA> let pop = encodeStringList stringList 3*BinaryGA> pop 4[[[0,0,0,0,0,0],[0,0,0,0,0,1],[0,0,0,0,1,0]], 5 [[0,0,0,0,1,1],[0,0,0,1,0,0],[0,0,0,1,0,1]], 6 [[0,0,0,1,1,0],[0,0,0,1,1,1],[0,0,1,0,0,1]], 7 [[0,0,1,0,1,0],[0,0,1,0,1,1],[0,0,1,1,0,0]], 8 [[0,0,1,1,0,1],[0,0,1,1,1,0],[0,0,1,1,1,1]], 9 [[0,1,0,0,0,0],[0,1,0,0,0,1],[0,1,0,0,1,0]], 10 [[0,1,0,0,1,1],[0,1,0,1,0,0],[0,1,0,1,0,1]], 11 [[0,1,0,1,1,0],[0,1,0,1,1,1],[0,1,1,0,0,0]]] 12*BinaryGA> let (newPop, g’) = matePairwise (mkStdGen 99) pop 13*BinaryGA> newPop 14[[[0,0,0,0,0,0],[0,0,0,0,0,1],[0,0,0,1,0,1]], 15 [[0,0,0,0,1,1],[0,0,0,1,0,0],[0,0,0,0,1,0]], 16 [[0,0,0,1,1,0],[0,0,1,0,1,1],[0,0,1,1,0,0]], 17 [[0,0,1,0,1,0],[0,0,0,1,1,1],[0,0,1,0,0,1]], 18 [[0,0,1,1,0,1],[0,1,0,0,0,1],[0,1,0,0,1,0]], 19 [[0,1,0,0,0,0],[0,0,1,1,1,0],[0,0,1,1,1,1]], 20 [[0,1,0,0,1,1],[0,1,0,1,0,0],[0,1,1,0,0,0]], 21 [[0,1,0,1,1,0],[0,1,0,1,1,1],[0,1,0,1,0,1]]] We observe that the crossover point for chromosomes 1 and 2 in pop was after the second gene; for chromosomes 3 and 4 it was after the first gene; for chromosomes 5 and 6 it was after the first gene; and for chromosomes 7 and 8 it was after the second gene. #### 2.11 Mutation functions Mutation involves flipping a bit in a gene from zero to one or vice versa. The fraction (mutation rate) of chromosomes in the population that should mutate is called mutRate, and corresponds to the integer numMut. For example, for a population size of numPop == 100 and a mutation rate of mutRate == 0.1, the number of chromosomes to mutate would be numMut == 10. The necessary mutation functions are given below. ##### 2.11.1 The replaceAtIndex and flipBit functions If we have a list of bits (such as a gene), we cannot just change (overwrite) one of the bits in that list, since data variables in a purely functional language like Haskell are immutable (they cannot change once defined). Instead, we must copy the data we need and put it together in a new data structure (such as a list). We therefore implement a helper function that can replace an item in a list at a particular index, namely the replaceAtIndex function below: 1replaceAtIndex :: Int -> a -> [a] -> [a] 2replaceAtIndex n item ls = as ++ (item:bs) where (as, (b:bs)) = splitAt n ls This function uses the splitAt function available in Prelude (so no need to import it) to split the list ls at the position n into two sublists as and (b:bs), where as are the first n elements in ls, and (b:bs) are the remainding elements. The element b is then replaced with item and the pieces are put together again using the ++ function. Next, we implement a flipBit function that uses the replaceAtIndex function to flip a bit at an index n in a gene g: 1flipBit :: Int -> Gene -> Gene 2flipBit n g = replaceAtIndex n b g 3 where b | g!!n == 0 = 1 4 | otherwise = 0  Example usage in ghci: 1*BinaryGA> replaceAtIndex 5 99 [0,1,2,3,4,5,6,7,8,9] 2[0,1,2,3,4,99,6,7,8,9] 3*BinaryGA> let g = [1,1,0,1,1,1] 4*BinaryGA> flipBit 2 g 5[1,1,1,1,1,1]  ##### 2.11.2 The mutateChrom function The mutateChrom function mutates a randomly selected bit among its genes: 1mutateChrom :: StdGen -> Chromosome -> (Chromosome, StdGen) 2mutateChrom g chrom = (mutChrom, g2’) 3 where (g1, g2) = split g 4 (nGene, g1’) = randIndex g1 chrom 5 oldGene = chrom!!nGene 6 (nBit, g2’) = randIndex g2 oldGene 7 mutGene = flipBit nBit oldGene 8 mutChrom = replaceAtIndex nGene mutGene chrom The function looks slightly messy (feel free to improve it!), with several PRNGs that are spawned via the split function and the many where clauses. An index nGene is picked randomly and used to access the particular gene at that index in the chromosome chrom and assign it to oldGene. A random bit in oldGene at index nBit is then flipped and the mutated gene is assigned to mutGene. The mutated gene mutGene then replaces the original oldGene in chrom and is assigned to mutChrom and returned. Example usage in ghci: 1*BinaryGA> let c = [[0,0,0,0,0,0],[1,1,1,1,1,1],[1,1,1,0,0,0],[0,0,0,1,1,1]] 2*BinaryGA> let (mutatedChrom, g) = mutateChrom (mkStdGen 99) c 3*BinaryGA> mutatedChrom 4[[0,0,0,0,0,0],[1,1,1,1,1,1],[1,1,1,0,0,0],[0,0,0,1,0,1]] 5*BinaryGA> let (mutatedChrom, g) = mutateChrom (mkStdGen 98) c 6*BinaryGA> mutatedChrom 7[[0,0,0,0,0,0],[1,1,1,1,1,0],[1,1,1,0,0,0],[0,0,0,1,1,1]] Using a PRNG obtained from mkdStdGen 99, the fifth bit of the fourth gene is mutated (flipped), whereas using a PRNG obtained from mkdStdGen 98, the last bit of the second gene is mutated. ##### 2.11.3 The mutateChromInPop function The function mutateChromInPop mutates a chromosome at index n in population pop: 1mutateChromInPop :: StdGen -> Int -> Population -> (Population, StdGen) 2mutateChromInPop g n pop = (replaceAtIndex n mutChrom pop, g’) 3 where (g’, g’’) = split g 4 (mutChrom, _) = mutateChrom g’’ (pop!!n))  ##### 2.11.4 The mutatePop function Finally, we need a function called mutatePop that given a list of indices, mutates all its chromosomes at those indices. To randomly generate a list of indices, we create a function mutIndices: 1mutIndices :: Population -> StdGen -> [Int] 2mutIndices pop g = take numMut$ randomRs (numElite, length pop - 1) g The function makes use of the randomRs function from the System.Random library, hence we add it to our import statement: 1import System.Random (StdGen, randomR, randomRs, split, mkStdGen) The randomRs function produces an infinite list of random indices limited to lower and upper bounds given by it first argument. Here, the lower bound is numElite, because we do not want to mutate any of the elite chromosomes, and the upper bound is the index of the last chromosome in the population. Finally, we take only the first numMut indices from the infinite list.

Now that we have a a function to generate a list of random indices for the chromosomes that shall be mutated, we can implement a recursive mutatePop function:

1mutatePop :: StdGen -> [Int] -> Population -> (Population, StdGen) 2mutatePop g _ [] = ([], g) 3mutatePop g [] pop = (pop, g) 4mutatePop g (n:ns) pop = mutatePop g’ ns pop’ 5    where (pop’, g’) = mutateChromInPop g n p The first base case says to do nothing if the population is empty or the list of indices is empty. Given a list (n:ns) of indices, the recursive case uses the mutateChromInPop function defined above to mutate the chromosome at index n in the population before it recursively continues with the remaining ns indices. A list of random indices can be provided by the mutIndices function.

Example usage in ghci:

1*BinaryGA> pop 2[[[0,0,0,0,0,0],[0,0,0,0,0,1],[0,0,0,0,1,0]], 3 [[0,0,0,0,1,1],[0,0,0,1,0,0],[0,0,0,1,0,1]], 4 [[0,0,0,1,1,0],[0,0,0,1,1,1],[0,0,1,0,0,1]], 5 [[0,0,1,0,1,0],[0,0,1,0,1,1],[0,0,1,1,0,0]], 6 [[0,0,1,1,0,1],[0,0,1,1,1,0],[0,0,1,1,1,1]], 7 [[0,1,0,0,0,0],[0,1,0,0,0,1],[0,1,0,0,1,0]], 8 [[0,1,0,0,1,1],[0,1,0,1,0,0],[0,1,0,1,0,1]], 9 [[0,1,0,1,1,0],[0,1,0,1,1,1],[0,1,1,0,0,0]]] 10*BinaryGA> numMut 114 12*BinaryGA> let mutIdx = mutIndices pop (mkStdGen 99) 13*BinaryGA> mutIdx 14[7,5,6,3] 15*BinaryGA> let (mutPop, g’) = mutatePop (mkStdGen 99) mutIdx pop 16*BinaryGA> mutPop 17[[[0,0,0,0,0,0],[0,0,0,0,0,1],[0,0,0,0,1,0]], 18 [[0,0,0,0,1,1],[0,0,0,1,0,0],[0,0,0,1,0,1]], 19 [[0,0,0,1,1,0],[0,0,0,1,1,1],[0,0,1,0,0,1]], 20 [[0,0,1,0,1,0],[0,0,0,0,1,1],[0,0,1,1,0,0]], 21 [[0,0,1,1,0,1],[0,0,1,1,1,0],[0,0,1,1,1,1]], 22 [[0,1,0,0,0,0],[0,1,0,1,0,1],[0,1,0,0,1,0]], 23 [[0,0,0,0,1,1],[0,1,0,1,0,0],[0,1,0,1,0,1]], 24 [[0,1,0,1,1,0],[1,1,0,1,1,1],[0,1,1,0,0,0]]] Because numMut == 4, four random chromosomes in pop are mutated. The indices of these four chromosomes is determined by calling the mutIndices function, which returns mutIdx = [7,5,6,3]. By comparing the mutated population mutPop with the original population pop, we observe that for the chromosome at index 3, the third bit in the second gene was mutated; for the chromosome at index 5, the fourth bit in the second gene was mutated; for the chromosome at index 6, the second bit in the first gene was mutated; and for the chromosome at index 7, the first bit in the second gene was mutated.

#### 2.12 Evolution functions

Our GA is almost finished but the most important step is left: evolution. We need two functions, evolvePopOnce and evolvePop, to complete the GA.

##### 2.12.1 The evolvePopOnce function

The evolvePopOnce function evolves a population from one generation to the next:

1evolvePopOnce :: StdGen -> Population -> (Population, StdGen) 2evolvePopOnce g pop = (newPopMutated, g4) 3    where (g’, g’’) = split g 4          ePop = evalPop (encodeString target) pop 5          sPop = sortPop ePop 6          parents = toPopulation $getParents sPop 7 (offspring, g3) = matePairwise g’ parents 8 newPop = parents ++ offspring 9 mutIdx = mutIndices newPop g3 10 (newPopMutated, g4) = mutatePop g’’ mutIdx newPop Its input is a population pop and the output is a new population newPopMutated that has been constructed through genetic operations. Each of the chromosomes in pop is evaluated with respect to the target string target and the evaluated chromosomes are stored in a list ePop. Since the chromosomes in ePop have an associated cost, they can be sorted into a new list sPop. Parents to be kept for the next generation and for generating offspring is selected using the getParents function and stored in parents. The function matePairwise then create offspring using single point crossover on the chromosomes in parents. The parents and the offspring collectively become the new population newPop. Finally, a number numMut of the chromosomes in newPop are mutated and the results is the next generation newPopMutated. Here is an example usage in ghci for a target=s2 string, where s2="abc", and the following GA settings: 1numPop = 8 :: Int -- Population size (number of chromosomes) 2xRate = 0.5 :: Double -- Selection rate 3mutRate = 0.5 :: Double -- Mutation rate 4numElite = 2 :: Int -- Number of elite chromosomes  1*BinaryGA> let stringList = [~xxx~,~def~,~ghj~,~klm~,~nop~,~qrs~,~tuv~,~wxy~] 2*BinaryGA> let pop = encodeStringList stringList 3*BinaryGA> let (popEvolvedOnce,g’) = evolvePopOnce (mkStdGen 99) pop 4*BinaryGA> decodePopulation popEvolvedOnce 5[~qrs~,~def~,~efj~,~k+m~,~ref~,~drs~,~glm~,~khj~]  ##### 2.12.2 The evolvePop function Finally, we create the evolvePop function. This function evolves a population n times recursively, making use of the evolvePopOnce function just described. 1evolvePop :: StdGen -> Int -> Population -> (Population, StdGen) 2evolvePop g 0 pop = (newpop, g) 3 where newpop = toPopulation$ sortPop \$ evalPop (encodeString target) pop 4evolvePop g n pop = evolvePop g’ (n-1) newPop 5    where (newPop, g’) = evolvePopOnce g pop 

Example usage in ghci:

1*BinaryGA> let (popEvolvedNTimes,g’) = evolvePop (mkStdGen 99) 10 pop 2*BinaryGA> decodePopulation popEvolvedNTimes 3[~qbs~,~qbs~,~azs~,~azs~,~azs~,~azs~,~qjs~,~iza~] 4*BinaryGA> let (popEvolvedNTimes,g’) = evolvePop (mkStdGen 99) 20 pop 5*BinaryGA> decodePopulation popEvolvedNTimes 6[~abc~,~aac~,~aac~,~aac~,~abs~,~abs~,~ebs~,~ajq~] We observe that 10 generations was not sufficient to find (learn) the target string, whereas for 20 generations, the population evolved and the best chromosome was identical to the target, decoded as "abc".

#### 2.13 Final remarks

Congratulations! You should now have a working GA contained in your BinaryGA module. Even if you are generous with comments and line shifts (as I am), your module should be less than 300 lines.

What remains is to test the GA. We will investigate this in the next section.

Last updated: 7th April 2017
Robin T. Bye / robin.t.bye@ntnu.no