Functional Programming and Intelligent Algorithms

# Tutorial: GA for TSP

## Week 14: The travelling salesperson problem (TSP)

### 1 Tutorial: GA for TSP

In this tutorial, we will implement an integer-based GA for TSP based on the continuous GA that we implemented previously. A lot of the continuous GA code can be reused as is or just slightly modified. Therefore, this tutorial will necessarily contain much of the same information as the tutorial on the continuous GA.

The purpose of this GA is solve problems that are formulated as TSPs. That is, given a set of cities in the Euclidean 2D space, starting at any city, the GA should find the shortest connected path visiting all cities exactly once before returning the original city. A chromosome should represent a candidate solution, that is, a sample permutation (sequence of cities). We will assume that $n$ cities are labeled as natural numbers, $1,2,\dots ,n$. For example, for $n=7$ cities, a candidate solution could be given by $\left[4,6,2,7,3,1,5\right]$. The cost function is the total distance travelled when visiting the cities of such a permutation.

However, because a TSP tour cannot include repeated cities, ordinary crossover and mutation will not work, because these genetic operations may introduce duplicates in offspring. There are several ways to fix this problem. For example, we can write specialised crossover and mutation methods that checks if duplicates will occur and avoids it (e.g., the PMX method). Here, we will use the method from a paper by ?, where a one-to-one relationship between a permutation and its inverse is presented. The inverse is allowed to have duplicates. Therefore, we can encode our chromosomes as the inverse of the permutations they represent, and use our ordinary crossover and mutation methods as before. To evalute the total distance travelled (cost), we must decode chromosomes to permutations and use a distance function on the resulting permutation.

#### 1.1 Test problems

We will implement a GA for TSP using the alternative chromosome encoding proposed in ? and try to solve the following TSPs (taken from
www.math.uwaterloo.ca/tsp/world/countries.html):

• 29 cities in Western Sahara:
• wi29info.txt: Information about problem and list of cities with (x,y) coordinates.
• wi29.txt Stripped file with list of cities with (x,y) coordinates only.
• 38 cities in Djibouti:
• dj38info.txt: Information about problem and list of cities with (x,y) coordinates.
• dj38.txt Stripped file with list of cities with (x,y) coordinates only.

For motivation, we will run an informal competition to see who manages to get the shortest TSP distance..!

You should place both the stripped files in the same directory as your Haskell module.

#### 1.2 Test module

To show how you can read in the test data, we first provide a test module that can be used later when you have finished the implementation of your TSP GA:

1module TestTSPGA where 2 3import TSPGA 4import Data.List (sort) 5import System.Random (newStdGen) 6 7main :: IO () 8main = do 9        contents <- readFile ~wi29.txt~ 10        let s = map words (lines contents) 11        let cities = map stringsToCity s 

This test module makes use of a helper function:

1stringsToCity :: [String] -> City 2stringsToCity [c, x, y] = (read c, (read x, read y)) The function stringsToCity converts a list of three strings of the format [cityid, xpos, ypos] to a City.

#### 1.3 A continuous GA for the TSP

We begin by creating a module TSPGA.hs for our code:

1module TSPGA where 

#### 1.4 GA settings

Before we proceed, it is useful to add some GA parameter settings to the top of our file so that they can easily be found and tweaked when we will test the GA later. We adopt the terminology used in class and in ? and define the following parameters with some default settings:

1numPop = 100  :: Int    -- Population size (number of chromosomes) 2xRate = 0.5   :: Double -- Selection rate 3mutRate = 0.5 :: Double -- Mutation rate 4numElite = 20 :: Int    -- Number of elite chromosomes 5itMax = 500   :: Int    -- Max number of iterations Note that these settings may be good or bad, and you will likely need to experiment to obtain good solutions.

#### 1.5 Genes, chromosomes, and population

In the following, we will assume that a gene is an integer that encodes a single number in an inverse sequence corresponding to a permutation, and that a chromosome is a list of genes that encodes the complete sequence.

The total number of (encoding) variables, numVar, is equal to the number of cities in t test problem we will try to solve:

1numVar = 29 :: Int -- replace with length of a list of cities This number tells us how many genes are required in each chromosome. Here, we have hardcoded the number to suit the Djibouti problem with 29 cities, and you must change it to 38 if solving the Western Sahara problem.

Next, we need to know the number of chromosomes numKeep to keep between generations. This number is a fraction (the selection rate xRate) of the population size numPop and should be rounded up to the nearest even number to simplify the mating procedure:

1numKeep | numKeep’’ > numPop = numPop - numPop ‘mod‘ 2 2        | numKeep’’ < 2 = 2 3        | otherwise = numKeep’’ 4        where numKeep’’ = numKeep’ + numKeep’ ‘mod‘ 2 5              numKeep’ = ceiling $xRate * fromIntegral numPop :: Int For example, for numPop = 100 and xRate = 0.5, we get numKeep = 50. Some cases are introduced for safety to ensure numKeep is always greater than or equal to 2, and always smaller than or equal to numPop. We also need to determine the number of chromosomes to mutate, numMut, which is a fraction (the mutation rate mutRate) of the population size, and must be rounded up to nearest integer: 1numMut = ceiling$ mutRate * fromIntegral numPop :: Int For numPop = 40 and mutRate = 0.1, we get numMut = 4.

Finally, we define some type aliases for readability and debugging purposes:

1type CityID     = Int                -- city id 2type Position   = (Double, Double)   -- city position (x, y) 3type City       = (CityID, Position) -- (city id, x-coordinate, y-coordinate) 4type Gene       = CityID             -- integer 5type Chromosome = [Gene]             -- list of genes 6type Population = [Chromosome]       -- list of chromosomes A CityID is an integer representing a city, whilst Position is a tuple consisting of the $x$ and $y$ coordinates of a city in a Euclidean space. A City is a tuple consisting of a CityID and the Position of that city. We simply let a Gene be a CityID, a Chromosome a list of Gene, and a Population a list of Chromosome.

The variables (city IDs) are constrained as

1pLo = 1      :: CityID 2pHi = numVar :: CityID whilst we let the genes have the same constraints. 1gLo = pLo :: CityID 2gHi = pHi :: CityID The reason we have the same constraints on the genes is that the inverse sequence (an encoded chromosome) of a permutation will contain only (some of) the same numbers as those of the permutation. Note that if we planned to use some other encoding scheme, we would have to change gLo and gHi accordingly.

#### 1.6 Encoding functions

##### 1.6.1 The encodeCity function

We need a function encodeCity to encode a City as a Gene:

1encodeCity :: City -> Gene 2encodeCity (cityID, cityPos) = cityID 

##### 1.6.2 Direct encoding using the encodeTour’ function

If we encode a TSP tour (list of cities) directly to a chromosome, we could use the function encodeTour’:

1encodeTour’ :: [City] -> Chromosome 2encodeTour’ = map encodeCity However, as we have discussed already, this would cause problems with repeated cities when performing ordinary crossover and mutation.

##### 1.6.3 The inverseSequence function

Instead, we implement a function inverseSequence that creates a sequence of numbers representing a permutation:

1-- Inverse sequence representing a permutation 2-- This sequence allows repetitive values and hence is robust under ordinary 3-- (n-point) crossover. There is a one-to-one mapping from ordinary permutation 4-- representation to the inversion sequence representation. 5-- Source: http://www.ceng.metu.edu.tr/~ucoluk/research/publications/tspnew.pdf 6inverseSequence :: [CityID] -> [Int] 7inverseSequence cs = inverseSequence’ cs ns 8    where ns = [1..length cs] 9 10inverseSequence’ :: [CityID] -> [Int] -> [Int] 11inverseSequence’ cs [j] = [length $filter (> j)$ takeWhile (/= j) cs] 12inverseSequence’ cs (j:js) = inverseSequence’ cs [j] ++ inverseSequence’ cs js This sequence is allowed to have duplicates. Later, we implement a fromInverse function to get back to a permutation from such an inverse sequence.

##### 1.6.4 The encodeTour function

To encode a TSP tour (permuation of cities) as an inverse sequence in the chromosome, we compose a function consisting of inverseSequence and the direct encoding function encodeTour’:

1-- Encode a tour (permutation of cities) as an inverse sequence 2encodeTour :: [City] -> Chromosome 3encodeTour = inverseSequence . encodeTour’ 

##### 1.6.5 The encodeTours function

Finally, the encodeTours function encodes a list of tours as a list of inverse sequences:

1encodeTours :: [[City]] -> Population 2encodeTours = map encodeTour 

#### 1.7 Decoding functions

In addition to encoding functions, we also need decoding functions able to convert back from inverse sequences contained in chromosomes back to TSP tours (permutations).

##### 1.7.1 The fromInverse function

We define a fromInverse function that converts an inverse sequence to a permutation:

1-- Convert inverse sequences generated with inverseSequence to permutations 2-- Source: http://www.ceng.metu.edu.tr/~ucoluk/research/publications/tspnew.pdf 3fromInverse :: [Int] -> [Int] 4fromInverse inv = map snd $filter ((>0) . fst)$ sort $zip permpos [0..] 5 where permpos = updatePosMany inv0 pos0 i n 6 inv0 = 0 : inv 7 pos0 = take (length inv0 + 1)$ repeat 0 8          i = length inv 9          n = i It is slightly hard work to implement the fromInverse function based on the second iterative algorithm in ?, and a number of helper functions are required (you may be able to do this in a simpler way yourself): 1-- Helper functions to fromInverse 2updatePosMany :: [Int] -> [Int] -> Int -> Int -> [Int] 3updatePosMany inv pos i n 4            | i < 1 = pos 5            | otherwise = updatePosMany inv (updatePos inv pos i n) (i - 1) n 6 7updatePos :: [Int] -> [Int] -> Int -> Int -> [Int] 8updatePos inv pos i n = updatePosi inv (updatePosmMany inv pos (i + 1) n i) i 9 10updatePosi :: [Int] -> [Int] -> Int -> [Int] 11updatePosi inv pos i = replaceAtIndex i (inv!!i + 1) pos 12 13updatePosmMany :: [Int] -> [Int] -> Int -> Int -> Int -> [Int] 14updatePosmMany inv pos m n i 15        | m == n    = updatePosm inv pos m i 16        | m > n     = updatePosmMany inv (updatePosm inv pos m i) (m - 1) n i 17        | otherwise = updatePosmMany inv (updatePosm inv pos m i) (m + 1) n i 18 19updatePosm :: [Int] -> [Int] -> Int -> Int -> [Int] 20updatePosm inv pos m i 21            | pos!!m >= inv!!i + 1 = replaceAtIndex m (pos!!m + 1) pos 22            | otherwise = pos 

#### 1.8 Testing the inverseSequence and fromInverse functions

We can test the inverseSequence and fromInverse functions on two permutations ${p}_{1}$ and ${p}_{2}$ with corresponding inverse sequences ${i}_{1}$ and ${i}_{2}$, respectively, given in ?:

$\begin{array}{lll}\hfill {p}_{1}& =\left[5,7,1,3,6,4,2\right]↔{i}_{1}=\left[1,5,2,3,0,1,0\right]\phantom{\rule{2em}{0ex}}& \hfill \text{(1)}\\ \hfill {p}_{2}& =\left[4,6,2,7,3,1,5\right]↔{i}_{2}=\left[5,2,3,0,2,0,0\right]\phantom{\rule{2em}{0ex}}& \hfill \text{(2)}\end{array}$

Example usage in ghci:

1*TSPGA> let perm1 = [5,7,1,3,6,4,2] 2*TSPGA> let perm2 = [4,6,2,7,3,1,5] 3*TSPGA> let invseq1 = inverseSequence perm1 4*TSPGA> let invseq2 = inverseSequence perm2 5*TSPGA> invseq1 6[2,5,2,3,0,1,0] 7*TSPGA> invseq2 8[5,2,3,0,2,0,0] 9*TSPGA> let perm1’ = fromInverse invseq1 10*TSPGA> let perm2’ = fromInverse invseq2 11*TSPGA> perm1’ 12[5,7,1,3,6,4,2] 13*TSPGA> perm2’ 14[4,6,2,7,3,1,5] 

##### 1.8.1 The decodeGene function

There is no way to decode a gene by itself, since a gene is a number in an inverse sequence in which the elements depend on each other. We need to decode an entire chromosome to find the permutation it represent; we cannot decode separate genes. Also, there is no need for a decodeGene function. Hence, this function is not implemented.

##### 1.8.2 The decodeChromosome function

To decode a Chromosome, we just call the fromInverse function:

1decodeChromosome :: Chromosome -> [CityID] 2decodeChromosome = fromInverse 

##### 1.8.3 The decodePopulation function

To decode an entire population (list of chromosomes) of chromosomes, we just map the decodeChromosome function over the population:

1decodePopulation :: Population -> [[CityID]] 2decodePopulation = map decodeChromosome 

#### 1.9 Testing of encoding/decoding

Example functionality in ghci:

1*TSPGA> let c1 = (1,(0,0)) :: City 2*TSPGA> let c2 = (2,(1,1)) :: City 3*TSPGA> let c3 = (3,(1,2)) :: City 4*TSPGA> let c4 = (4,(2,1)) :: City 5*TSPGA> let tour1 = [c1,c2,c3,c4] 6*TSPGA> let tour2 = [c2,c3,c1,c4] 7*TSPGA> let tour3 = [c3,c1,c2,c4] 8*TSPGA> let tour4 = [c4,c1,c3,c2] 9*TSPGA> let tours = [tour1,tour2,tour3,tour4] 10*TSPGA> let pop = encodeTours tours 11*TSPGA> pop 12[[0,0,0,0],[2,0,0,0],[1,1,0,0],[1,2,1,0]] 13*TSPGA> decodePopulation pop 14[[1,2,3,4],[2,3,1,4],[3,1,2,4],[4,1,3,2]] 

#### 1.10 Randomness functions

Randomness is vital for a GA, e.g., to create a random initial population, a random crossover point, a random mutation, and so on. We will use two functions from the System.Random library to implement the functions we need for randomness, namely StdGen and randomR:

1import System.Random (StdGen, randomR) It will be convenient to have functions for generating random genes, chromosomes, and populations; and also for generating a random index (e.g., a crossover point) in a list.

Note that the reason we also return a StdGen in many or most of the functions dealing with randomness is so that we can repeatedly apply them, e.g., in a genetic evolution. In such cases, we need to pass on a StdGen for the next function calls.

##### 1.10.1 The randGene function

The randGene function generates a random number of type Gene in the range $\left[g\mathrm{\text{low}},g\mathrm{\text{high}}\right]$:

1randGene :: StdGen -> (Gene, StdGen) 2randGene g = (value, g’) 3    where (value, g’) = randomR (gLo, gHi) g 

##### 1.10.2 The randGenes function

The randGenes function uses the randGene function to generate a list of random genes with length n:

##### 1.11.2 The distanceTour function

The distanceTour function calculates the total distance of a TSP tour. We first connect the end city back to the starting city, and then use a helper function distanceTour’ that adds up all the intercity distances:

1-- Distance of a TSP tour 2distanceTour :: [City] -> Double 3distanceTour cs = distanceTour’ cs’ 4    where cs’ = cs ++ [head cs] 5 6-- Distance of a tour where the end city is not connected back to the beginning 7distanceTour’ :: [City] -> Double 8distanceTour’ [] = 0 9distanceTour’ [c1,c2] = distance c1 c2 10distanceTour’ (c1:c2:cs) = distance c1 c2 + distanceTour’ (c2:cs) 

##### 1.11.3 The cityIDToCity function

The cityIDToCity function looks up a CityID in a list of cities [City] and returns the city if it exists, otherwise it returns an error:

#### 1.14 Mutation functions

Mutation involves changing a gene value to a new random value limited to $g\mathrm{\text{low}}$and $g\mathrm{\text{high}}$. The fraction (mutation rate) of chromosomes in the population that should mutate is called mutRate, and corresponds to the integer numMut. For example, for a population size of numPop == 100 and a mutation rate of mutRate == 0.1, the number of chromosomes to mutate would be numMut == 10.

The necessary mutation functions are given below.

##### 1.14.1 The replaceAtIndex function

If we have a gene variable, or a list of genes (a chromosome), we cannot just change (overwrite) it as we likely would do in an imperative language, since data variables in a purely functional language like Haskell are immutable (they cannot change once defined). Instead, we must copy the data we need and put it together in a new data structure or variable. We therefore implement a helper function that can replace an item in a list at a particular index, namely the replaceAtIndex function below:

1replaceAtIndex :: Int -> a -> [a] -> [a] 2replaceAtIndex n item ls = as ++ (item:bs) where (as, (b:bs)) = splitAt n ls This function uses the splitAt function available in Prelude (so no need to import it) to split the list ls at the position n into two sublists as and (b:bs), where as are the first n elements in ls, and (b:bs) are the remainding elements. The element b is then replaced with item and the pieces are put together again using the ++ function.

Example usage in ghci:

1*ContinuousGA> replaceAtIndex 5 99 [0,1,2,3,4,5,6,7,8,9] 2[0,1,2,3,4,99,6,7,8,9] 

##### 1.14.2 The mutateChrom function

The mutateChrom function mutates a randomly selected gene among its list of genes:

1mutateChrom :: StdGen -> Chromosome -> (Chromosome, StdGen) 2mutateChrom g chrom = (mutChrom, g2’) 3    where (g1, g2) = split g 4          (nGene, g1’) = randIndex g1 chrom 5          (mutGene, g2’) = randGene g2 6          mutChrom = replaceAtIndex nGene mutGene chrom An index nGene in the chromosome chrom is picked randomly. A new mutated gene mutGene at this index is then mutated using the randGene function. Finally, a new chromosome mutChrom that is identical to chrom except that the gene at index nGene has been mutated is returned.

##### 1.14.3 The mutateChromInPop function

The function mutateChromInPop mutates a chromosome at index n in population pop:

1mutateChromInPop :: StdGen -> Int -> Population -> (Population, StdGen) 2mutateChromInPop g n pop = (replaceAtIndex n mutChrom pop, g’) 3    where (g’, g’’) = split g 4          (mutChrom, _) = mutateChrom g’’ (pop!!n) 

##### 1.14.4 The mutIndices and mutatePop functions

Finally, we need a function called mutatePop that given a list of indices, mutates all its chromosomes at those indices.

To randomly generate a list of indices, we create a function mutIndices:

1mutIndices :: Population -> StdGen -> [Int] 2mutIndices pop g = take numMut $randomRs (numElite, length pop - 1) g The function makes use of the randomRs function from the System.Random library, hence we add it to our import statement: 1import System.Random (StdGen, randomR, randomRs, split, mkStdGen) The randomRs function produces an infinite list of random indices limited to lower and upper bounds given by it first argument. Here, the lower bound is numElite, because we do not want to mutate any of the elite chromosomes, and the upper bound is the index of the last chromosome in the population. Finally, we take only the first numMut indices from the infinite list. Now that we have a a function to generate a list of random indices for the chromosomes that shall be mutated, we can implement a recursive mutatePop function: 1mutatePop :: StdGen -> [Int] -> Population -> (Population, StdGen) 2mutatePop g _ [] = ([], g) 3mutatePop g [] pop = (pop, g) 4mutatePop g (n:ns) pop = mutatePop g’ ns pop’ 5 where (pop’, g’) = mutateChromInPop g n pop The first base case says to do nothing if the population is empty or the list of indices is empty. Given a list (n:ns) of indices, the recursive case uses the mutateChromInPop function defined above to mutate the chromosome at index n in the population before it recursively continues with the remaining ns indices. A list of random indices can be provided by the mutIndices function. #### 1.15 Evolution functions Our GA is almost finished but the most important step is left: evolution. We need two functions, evolvePopOnce and evolvePop, to complete the GA. ##### 1.15.1 The evolvePopOnce function The evolvePopOnce function evolves a population from one generation to the next: 1evolvePopOnce :: StdGen -> [City] -> Population -> (Population, StdGen) 2evolvePopOnce g cities pop = (newPopMutated, g4) 3 where (g’, g’’) = split g 4 parents = toPopulation$ getParents $sortPop$ evalPop cities pop 5          (offspring, g3) = matePairwise g’ parents 6          newPop = parents ++ offspring 7          mutIdx = mutIndices newPop g3 8          (newPopMutated, g4) = mutatePop g’’ mutIdx newPop Its input is a list of cities and their positions named cities and a population pop, and the output is a new population newPopMutated that has been constructed through genetic operations. Each of the chromosomes in pop is evaluated by finding the total distance of the TSP tour they each encode, and then sorted.

Parents to be kept for the next generation and for generating offspring are selected using the getParents function, and to convert from a list of (cost, chromosome) pairs, we use the toParents function. The result is assigned to parents. The function matePairwise then create offspring using single point crossover on the chromosomes in parents. The parents and the offspring collectively become the new population newPop. Finally, a number numMut of the chromosomes in newPop are mutated and the results is the next generation newPopMutated.

##### 1.15.2 The evolvePop function

Finally, we create the evolvePop function. This function evolves a population n times recursively, making use of the evolvePopOnce function just described.

1evolvePop :: StdGen -> Int -> [City] -> Population -> (Population, StdGen) 2evolvePop g 0 cities pop = (newpop, g) 3    where newpop = toPopulation $sortPop$ evalPop cities pop 4evolvePop g n cities pop = evolvePop g’ (n-1) cities newPop 5    where (newPop, g’) = evolvePopOnce g cities pop 

#### 1.16 Alternative encoding/decoding scheme

As pointed out by student Rune Valle (thank you!), an alternative encoding/decoding scheme for TSP problems can be found here: frcatel.fri.uniza.sk/users/pesko/publ/DeTSP.pdf

Please feel free to use this scheme if you like.

Below is a module with the necessary code:

1-- Alternative encoding/decoding scheme for TSP taken from 2-- https://frcatel.fri.uniza.sk/users/pesko/publ/DeTSP.pdf 3 4module GAtsmLehman where 5 6-- Test data 7perm1, perm2, invseq1, invseq2 :: [Int] 8perm1 = [5,7,1,3,6,4,2] 9perm2 = [4,6,2,7,3,1,5] 10invseq1 = [1,5,2,3,0,1,0] 11invseq2 = [5,2,3,0,2,0,0] 12 13type Permutation = [Int] 14type Lehman = [Int] 15 16encodePermutation :: Permutation -> Lehman 17encodePermutation [] = [] 18encodePermutation (ah:at) = (length \$ filter (<ah) at):(encodePermutation at) 19 20popItem :: Int -> [a] -> (a, [a]) 21popItem 0 (ah:at) = (ah, at) 22popItem i (ah:at) = (output, ah:list) 23    where 24    (output, list) = popItem (i-1) at 25 26identityPermutation :: Int -> Permutation 27identityPermutation a = [1..a] 28 29decodePermutation :: Lehman -> Permutation 30decodePermutation a = decodePermutation’ (identityPermutation (length a)) a 31    where 32    decodePermutation’ :: Permutation -> Lehman -> Permutation 33    decodePermutation’ [] [] = [] 34    decodePermutation’ n (lh:lt) = res:(decodePermutation’ nrest lt) 35        where 36        (res, nrest) = popItem lh n 

#### 1.17 Final remarks

Congratulations! You should now have a working GA contained in your TSPGA module. Even if you are generous with comments and line shifts (as I am), your module should be around 300 lines.

What remains is to test the GA. We will investigate this in the next section.

7th April 2017
Robin T. Bye / robin.t.bye@ntnu.no