Genetic algorithms (GAs) are used in many fields to solve complex optimization problems. They are a type of evolutionary algorithm that mimics natural selection to evolve solutions over time. In this post, we will explore how to implement a genetic algorithm in Go.
Introduction ¶
In the first example we will evolve a string to match a target word. This is a simple example that demonstrates the basic components of a genetic algorithm: population, fitness, selection, crossover, and mutation.
Population ¶
The population is a set of candidate solutions. In this example, each candidate is a string of the same length as the target word. The initial population is generated randomly. In real-world applications, the initial population can be seeded with known good solutions or generated using domain-specific heuristics.
The program defines some constants for the target word, the alphabet, population size, mutation rate, maximum generations, and tournament size. These constants are used throughout the genetic algorithm to control its behavior and parameters.
const (
	target         = "BRAIN"
	alphabet       = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
	populationSize = 100
	mutationRate   = 0.03
	maxGenerations = 1000
	tournamentSize = 4
)
type individual struct {
	genome  string
	fitness int
}
The initial population is created by generating random strings:
func makeInitialPopulation() []individual {
	population := make([]individual, populationSize)
	for index := range population {
		population[index] = individual{genome: randomGenome()}
	}
	return population
}
Population size is a key parameter that affects the performance of the genetic algorithm. A larger population provides more diversity and can help avoid local optima, but it also increases the computational cost of evaluating fitness and performing selection. A smaller population is faster to evaluate but may converge prematurely to suboptimal solutions. A common starting point is to use a population size of around 100, but this can be adjusted based on the specific problem and computational resources available.
Genome ¶
The genome is the encoded form of a candidate solution. In this example, the genome is simply the candidate string itself: a five-character string. In more complex problems, the genome could be a binary string, a list of numbers, or any other data structure that can represent a solution.
A random genome is generated by picking random letters from the alphabet:
func randomGenome() string {
	bytes := make([]byte, len(target))
	for index := range bytes {
		bytes[index] = randomLetter()
	}
	return string(bytes)
}

func randomLetter() byte {
	return alphabet[rand.IntN(len(alphabet))]
}
Fitness ¶
Fitness is a measure of how good a candidate solution is. In this example, the fitness function counts how many characters in the candidate string match the target word in the correct position and how many characters are present in the candidate string but in the wrong position. The fitness score is higher for candidates that are closer to the target word. In more complex problems, the fitness function is more sophisticated and may involve multiple objectives or constraints.
The fitness function does two passes. The first pass awards 10 points for each correct letter in the correct position. The second pass awards 1 point for each correct letter present but in the wrong position, without double-counting:
func fitness(genome string) int {
	score := 0
	matchedGenome := make([]bool, len(target))
	matchedTarget := make([]bool, len(target))
	// First pass: correct letter at correct position -> 10 points
	for index := range target {
		if genome[index] == target[index] {
			score += 10
			matchedGenome[index] = true
			matchedTarget[index] = true
		}
	}
	// Second pass: correct letter at wrong position -> 1 point
	for genomeIdx := range genome {
		if matchedGenome[genomeIdx] {
			continue
		}
		for targetIdx := range target {
			if matchedTarget[targetIdx] {
				continue
			}
			if genome[genomeIdx] == target[targetIdx] {
				score += 1
				matchedTarget[targetIdx] = true
				break
			}
		}
	}
	return score
}
After evaluating fitness, the population is sorted so the fittest individual is always at index 0:
func evaluateAndSort(population []individual) {
	for index := range population {
		population[index].fitness = fitness(population[index].genome)
	}
	slices.SortFunc(population, func(left, right individual) int {
		switch {
		case left.fitness > right.fitness:
			return -1
		case left.fitness < right.fitness:
			return 1
		default:
			return 0
		}
	})
}
Selection ¶
Selection is the process of choosing candidates from the population to be parents for the next generation. In this example, the best candidate is chosen (elitism), and the rest of the parents are selected using tournament selection. In tournament selection, a few candidates are randomly chosen from the population, and the one with the highest fitness among them is selected as a parent.
In this function, four candidates are randomly drawn from the population, and the one with the highest fitness is returned as the selected parent:
func tournamentSelect(population []individual) individual {
	best := population[rand.IntN(len(population))]
	for draw := 1; draw < tournamentSize; draw++ {
		challenger := population[rand.IntN(len(population))]
		if challenger.fitness > best.fitness {
			best = challenger
		}
	}
	return best
}
There are many different selection methods that can be used in genetic algorithms. Some common selection methods include:
- Tournament selection: Randomly select a few candidates and choose the one with the highest fitness.
- Roulette wheel selection: Select candidates with a probability proportional to their fitness.
- Rank selection: Rank candidates by fitness and select based on their rank.
- Elitism: Automatically carry the best candidates into the next generation without modification. This ensures that the best solution found so far is not lost due to crossover or mutation. But it can reduce diversity if overused, so it is often combined with other selection methods.
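Roulette wheel selection can be sketched in a few lines. This is not part of the example program; it reuses the individual type from above and assumes all fitness values are non-negative integers with at least one positive value:

```go
package main

import (
	"fmt"
	"math/rand/v2"
)

type individual struct {
	genome  string
	fitness int
}

// rouletteSelect picks an individual with probability proportional to its
// fitness. Higher-fitness individuals occupy a larger slice of the "wheel".
func rouletteSelect(population []individual) individual {
	total := 0
	for _, ind := range population {
		total += ind.fitness
	}
	spin := rand.IntN(total)
	for _, ind := range population {
		spin -= ind.fitness
		if spin < 0 {
			return ind
		}
	}
	return population[len(population)-1] // unreachable when total > 0
}

func main() {
	population := []individual{
		{genome: "AAAAA", fitness: 1},
		{genome: "BRAIN", fitness: 50},
		{genome: "BRAIX", fitness: 40},
	}
	counts := map[string]int{}
	for draw := 0; draw < 1000; draw++ {
		counts[rouletteSelect(population).genome]++
	}
	fmt.Println(counts) // "BRAIN" is selected most often
}
```

Note that roulette wheel selection degenerates when fitness values are all very close (near-uniform selection) or when one value dominates (near-deterministic selection); rank selection avoids both issues.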
Crossover ¶
Crossover is the process of combining two parent candidates to produce one or more child candidates. In this example, we use a simple one-point crossover, where a random cut point is chosen and the child inherits the first part of one parent and the second part of the other parent.
func crossover(parentA string, parentB string) string {
	cut := rand.IntN(len(target)-1) + 1
	return parentA[:cut] + parentB[cut:]
}
There are many different crossover methods that can be used in genetic algorithms. Some common crossover methods include:
- One-point crossover: A random cut point is chosen and the child inherits the first part of one parent and the second part of the other parent.
- Two-point crossover: Two random cut points are chosen and the child inherits the middle section from one parent and the outer sections from the other parent. This method can preserve more structure than one-point crossover.
- Uniform crossover: Each gene in the child is randomly chosen from either parent with equal probability. This method can provide more mixing but may also disrupt building blocks.
- Problem-specific crossover: For some problems, custom crossover operators can be designed to preserve important structures or constraints. We will see in the TSP an example of a problem-specific crossover operator.
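For comparison, uniform crossover can be sketched like this. It is not part of the example program, and the function name is illustrative:

```go
package main

import (
	"fmt"
	"math/rand/v2"
)

// uniformCrossover builds a child by picking each gene from either parent
// with equal probability. Both parents must have the same length.
func uniformCrossover(parentA, parentB string) string {
	child := make([]byte, len(parentA))
	for index := range child {
		if rand.IntN(2) == 0 {
			child[index] = parentA[index]
		} else {
			child[index] = parentB[index]
		}
	}
	return string(child)
}

func main() {
	// Every output character comes from one of the two parents.
	fmt.Println(uniformCrossover("AAAAA", "BBBBB"))
}
```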
Mutation ¶
Mutation is the process of introducing random changes to a candidate solution. In this example, we randomly flip characters in the string with a certain mutation rate. Mutation helps maintain diversity in the population and allows the algorithm to explore new areas of the search space. However, if the mutation rate is too high, it can disrupt good solutions and make the search more random. If the mutation rate is too low, the search can get stuck in local optima.
func mutate(genome string) string {
	bytes := []byte(genome)
	for index := range bytes {
		if rand.Float64() < mutationRate {
			bytes[index] = randomLetter()
		}
	}
	return string(bytes)
}
Possible mutation methods include:
- Bit flip mutation: For binary genomes, randomly flip bits with a certain probability.
- Swap mutation: For permutation genomes, randomly swap two genes. This is useful for problems like TSP where the genome represents an ordering.
- Gaussian mutation: For real-valued genomes, add a small random value drawn from a Gaussian distribution to each gene. This can help fine-tune solutions in continuous search spaces.
- Problem-specific mutation: Similar to crossover, custom mutation operators can be designed for specific problems to maintain important structures or constraints.
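A Gaussian mutation for a real-valued genome might look like this sketch. The function name and parameters are illustrative, not part of the example program:

```go
package main

import (
	"fmt"
	"math/rand/v2"
)

// gaussianMutate adds zero-mean Gaussian noise with the given standard
// deviation to each gene with probability rate. Suitable for real-valued
// genomes in continuous search spaces.
func gaussianMutate(genome []float64, rate, stddev float64) {
	for index := range genome {
		if rand.Float64() < rate {
			genome[index] += rand.NormFloat64() * stddev
		}
	}
}

func main() {
	genome := []float64{1.0, 2.0, 3.0}
	gaussianMutate(genome, 0.5, 0.1)
	fmt.Println(genome) // each gene is nudged with 50% probability
}
```

The standard deviation controls the exploration-exploitation trade-off: large values explore broadly, small values fine-tune; some implementations shrink it over the course of the run.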
Main loop ¶
The main loop initializes the population and runs generation after generation. At each step it prints the best individual and checks whether the target has been reached:
func main() {
	population := makeInitialPopulation()
	evaluateAndSort(population)
	for generation := 0; generation <= maxGenerations; generation++ {
		best := population[0]
		fmt.Printf("Generation %3d | best %s | fitness %d/%d\n", generation, best.genome, best.fitness, 10*len(target))
		if best.genome == target {
			fmt.Println("Evolution complete.")
			return
		}
		population = nextGeneration(population)
		evaluateAndSort(population)
	}
	best := population[0]
	fmt.Printf("Stopped after %d generations. Best genome: %s (%d/%d)\n", maxGenerations, best.genome, best.fitness, 10*len(target))
}
The nextGeneration function builds the next population. The first slot is reserved for the current best individual (elitism). The remaining slots are filled by selecting two parents via tournament selection, crossing them over, and mutating the child:
func nextGeneration(population []individual) []individual {
	next := make([]individual, populationSize)
	next[0] = individual{genome: population[0].genome}
	for index := 1; index < populationSize; index++ {
		parentA := tournamentSelect(population)
		parentB := tournamentSelect(population)
		child := crossover(parentA.genome, parentB.genome)
		next[index] = individual{genome: mutate(child)}
	}
	return next
}
The population size stays constant across generations; in this example it is always 100. The best candidate is always preserved due to elitism. Copying the best candidate into the next generation ensures that the best solution found so far is not lost due to crossover or mutation. This can help speed up convergence, but it can also reduce diversity in the population if overused.
The rest of the population is generated by selecting parents, crossing them over, and mutating the child. A solution can be selected multiple times as a parent, and the same solution can be selected as both parents.
Running the example ¶
You can run the example with this command:
cd helloworld
go run .
When you run the example multiple times, you will see that it takes a different number of generations to find the target word. This is because the genetic algorithm is stochastic throughout: the initial population is random, and selection, crossover, and mutation all involve randomness. As a result, the algorithm may find the target word in just a few generations in some runs, while in other runs it may take much longer.
Overview ¶
Here is a flowchart that summarizes the workflow of this simple genetic algorithm example:
Termination ¶
In the previous example, you have seen a termination condition that is based on a known target. In most real-world scenarios where a genetic algorithm is used, we do not know the solution in advance. That is why we use a heuristic search method like a genetic algorithm in the first place.
But a termination condition is still necessary. We can't search forever, and we usually want to stop when we have a good enough solution.
Common termination rules include:
- a fixed number of generations
- a plateau where the best fitness has not improved for a fixed number of generations
- a threshold where the best fitness is good enough for the practical goal
- convergence where most of the population has become effectively identical
- a domain specific condition such as reaching a certain distance in TSP or a certain score in a game
- a combination of the above
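A combination of a generation budget and a plateau rule can be sketched as a small helper. This is illustrative only, not part of either example program:

```go
package main

import "fmt"

// converged reports whether the search should stop: either the generation
// budget is exhausted, or the best fitness has not improved by more than
// epsilon for window consecutive generations. bestHistory holds the best
// fitness of each generation so far, in order.
func converged(bestHistory []float64, maxGenerations, window int, epsilon float64) bool {
	generation := len(bestHistory)
	if generation >= maxGenerations {
		return true
	}
	if generation <= window {
		return false
	}
	baseline := bestHistory[generation-window-1]
	for _, value := range bestHistory[generation-window:] {
		if value > baseline+epsilon {
			return false // meaningful improvement within the window
		}
	}
	return true
}

func main() {
	history := []float64{1, 2, 3, 3, 3, 3, 3}
	fmt.Println(converged(history, 100, 4, 0.001)) // true: 4-generation plateau
}
```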
Use cases ¶
Genetic algorithms are most useful in large, noisy, awkward search spaces where brute force is unrealistic and a good-enough answer is valuable. They are usually not the tool you reach for when an exact formula, a standard sorting algorithm, dynamic programming, or a well-known graph algorithm already solves the problem cleanly. They are useful when the search space is too large to enumerate and the best thing you can do is score candidates, keep the promising ones, and keep searching.
Common use cases include:
- engineering and design optimization: tuning wing shapes, turbine blades, antennas, vehicle bodies, bridges, or other designs with many interacting variables and physical constraints. A classic example is NASA's evolved antenna work, where a GA produced an unusual bent-wire antenna geometry that performed well under mission constraints (Source: NASA's Evolved Antenna)
- combinatorial optimization and logistics: finding short delivery routes, scheduling warehouse robots, assigning pickup paths, or solving TSP-like problems where order matters
- financial modeling: searching for portfolio allocations or trading-rule combinations that maximize return under risk, drawdown, or compliance constraints
- scheduling and resource allocation: building university timetables, assigning airline crews, allocating machines in a factory, or balancing shifts when constraints collide
- machine learning and AI: tuning hyperparameters, selecting features, evolving neural network architectures, or using neuroevolution for game agents and simulated control problems
- bioinformatics: searching candidate protein structures, aligning DNA sequences, or exploring biological configurations where direct search is too expensive
Because a GA is a general-purpose heuristic search method, it can be applied to a wide variety of problems. However, a GA is not a silver bullet: it may not be the best choice for every problem, and it usually requires careful tuning of parameters and operators to achieve good results. A GA is also not guaranteed to find the optimal solution, but it can often find good solutions in a reasonable amount of time.
TSP example ¶
The Traveling Salesman Problem (TSP) is a classic combinatorial optimization problem. The goal is to find the shortest possible route that visits a set of cities and returns to the starting point. TSP is NP-hard, which means that there is no known efficient algorithm to solve it exactly for large instances. This makes it a good candidate for heuristic search methods like genetic algorithms.
Setup ¶
The TSP example uses a fixed list of 12 cities with simple (x, y) coordinates. Each candidate solution is a complete tour that visits every city exactly once and returns to the start.
var cities = []city{
	{name: "London", x: 2, y: 7},
	{name: "Paris", x: 3.5, y: 5.5},
	{name: "Amsterdam", x: 4.5, y: 7},
	{name: "Berlin", x: 7, y: 6.5},
	{name: "Munich", x: 6.5, y: 5},
	{name: "Vienna", x: 8.5, y: 5},
	{name: "Rome", x: 6.5, y: 3},
	{name: "Madrid", x: 1.5, y: 3.5},
	{name: "Barcelona", x: 3, y: 3},
	{name: "Warsaw", x: 9.5, y: 6},
	{name: "Stockholm", x: 8, y: 9},
	{name: "Prague", x: 7.5, y: 5.5},
}
Population ¶
The population is a set of candidate routes. Each candidate is a permutation of city indices that represents a complete tour. The initial population is generated by shuffling a base route, which guarantees that every initial candidate is valid. In TSP, a valid route is one that includes every city exactly once. This constraint also affects the design of crossover and mutation operators, as we will see later.
func makeInitialPopulation() []candidate {
	population := make([]candidate, populationSize)
	baseRoute := make([]int, len(cities))
	for index := range baseRoute {
		baseRoute[index] = index
	}
	for index := range population {
		route := slices.Clone(baseRoute)
		rand.Shuffle(len(route), func(left, right int) {
			route[left], route[right] = route[right], route[left]
		})
		population[index] = candidate{route: route}
	}
	return population
}
Genome ¶
The genome is an integer slice that represents the order of cities in the route. For example, a genome like [6 7 8 10 11 9 5 1 0 2 3 4] means that the route starts at city 6 (Rome), then goes to city 7 (Madrid), then city 8 (Barcelona), and so on. Every city must appear exactly once in the genome.
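The main function shown later checks this constraint with an isValidRoute helper that is not listed in the post. A minimal version might look like this sketch (the actual implementation may differ):

```go
package main

import "fmt"

// isValidRoute reports whether the route visits every city index in
// [0, len(route)) exactly once, i.e. whether it is a valid permutation.
// The number of cities is assumed to equal len(route).
func isValidRoute(route []int) bool {
	seen := make([]bool, len(route))
	for _, cityIndex := range route {
		if cityIndex < 0 || cityIndex >= len(route) || seen[cityIndex] {
			return false
		}
		seen[cityIndex] = true
	}
	return true
}

func main() {
	fmt.Println(isValidRoute([]int{2, 0, 1})) // true
	fmt.Println(isValidRoute([]int{2, 0, 2})) // false: city 2 appears twice
}
```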
Fitness ¶
The fitness of a candidate route is based on its total distance. The distance of a route is the sum of the Euclidean distances between consecutive cities, including the final edge back to the start. The fitness is defined as the inverse of the distance, so shorter routes have higher fitness.
func evaluatePopulation(population []candidate) {
	for index := range population {
		distance := routeDistance(population[index].route)
		population[index].distance = distance
		population[index].fitness = 1 / distance
	}
}

func routeDistance(route []int) float64 {
	total := 0.0
	for index := range route {
		current := cities[route[index]]
		next := cities[route[(index+1)%len(route)]]
		total += distance(current, next)
	}
	return total
}
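The routeDistance function relies on a distance helper and the city type, which are not listed above. A plausible version, assuming plain Euclidean distance as described, is:

```go
package main

import (
	"fmt"
	"math"
)

type city struct {
	name string
	x, y float64
}

// distance returns the Euclidean distance between two cities.
func distance(a, b city) float64 {
	return math.Hypot(a.x-b.x, a.y-b.y)
}

func main() {
	london := city{name: "London", x: 2, y: 7}
	amsterdam := city{name: "Amsterdam", x: 4.5, y: 7}
	fmt.Printf("%.1f\n", distance(london, amsterdam)) // 2.5
}
```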
Selection ¶
The next generation is built by first copying the best eight candidates unchanged (elitism). The remaining candidates are generated by selecting two parents via tournament selection, crossing them over, and mutating the child. The population size in this example is 250, so each new generation consists of 8 elite candidates and 242 children produced by crossover and mutation.
func nextGeneration(population []candidate) []candidate {
	next := make([]candidate, 0, len(population))
	for _, elite := range population[:eliteCount] {
		next = append(next, candidate{route: slices.Clone(elite.route)})
	}
	for len(next) < len(population) {
		parentA := tournamentSelect(population)
		parentB := tournamentSelect(population)
		childRoute := orderCrossover(parentA.route, parentB.route)
		mutate(childRoute)
		next = append(next, candidate{route: childRoute})
	}
	return next
}

func tournamentSelect(population []candidate) candidate {
	best := population[rand.IntN(len(population))]
	for draw := 1; draw < tournamentSize; draw++ {
		challenger := population[rand.IntN(len(population))]
		if challenger.fitness > best.fitness {
			best = challenger
		}
	}
	return best
}
Crossover ¶
The crossover operator for TSP is more complex than the one used in the string example because we need to ensure that the child route is valid (i.e., it contains every city exactly once). The program uses order crossover, which works as follows:
- Copy a contiguous slice from the first parent into the child; the start and end of the slice are chosen randomly.
- Walk through the second parent in order.
- Insert only the cities that are not already present in the child.
- Wrap around until the child is full.
func orderCrossover(parentA []int, parentB []int) []int {
	child := make([]int, len(parentA))
	for index := range child {
		child[index] = -1
	}
	left := rand.IntN(len(parentA))
	right := rand.IntN(len(parentA))
	if left > right {
		left, right = right, left
	}
	used := make([]bool, len(parentA))
	for index := left; index <= right; index++ {
		gene := parentA[index]
		child[index] = gene
		used[gene] = true
	}
	insertAt := (right + 1) % len(child)
	for offset := range parentB {
		gene := parentB[(right+1+offset)%len(parentB)]
		if used[gene] {
			continue
		}
		child[insertAt] = gene
		used[gene] = true
		insertAt = (insertAt + 1) % len(child)
	}
	return child
}
Mutation ¶
Mutation for this TSP example is implemented as a swap mutation, where two cities in the route are swapped with a certain mutation rate. This is a good fit for a permutation genome because a swap always produces another valid permutation.
func mutate(route []int) {
	if rand.Float64() >= mutationRate {
		return
	}
	left := rand.IntN(len(route))
	right := rand.IntN(len(route))
	for left == right {
		right = rand.IntN(len(route))
	}
	route[left], route[right] = route[right], route[left]
}
Termination ¶
The termination condition in this example uses a plateau-based approach. The algorithm keeps track of the best distance found so far and counts how many generations have passed without a meaningful improvement (defined as an improvement greater than convergenceEpsilon). If the number of stagnant generations reaches convergenceWindow, the algorithm terminates, assuming that it has converged to a good solution.
func main() {
	population := makeInitialPopulation()
	evaluatePopulation(population)
	sortPopulation(population)
	startBest := population[0]
	fmt.Printf("Initial best distance: %.2f\n", startBest.distance)
	fmt.Printf("Initial route: %s\n\n", formatRoute(startBest.route))
	bestDistance := startBest.distance
	stagnantGenerations := 0
	finalGeneration := 0
	for generation := 1; ; generation++ {
		population = nextGeneration(population)
		evaluatePopulation(population)
		sortPopulation(population)
		finalGeneration = generation
		best := population[0]
		if best.distance < bestDistance-convergenceEpsilon {
			bestDistance = best.distance
			stagnantGenerations = 0
		} else {
			stagnantGenerations++
		}
		if generation%progressInterval == 0 || generation == 1 {
			averageDistance := averageDistance(population)
			fmt.Printf(
				"Generation %3d | best distance %.2f | avg distance %.2f | stagnant %3d | route %s\n",
				generation,
				best.distance,
				averageDistance,
				stagnantGenerations,
				formatRoute(best.route),
			)
		}
		if stagnantGenerations >= convergenceWindow {
			fmt.Printf(
				"Converged after %d generations with no meaningful improvement for %d generations.\n",
				generation,
				stagnantGenerations,
			)
			break
		}
	}
	best := population[0]
	fmt.Println()
	fmt.Printf("Total generations: %d\n", finalGeneration)
	fmt.Printf("Final best distance: %.2f\n", best.distance)
	fmt.Printf("Improvement: %.2f%%\n", percentImprovement(startBest.distance, best.distance))
	fmt.Printf("Final route: %s\n", formatRoute(best.route))
	if !isValidRoute(best.route) {
		panic("best route is invalid")
	}
}
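The main function also uses a few helpers that are not listed in the post (sortPopulation, formatRoute, averageDistance, percentImprovement). Here is a sketch of two of them; the actual implementations may differ:

```go
package main

import (
	"fmt"
	"slices"
)

type candidate struct {
	route    []int
	distance float64
	fitness  float64
}

// sortPopulation orders candidates by ascending distance, so the shortest
// route is always at index 0.
func sortPopulation(population []candidate) {
	slices.SortFunc(population, func(left, right candidate) int {
		switch {
		case left.distance < right.distance:
			return -1
		case left.distance > right.distance:
			return 1
		default:
			return 0
		}
	})
}

// percentImprovement returns how much shorter the final distance is,
// relative to the starting distance, as a percentage.
func percentImprovement(start, final float64) float64 {
	return (start - final) / start * 100
}

func main() {
	population := []candidate{{distance: 40}, {distance: 25}, {distance: 31}}
	sortPopulation(population)
	fmt.Println(population[0].distance)     // 25
	fmt.Println(percentImprovement(40, 25)) // 37.5
}
```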
Running the example ¶
You can run the TSP example with this command:
cd tsp
go run .
The algorithm converges quite quickly to a good solution. In my tests it runs for about 60 generations. The convergence window is set to 50, so the algorithm terminates when it detects that there has been no meaningful improvement for 50 generations. That means the algorithm has found a good solution in about 10 generations and then spends the next 50 generations confirming that it has converged.
Wrapping up ¶
We have seen how to implement a genetic algorithm in Go, starting with a simple string example and then moving to a more complex TSP example. We have covered the key components of a genetic algorithm: population, genome representation, fitness evaluation, selection, crossover, mutation, and termination.
In real-world applications, genetic algorithms can be a powerful tool for solving complex optimization problems where the search space is large and not well understood. However, they require careful tuning of parameters and operators to achieve good results, and they are not guaranteed to find the optimal solution. Nevertheless, if a good-enough solution is enough for the practical goal, a genetic algorithm is often a good choice.