Colouring Outside the Lines

Published by Abhinav Bana

Learning to make better decisions from evolution

The first few things that come to mind when we think about the word ‘random’ are ‘unordered’, ‘chaos’, ‘uncontrolled’, ‘non-deterministic’, even ‘non-linear’, or, as people who love mathematics, statistics and probability call it, ‘stochastic’.

Randomness is all around us: from the random motion of particles suspended in a fluid (Brownian motion), to the movement of stock prices, to the outcome of rolling a die or flipping a coin. Random changes in every living creature’s genetic code over millions of years have led to the stunning diversity of life on this planet. Humans have called this randomness ‘chance’ or ‘luck’ since the earliest days. But over the last few decades, our understanding of this randomness, and especially of how we can define it, model it and then use it as a tool to solve complex problems, has grown enormously.

In statistics, we deal with probabilities and probability distributions as a way to model real-world situations. A distribution is a mathematical description of a random phenomenon in terms of its ‘sample space’ and the likelihood of each outcome. For example, a single coin toss can be thought of as a random event, with each outcome equally likely (a 50% chance of heads or tails). In the first few flips, the series of outcomes will look random (fewer heads and more tails, for example). But we know that repeated coin tosses fit a Binomial probability distribution, which gives us a specific model for answering questions about seemingly random outcomes. One such question: if you increase the number of flips to, say, 10, what is the chance of getting exactly 4 heads? (You can check that it is about 0.21, or roughly 20%.) So we have modelled the randomness of a single coin toss to our advantage and can now predict specific outcomes! We can take it even further and judge a series of coin tosses: if the outcomes do not fit this Binomial distribution, we can reasonably conclude that the coin is ‘biased’.
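As a quick sanity check on the 10-flip question, the Binomial probability can be computed in a couple of lines of Python (a minimal sketch using only the standard library; the function name is just for illustration):

```python
from math import comb

def prob_exact_heads(n_flips: int, n_heads: int, p: float = 0.5) -> float:
    """Binomial probability of exactly n_heads in n_flips of a fair coin."""
    return comb(n_flips, n_heads) * p**n_heads * (1 - p)**(n_flips - n_heads)

print(prob_exact_heads(10, 4))  # ~0.205, i.e. roughly a 20% chance
```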

Understanding this randomness in nature has led to major breakthroughs in science, especially the development of ‘stochastic’ algorithms and optimisation engines for solving non-linear problems. Random number generators are now a basic building block of many algorithms, and random sampling is used in pre-election polls, in manufacturing to estimate error rates, and across most major industries. And as part of our own learning, it is always more fun to colour outside the lines and see what else is hidden in the canvas of life.

Genetic Algorithms: Thriving on randomness

One of the most notable ways of harnessing nature’s randomness to solve complex problems has been the development of a family of algorithms that thrive on randomness. One such algorithm, which mimics our own physical world, is the Genetic Algorithm (GA). At the very core of this algorithm is the mechanism of ‘evolution’. For those new to this, Genetic Algorithms model how life evolves to fulfil a given purpose, i.e. to increase the probability of survival of a species (or, as we ‘code’ it, to minimise or maximise a given objective function). They do this by mirroring the stages of natural evolution.

To understand the role of Genetic Algorithms in helping us make better decisions in the real world, let’s set up a guiding example. 

Imagine a food delivery company trying to allocate vehicles and drivers to delivery routes in order to minimise total delivery cost.

We can consider its objective function to depend on the following variables (not an exhaustive list, but good enough for a demonstration):

  • v: list of vehicles and their running cost per hour
  • d: list of drivers and their labor cost per hour
  • r: list of routes
  • c: routes and their driving times in hours
  • a: a specific allocation of vehicles and drivers to routes (vehicle_driver_route_allocation in the DAG below)

or in scientific terms:

total delivery cost = function of (v, d, r, c, a)

It may also be helpful to visualize the lineage of calculations involved using a DAG as shown below:

```mermaid
graph TD
  labor_cost_per_hour --> |applies_to| drivers
  driving_time_in_hours --> |applies_to| routes
  running_cost_per_hr --> |applies_to| vehicles

  drivers --> |allocated_to| vehicle_driver_route_allocation
  routes --> |allocated_to| vehicle_driver_route_allocation
  vehicles --> |allocated_to| vehicle_driver_route_allocation

  vehicle_driver_route_allocation --> |estimates| total_running_cost
  running_cost_per_hr --> |estimates| total_running_cost
  driving_time_in_hours --> |estimates| total_running_cost

  vehicle_driver_route_allocation --> |estimates| total_labor_cost
  labor_cost_per_hour --> |estimates| total_labor_cost
  driving_time_in_hours --> |estimates| total_labor_cost

  total_labor_cost --> |estimates| total_delivery_cost
  total_running_cost --> |estimates| total_delivery_cost
```

Each of the above variables has a bound/constraint on the values it can take. We could, of course, try all permutations and combinations using a brute-force approach. For our simple example, with, let’s say, 3 drivers, 4 vehicles and 10 routes, there are only 10 × 4 × 3 = 120 route-driver-vehicle triples to evaluate individually, but a complete allocation assigns one of the 3 × 4 = 12 driver-vehicle pairings to each of the 10 routes, giving 12¹⁰ (roughly 6 × 10¹⁰) possible allocations. The number of triples grows as O(n³), and the number of complete allocations grows exponentially with the number of routes, quickly rendering the brute-force approach impractical. Alternatively, we can let nature do the heavy lifting of finding a solution:

  1. Creation of an initial ‘population’: a population is a collection of potential allocation options for our given problem (i.e. specific allocation options “a”, or vehicle_driver_route_allocation in the DAG above), uniformly sampled from the range of possible allocations.
  2. Evaluate Fitness: evaluate the total delivery cost of each allocation option in the ‘population’; call this its ‘fitness’.
  3. Parent Selection: for evolution to work as per ‘survival of the fittest’, the best-fit parents need to be selected and put into pairs. There are multiple parent-selection methods for GAs (fitness-proportionate selection, Roulette Wheel selection, Tournament Selection, Stochastic Universal Sampling, rank-based selection, or plain random sampling). Each has its own advantages, but Tournament Selection is a good starting point because it:
    • scales well on parallel compute architectures, and
    • is easy to implement and iterate upon.
  4. Crossover: once parents are selected, they are ‘mated’ in pairs to create a pair of ‘offspring’. There are again various methods for creating offspring. One such method is 1-point crossover, where each offspring inherits part of the genetic material of each parent; 2-point crossover, uniform crossover and multi-point crossover are other options. The most common approach is to use 1-point or 2-point crossover, as these tend to emulate evolution in nature.
  5. Mutation: this is a key step in a GA. Here we introduce ‘randomness’ into the offspring by randomly ‘mutating’, or changing, a small fraction of the offspring population. By doing this, we mimic how nature introduces diversity as well as unique traits and capabilities in living creatures to help them adapt and survive. If our solutions are clustering around a local minimum or maximum, mutation nudges the algorithm to explore a wider range of solutions by creating unique ones (for our food delivery firm, this could translate to a unique driver/route/vehicle combination which would otherwise go unnoticed, e.g. one driver servicing two different routes rather than always keeping a preferred roster, or drivers being upskilled to drive not just bikes but also cars).
  6. Next Generation: this step determines which ‘individuals’ from the parent and offspring populations are selected as candidates for the next generation. There is a whole host of selection methods, the main ones being elitism (keeping the best-fit individuals) and age-based selection. Care needs to be taken to maintain the diversity of the population (as in the real world).
  7. Exit Condition: if we have reached an acceptable threshold for minimising (or maximising) the objective function, we stop; otherwise we repeat steps 2-6, replacing the population with the next generation selected in step 6. A minimal skeleton of this loop is sketched below.
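To tie the seven steps together, here is a minimal, generic skeleton of the loop in Python. It is only a sketch: the function and parameter names are illustrative, the operators (create_individual, fitness, select_parents, crossover, mutate) are placeholders for the problem-specific pieces sketched later in this post, and a fixed generation budget stands in for the convergence check in step 7.

```python
import random

def run_ga(create_individual, fitness, select_parents, crossover, mutate,
           pop_size=50, generations=100, mutation_rate=0.1):
    """Generic GA loop following steps 1-7 above; minimises `fitness`."""
    # 1. Initial population
    population = [create_individual() for _ in range(pop_size)]
    best = min(population, key=fitness)
    for _ in range(generations):
        # 2. Evaluate fitness and remember the best individual seen so far
        best = min(population + [best], key=fitness)
        # 3-4. Select pairs of parents and create offspring via crossover
        offspring = []
        while len(offspring) < pop_size:
            parent_1, parent_2 = select_parents(population, fitness)
            child_1, child_2 = crossover(parent_1, parent_2)
            offspring.extend([child_1, child_2])
        # 5. Mutate a small fraction of the offspring
        offspring = [mutate(child) if random.random() < mutation_rate else child
                     for child in offspring]
        # 6. Next generation: the offspring plus the best-so-far (simple elitism)
        population = offspring[:pop_size - 1] + [best]
        # 7. Exit condition: here simply a fixed number of generations
    return min(population, key=fitness)
```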

The algorithm can be visualized as a sequence diagram below:

```mermaid
sequenceDiagram
    participant User
    participant GeneticAlgorithm
    participant Population
    participant FitnessEvaluation
    participant Selection
    participant Crossover
    participant Mutation
    participant Termination

    User->>GeneticAlgorithm: Initialize parameters and population
    loop Generations
        GeneticAlgorithm->>Population: Evaluate fitness of individuals
        Population->>FitnessEvaluation: Calculate fitness values
        Population->>GeneticAlgorithm: Return fitness values
        GeneticAlgorithm->>Population: Select individuals for reproduction
        Population->>Selection: Apply selection methods
        Population->>GeneticAlgorithm: Selected individuals
        GeneticAlgorithm->>Population: Perform crossover and mutation
        Population->>Crossover: Apply crossover operation
        Population->>Mutation: Apply mutation operation
        Crossover->>Population: Offspring from Crossover
        Mutation->>Population: Offspring from Mutation
        Population->>GeneticAlgorithm: New population with offspring
        GeneticAlgorithm->>Population: Check termination condition
        Population->>Termination: Termination condition reached?
    end
```

Solving Business Problems: Minimising Overall Delivery costs

Coming back to our food delivery business, let’s model the problem and then use our own implementation of a GA to find an optimal driver/vehicle allocation for each route.

Let’s assume the tables below give the hourly labor cost of each of our 3 drivers and the hourly running cost of each of our 4 vehicles:

| Driver Name | driver_1 | driver_2 | driver_3 |
| --- | --- | --- | --- |
| Labor Cost Per Hour ($) | 15 | 17 | 13 |

| Vehicle Name | vehicle_1 | vehicle_2 | vehicle_3 | vehicle_4 |
| --- | --- | --- | --- | --- |
| Running Cost Per Hour ($) | 26 | 24 | 25 | 27 |

Let us assume that route driving times are sampled from a normal distribution with mean 2 hours and standard deviation 1 hour. A sample would look like this:

| Route Name | Driving Time In Hours |
| --- | --- |
| route_1 | 2.5 |
| route_2 | 1.9 |
| route_3 | 2.6 |
| route_4 | 3.5 |
| route_5 | 1.8 |
| route_6 | 1.8 |
| route_7 | 3.6 |
| route_8 | 2.8 |
| route_9 | 1.5 |
| route_10 | 2.5 |

In the absence of any constraints, we would allocate the lowest-cost driver and the lowest-cost vehicle to ALL the routes. That is a trivial but impractical solution, since it overloads a single driver and a single vehicle. Hence we need to add some realistic constraints to our problem:

  • A driver can do at most 4 runs
  • A vehicle can do at most 3 runs

This ensures we are not overloading one driver or one vehicle. 

Our objective function simply calculates the total cost across all allocations.

Initial Population

To start off, we create a fixed number of random allocations of drivers and vehicles to routes; one of them is given below:

[15, 13, 13, 13, 13, 15, 15, 13, 13, 17,  24, 25, 25, 27, 24, 27, 26, 27, 25, 24]

In the above example, the leading 10 elements represent the driver allocated to each of the 10 routes (encoded by hourly cost rather than name), while the trailing 10 elements represent the vehicle allocated to each of the same 10 routes.

Since there are 3 drivers and 4 vehicles, we see 3 unique values showing up in the front half, but 4 unique values showing up in the back half.
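A minimal sketch of how such a random initial population could be generated for this encoding is shown below (the population size of 50 and the helper names are illustrative choices, not taken from the original implementation):

```python
import random

DRIVER_COSTS = [15, 17, 13]       # hourly labor cost of driver_1, driver_2, driver_3
VEHICLE_COSTS = [26, 24, 25, 27]  # hourly running cost of vehicle_1 .. vehicle_4
N_ROUTES = 10

def random_allocation():
    """One individual: a driver cost and a vehicle cost chosen for each of the 10 routes."""
    drivers = [random.choice(DRIVER_COSTS) for _ in range(N_ROUTES)]
    vehicles = [random.choice(VEHICLE_COSTS) for _ in range(N_ROUTES)]
    return drivers + vehicles     # first 10 entries: drivers, last 10: vehicles

# Initial population: a fixed number of random allocations
population = [random_allocation() for _ in range(50)]
```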

Fitness Evaluation

For each route, we can evaluate fitness based on our equation for total delivery cost.

In our example above, route_1 contributes (15 + 24) * 2.5 = 97.5 to the total delivery cost. The overall fitness of an allocation (one individual in the population) is the total delivery cost across all routes.
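A sketch of this fitness function under the same encoding is given below. The post does not spell out how the run-limit constraints are enforced inside the GA, so the penalty term here is an assumption (one common way to discourage infeasible allocations); it also relies on each driver and vehicle having a distinct hourly cost, which holds for this example.

```python
from collections import Counter

ROUTE_HOURS = [2.5, 1.9, 2.6, 3.5, 1.8, 1.8, 3.6, 2.8, 1.5, 2.5]
N_ROUTES = len(ROUTE_HOURS)

def total_delivery_cost(allocation, penalty=1000):
    """Fitness = sum over routes of (driver cost + vehicle cost) * driving time,
    plus a large penalty for each driver used more than 4 times or vehicle used
    more than 3 times (the penalty handling is an assumption of this sketch)."""
    drivers, vehicles = allocation[:N_ROUTES], allocation[N_ROUTES:]
    cost = sum((d + v) * hours for d, v, hours in zip(drivers, vehicles, ROUTE_HOURS))
    cost += sum(penalty for runs in Counter(drivers).values() if runs > 4)
    cost += sum(penalty for runs in Counter(vehicles).values() if runs > 3)
    return cost

# For the sample individual above, route_1 contributes (15 + 24) * 2.5 = 97.5
```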

Selection

In order to create the ‘next’ generation from the current ‘population’, we have to select pairs of ‘parents’ with the best ‘fitness’ for the problem we are trying to solve (in our case, the lowest total delivery cost). There are various selection methods; here we have used Tournament Selection to identify pairs for the next step.
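A minimal Tournament Selection sketch: draw a few individuals at random and keep the one with the best (lowest-cost) fitness. The tournament size of 3 is an illustrative choice.

```python
import random

def tournament_select(population, fitness, k=3):
    """Pick k random individuals and return the fittest (lowest total cost)."""
    return min(random.sample(population, k), key=fitness)

def select_parents(population, fitness):
    """A pair of parents chosen via two independent tournaments."""
    return tournament_select(population, fitness), tournament_select(population, fitness)
```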

Crossover

We take two of these allocations (selected based on their ‘fitness’ in the Selection step) and create two new ‘children’ by swapping over a part of each parent’s allocation choices. In 1-point crossover, for example, a single cut point is chosen and everything beyond it is swapped between the two parents. This crossover operation results in the birth of two new children, i.e. the exploration of two new allocations. A minimal sketch of the operation is given below.
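Here is a sketch of 1-point crossover on this encoding (the cut point is chosen at random in this version; that detail is an assumption of the sketch):

```python
import random

def one_point_crossover(parent_1, parent_2):
    """Swap everything after a random cut point between the two parents."""
    point = random.randint(1, len(parent_1) - 1)
    child_1 = parent_1[:point] + parent_2[point:]
    child_2 = parent_2[:point] + parent_1[point:]
    return child_1, child_2
```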

Mutation

From all the ‘children’ in the population, we choose a small number that undergo mutation, i.e. we randomly replace one of their values with another possible value. For example, one of the child allocations above could have a single driver or vehicle entry swapped for a different one, as sketched below.
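A minimal mutation sketch for this encoding: pick one position in the child at random and replace it with another valid driver or vehicle cost (the constants repeat those defined earlier; the exact mutation rule used in the original implementation is not specified, so this is an assumption):

```python
import random

DRIVER_COSTS = [15, 17, 13]       # as defined earlier
VEHICLE_COSTS = [26, 24, 25, 27]
N_ROUTES = 10

def mutate(allocation):
    """Randomly re-assign the driver or vehicle of one randomly chosen route."""
    child = list(allocation)
    position = random.randrange(len(child))
    # The first half of the encoding holds driver costs, the second half vehicle costs
    options = DRIVER_COSTS if position < N_ROUTES else VEHICLE_COSTS
    child[position] = random.choice(options)
    return child
```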

After this step, we replace the parent population with the child population and start again: calculating the fitness of all ‘child’ allocations, then selection, crossover and mutation. We keep doing this until the solution converges within a given limit (there are no more changes in the ‘best’ fitness across multiple generations). In our case, the total delivery cost stops improving beyond 960.2 after a few iterations, at which point we can safely assume the algorithm has converged.

Running a Genetic Algorithm-based optimisation on the above setup leads to the following “optimal” allocation:

| Driving Time (hrs) | Driver Name | Driver Cost ($/hr) | Vehicle Name | Vehicle Cost ($/hr) | Total Cost of Run ($) |
| --- | --- | --- | --- | --- | --- |
| 2.5 | driver_3 | 13 | vehicle_2 | 24 | 92.5 |
| 1.9 | driver_1 | 15 | vehicle_1 | 26 | 77.9 |
| 2.6 | driver_1 | 15 | vehicle_3 | 25 | 104 |
| 3.5 | driver_2 | 13 | vehicle_2 | 24 | 129.5 |
| 1.8 | driver_2 | 17 | vehicle_1 | 26 | 77.4 |
| 1.8 | driver_1 | 15 | vehicle_1 | 26 | 73.8 |
| 3.6 | driver_3 | 13 | vehicle_2 | 24 | 133.2 |
| 2.8 | driver_3 | 13 | vehicle_3 | 25 | 106.4 |
| 1.5 | driver_2 | 17 | vehicle_4 | 27 | 66 |
| 2.5 | driver_1 | 15 | vehicle_3 | 25 | 100 |
| Total Delivery Cost Across All Routes ($) | | | | | 960.2 |

It is worth noting that, in certain cases, simple rule-based allocation choices may arrive at similar allocations. For example, we could use the following rules (a sketch of this greedy rule follows the list):

  • Allocate the lowest-cost driver and vehicle to the longest route first
  • Do the same for the next-longest route, and so on; once a driver or vehicle reaches its run limit, start allocating the next-lowest-cost driver or vehicle
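Such a greedy rule might look like the sketch below (the data structures and tie-breaking are illustrative assumptions rather than the author's code, but with the sample data above it reproduces the 960.2 total shown in the table that follows):

```python
ROUTE_HOURS = {"route_1": 2.5, "route_2": 1.9, "route_3": 2.6, "route_4": 3.5,
               "route_5": 1.8, "route_6": 1.8, "route_7": 3.6, "route_8": 2.8,
               "route_9": 1.5, "route_10": 2.5}
DRIVERS = {"driver_1": 15, "driver_2": 17, "driver_3": 13}                       # $/hr
VEHICLES = {"vehicle_1": 26, "vehicle_2": 24, "vehicle_3": 25, "vehicle_4": 27}  # $/hr
MAX_DRIVER_RUNS, MAX_VEHICLE_RUNS = 4, 3

def greedy_allocation():
    """Longest route first; cheapest driver and vehicle still under their run limits."""
    driver_runs = {d: 0 for d in DRIVERS}
    vehicle_runs = {v: 0 for v in VEHICLES}
    plan, total_cost = {}, 0.0
    for route, hours in sorted(ROUTE_HOURS.items(), key=lambda kv: -kv[1]):
        driver = min((d for d in DRIVERS if driver_runs[d] < MAX_DRIVER_RUNS),
                     key=DRIVERS.get)
        vehicle = min((v for v in VEHICLES if vehicle_runs[v] < MAX_VEHICLE_RUNS),
                      key=VEHICLES.get)
        driver_runs[driver] += 1
        vehicle_runs[vehicle] += 1
        plan[route] = (driver, vehicle)
        total_cost += (DRIVERS[driver] + VEHICLES[vehicle]) * hours
    return plan, total_cost

plan, total_cost = greedy_allocation()
print(round(total_cost, 1))  # 960.2 with the sample data above
```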

Following this heuristic, we get the following allocation:

| Driving Time (hrs) | Driver Name | Driver Cost ($/hr) | Vehicle Name | Vehicle Cost ($/hr) | Total Cost of Run ($) |
| --- | --- | --- | --- | --- | --- |
| 2.5 | driver_1 | 15 | vehicle_3 | 25 | 100 |
| 1.9 | driver_1 | 15 | vehicle_1 | 26 | 77.9 |
| 2.6 | driver_3 | 13 | vehicle_3 | 25 | 98.8 |
| 3.5 | driver_3 | 13 | vehicle_2 | 24 | 129.5 |
| 1.8 | driver_1 | 15 | vehicle_1 | 26 | 73.8 |
| 1.8 | driver_2 | 17 | vehicle_1 | 26 | 77.4 |
| 3.6 | driver_3 | 13 | vehicle_2 | 24 | 133.2 |
| 2.8 | driver_3 | 13 | vehicle_2 | 24 | 103.6 |
| 1.5 | driver_2 | 17 | vehicle_4 | 27 | 66 |
| 2.5 | driver_1 | 15 | vehicle_3 | 25 | 100 |
| Total Delivery Cost Across All Routes ($) | | | | | 960.2 |

As we can see, both approaches lead to the same minimum total delivery cost, but the GA has explored many counterfactual choices and provided a nice alternative. In other words, it has removed the fear, uncertainty and doubt (FUD) about whether the simple allocation rule is leaving value on the table in this particular case.

However, as we make this simple setup more complicated and realistic, coming up with good heuristics becomes increasingly difficult. Genetic Algorithms give us a handy apparatus for modelling all of this complexity without the burden of reasoning about hand-crafted heuristics, as they “explore” and “exploit” allocation choices that can be evaluated on their evolutionary fitness.

As we extrapolate our model to cover complex delivery processes across transport, logistics and supply chains, problems such as optimal resource allocation and job allocation/scheduling in manufacturing, networks and everyday projects (construction, IT, etc.) can be tackled with this versatile tool.

GAs have also found uses in biosciences, healthcare, agriculture, gaming, insurance and virtually every other field of study. GAs are also used to perform hyper-parameter search for machine-learning models and play an essential part in AI/ML development. Genetic algorithms are very well suited to multi-parameter and highly non-linear problems.

Key points to note about GAs, and stochastic algorithms in general:

  • They are sensitive to their starting points; since these are randomly generated, different runs can produce different outputs (while still aiming to minimise or maximise the given objective function).
  • They sometimes get stuck in a local maximum or minimum; tuning hyper-parameters such as the mutation rate usually helps.
  • GAs usually get very close to the optimal solution, but do not converge to it exactly. As such, they are magnificent tools to exploit when we want an approximate solution to some really hard problem: a GA can quickly get close to a ‘good-enough’ solution by flipping random switches. GAs are preferred for optimisation problems that are not well suited to standard optimisation algorithms, e.g. problems with stochastic, highly non-linear or discontinuous objective functions, or with discrete decision variables (as in mixed-integer programming).

Summary

Genetic Algorithms are a versatile and powerful tool for solving business problems from the simple to the complex. Because they mimic natural evolution, they can comb through very large search spaces and arrive at an efficient solution, letting decision makers rest easier knowing that a wide range of decision-making options has been explored.

Be it product development, optimising business processes, efficiently designing a network, or finding the optimal allocation of your resources, there is almost always a possibility that GAs can be useful.

Most importantly, GAs offer us a scientific method to “Colour Outside The Lines” and explore possibilities which may be hidden in our blind spots.