My home city of Toronto, as the old joke goes, has two seasons: winter and road construction.
There is a lot of truth to that. As a runner, cyclist and driver, I can’t help but notice that from April to October it’s virtually impossible to navigate between any two points in the city without encountering a detour or lane restriction. Moreover, in the southeast, the roads and highways are prone to flooding: the Don River, which cuts through the city and flows into Lake Ontario, occasionally overflows its banks when melting snow or spring rainfall is particularly heavy. Another problem is that older Toronto houses like mine can be susceptible to wet basements when the aging storm sewers back up, which led the municipal government to institute a downspout disconnection bylaw: our eavestrough downspouts must not drain into the storm sewer system; rather, they are supposed to disperse rainwater over the ground surface.
A few weeks ago, my brother Klaas Kraay, the Toronto Metropolitan University philosopher, forwarded me an e-mail from a colleague, Xian-Xun (Arnold) Yuan, a professor in the school’s department of Civil Engineering. The point of the e-mail was that Dr. Yuan had just run a paper through Google’s new Generative AI tool, NotebookLM, and was very excited about the results. NotebookLM generates a summary of any document you feed it, which sounds pretty conventional, but its trick is to produce its output as audio, in the form of a dialogue between two characters.
Upon first hearing this “virtual podcast” summary, I was impressed enough to think that it might be material for the Turing Test. But when my brothers and I started playing with it further, feeding it more papers, its “tells” soon became obvious—repeated turns of phrase and other occasionally clumsy linguistic cues that gave it away as not being human. Plus, if you understand the source document well enough, you can quickly catch errors, misconceptions and omissions in NotebookLM’s summary[1].
What Klaas didn’t expect was that, NotebookLM aside, I would actually be interested in the content of his colleague’s paper. Dr. Yuan and his coauthor, Amir Keshvari Fard, wrote about infrastructure asset management and the difficult problem of optimizing multiyear maintenance and rehabilitation (MMR) projects for roads, bridges, and water systems. Given my own academic background in optimization, and living daily with the urban frustrations I just described, I began reading the paper more carefully. Particularly noteworthy to me was the authors’ description of a computational optimization technique that I wasn’t deeply familiar with: the genetic algorithm.
Genetic algorithms are usually not classified as AI, although there is some overlap. They’re part of a field of computer science known as evolutionary algorithms, and are useful for solving large-scale optimization and search problems. There is a random element to how they start off, similar to Monte Carlo simulations, but, because they execute iteratively, improving their approximate solutions over many generations, I think they have much in common with machine-learning approaches to AI such as reinforcement learning.
So, how do genetic algorithms work—and is there also a connection to quantum computing?
It’s all in the DNA
Biology was never my strong suit, so I first had to brush up on the topic of genetics. Genes are made up of DNA, famous for its double helix structure[2]. In humans, the genome is organized into 23 pairs of chromosomes. Within the DNA, the basic units of genetic information are encoded from four chemical bases: Adenine, Guanine, Cytosine and Thymine, usually abbreviated as A, G, C and T. Billions of pairs of these bases ultimately define what makes you, you.
During reproduction, each parent contributes one set of chromosomes to their offspring, and how those chromosomes and their bases pair up determines the traits that children inherit from their forebears. Mutations may be introduced due to errors in the DNA replication or environmental factors like chemicals or radiation, leading, of course, to evolution by natural selection—survival of the fittest.
Genetic algorithms date back to John Holland’s pioneering work in 1975, and the premise is simply to borrow all this biological language, mapping it onto data structures in computer science. A data structure is nothing more than a collection of variables, each of which has a data type and a range of values. For example, a data structure describing a person might have variables for hair colour, eye colour, skin tone, height, weight, age, etc. The data types could be hues, centimetres, kilograms and years, and you can see how those variables would take different possible values. In genetic algorithms, we call the data structure a chromosome and the variables genes. Chromosomes are also often referred to as individuals, and multiple individuals form a population—or, more accurately, a generation. It’s not a perfect mapping to human genetics, but it will do.
For the genetic algorithm to work, the programmer must create an initial population, or first generation, of individuals. This is done stochastically—after deciding how many individuals are needed for a representative sample, random values are selected for each variable in all the data structures. The population size, or number of individuals, depends on the complexity of the data structure—you need to get a good selection from all the possible values for all the genes in a chromosome. You can easily have hundreds or thousands of individuals, and the population size can increase quickly with the number of variables.
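To make this concrete, here is a minimal sketch of building an initial population. The gene names and value ranges are my own illustrative choices (echoing the person example above), not anything from a particular GA library:

```python
import random

# Each gene has a set of possible values; these fields are illustrative.
GENE_SPECS = {
    "hair_colour": ["black", "brown", "blond", "red"],  # categorical gene
    "height_cm": range(140, 211),                       # integer gene
    "age_years": range(0, 101),
}

def random_individual():
    """Create one chromosome by picking a random value for every gene."""
    return {gene: random.choice(list(values)) for gene, values in GENE_SPECS.items()}

def initial_population(size):
    """Stochastically generate the first generation."""
    return [random_individual() for _ in range(size)]

population = initial_population(500)
```

With more genes, or genes with wider ranges, you would raise the population size to keep the initial sample representative.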
Darwinian logic
Optimization, if you recall from one of my earlier posts, is all about finding better and better solutions to something called an objective function. The objective function is a mathematical representation of the problem you want to solve—like finding the most efficient way to load cargo in an airplane or container ship, or maximizing the geographical coverage of cellular service with the fewest possible towers. The objective function is evaluated on each of the individuals in the population, and the result is that individual’s fitness value.
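As a toy stand-in for those real objective functions, here is fitness evaluation on the classic "OneMax" problem, where individuals are bit strings and the objective simply counts the ones. The real objectives are far more elaborate, but the mechanics are the same: objective(individual) gives that individual's fitness.

```python
import random

def objective(individual):
    """Toy objective: fitness is the number of 1-bits in the chromosome."""
    return sum(individual)

# A population of 50 random 20-bit individuals.
population = [[random.randint(0, 1) for _ in range(20)] for _ in range(50)]

# Evaluate the objective on every individual to get its fitness value.
fitnesses = [objective(ind) for ind in population]
best = max(fitnesses)
```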
Here's where it gets interesting. Individuals in the population are selected and paired up for breeding, and this is done pseudo-randomly, with a bias toward individuals with better fitness values. But unlike biological parents, who each contribute a set of chromosomes complete with their A, G, C and T base pairs to their offspring, the genetic algorithm implements something called a crossover function. There’s a very good Wikipedia article describing how crossovers work, but the basic idea is that the algorithm randomly selects variables from each parent’s chromosome, or data structure, and puts them into the child’s chromosome.
Remember that in classical computing, everything maps to ones and zeros, so you can think of the chromosome data structures as just long strings of bits—and these are of equal length between the two parents, since only the values of the variables differ but the data structures and variable types are identical. The algorithm picks random crossover points in the bit string and selects the data from one parent or another between those crossover points, contributing the selections to a new, complete bit string for the child. Multiple offspring can be generated this way, depending on how many crossover points are chosen. There are many different mathematical techniques for making the selections and executing the crossover, all with the intent to ensure that the child’s data structure is complete and coherent. Developers can use whichever technique works best based on the unique characteristics of the problem they’re solving and the nature of the data structure they’re working with.
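Here is one common variant, two-point crossover, sketched on equal-length bit strings. The choice of two cut points is just one of the many techniques mentioned above:

```python
import random

def two_point_crossover(parent_a, parent_b):
    """Produce two children by swapping the segment between two random cut points."""
    assert len(parent_a) == len(parent_b)
    i, j = sorted(random.sample(range(len(parent_a)), 2))
    # Child 1 takes the middle segment from parent B, the rest from parent A;
    # child 2 is the mirror image, so both bit strings stay complete.
    child1 = parent_a[:i] + parent_b[i:j] + parent_a[j:]
    child2 = parent_b[:i] + parent_a[i:j] + parent_b[j:]
    return child1, child2

a = [0] * 10  # one parent, all zeros, to make the swap visible
b = [1] * 10  # the other parent, all ones
c1, c2 = two_point_crossover(a, b)
```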
Of course, as the genetic algorithm breeds new offspring from a generation of data structures, it can also introduce mutation. This is done by randomly changing the values of some of the variables during the crossover, potentially introducing new characteristics to the offspring data structures.
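A mutation step can be as simple as flipping each bit with a small probability; the 1% default below is a typical illustrative choice, not a universal setting:

```python
import random

def mutate(chromosome, rate=0.01):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if random.random() < rate else bit for bit in chromosome]

child = [0, 1, 0, 1, 1, 0, 0, 1]
mutated = mutate(child, rate=0.5)  # exaggerated rate just to make flips likely
```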
Once the breeding process is finished, the result is the next generation, or population, of individuals. Now the objective function is executed on these new data structures and, once again, the individuals are evaluated for fitness, paired up, and another round of breeding with crossovers and mutations starts. The cycle is repeated as many times as necessary: until a satisfactory solution is reached, a fixed number of iterations has been executed, or the algorithm plateaus and stops producing better results.
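The whole cycle can be sketched end to end on the toy OneMax problem (fitness is the count of 1-bits, so the ideal individual is all ones). All parameter values here are illustrative, and the selection scheme is a simple tournament, one of many ways to bias breeding toward fitter individuals:

```python
import random

GENOME_LEN, POP_SIZE, GENERATIONS, MUTATION_RATE = 30, 60, 40, 0.02

def fitness(ind):
    return sum(ind)  # toy objective: count the 1-bits

def select(population):
    """Tournament selection: the fitter of two random individuals wins."""
    a, b = random.sample(population, 2)
    return a if fitness(a) >= fitness(b) else b

def crossover(p1, p2):
    """Single-point crossover at a random cut."""
    cut = random.randrange(1, GENOME_LEN)
    return p1[:cut] + p2[cut:]

def mutate(ind):
    return [1 - g if random.random() < MUTATION_RATE else g for g in ind]

# Random first generation, then repeat: select, cross over, mutate.
population = [[random.randint(0, 1) for _ in range(GENOME_LEN)]
              for _ in range(POP_SIZE)]
for _ in range(GENERATIONS):
    population = [mutate(crossover(select(population), select(population)))
                  for _ in range(POP_SIZE)]

best = max(population, key=fitness)
```

After a few dozen generations the best individual is typically all, or nearly all, ones.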
Sex and the city
Genetic algorithms are a popular approach to solving many complex optimization problems, and a useful alternative to other optimization techniques such as gradient descent[3]. Fard and Yuan, in their paper, point out that optimization of municipal infrastructure multiyear maintenance and rehabilitation (MMR) is exactly the kind of problem that suffers from combinatorial explosion, or the curse of dimensionality—an exponential increase in complexity when you consider multiple years and add more miles of road to be paved, or sewers to be inspected and repaired. So the solutions generated, although not bad, are far from perfect, and the practical result is, in part, the traffic congestion I deal with as a commuter.
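A back-of-the-envelope calculation shows how fast the search space blows up. The numbers here are made up for illustration, not taken from the paper:

```python
# If each of 100 road segments can receive one of 5 possible treatments
# in each of 10 years, the number of distinct multiyear plans is:
plans = 5 ** (100 * 10)

# That is a number with 699 digits; exhaustive enumeration is hopeless,
# which is exactly why approximate methods like genetic algorithms are used.
digits = len(str(plans))
```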
The authors’ insight is to improve the crossover and mutation functions. In traditional genetic algorithms, the quality of the data involved in crossover and mutation isn’t always considered, sometimes or even often yielding infeasible offspring and therefore making the algorithm less efficient. Without going into details of all the linear algebra involved in their innovation, they decided to work at the level of the annual plan—a mathematical description of everything that needs to be done for, say, a certain road segment or sewer pipe section over the course of a year, including a budget allocated for the work. Their crossover function swapped annual plans and introduced constraints to mutation so as not to exceed budget limits.
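The following is a heavily simplified sketch of that idea as I understand it. The data model is hypothetical (a chromosome as a list of annual plans, each plan a mapping of assets to treatment costs) and is not the authors' actual formulation; the point is only to show crossover at the level of whole years and mutation constrained by a budget:

```python
import random

BUDGET = 100.0  # illustrative annual budget limit

def crossover_annual(parent_a, parent_b):
    """Swap one whole annual plan between two multiyear schedules."""
    year = random.randrange(len(parent_a))
    child = list(parent_a)
    child[year] = parent_b[year]  # inherit that year's plan wholesale
    return child

def mutate_annual(plan, asset, new_cost):
    """Change one asset's treatment cost, rejecting over-budget results."""
    candidate = dict(plan, **{asset: new_cost})
    if sum(candidate.values()) <= BUDGET:
        return candidate
    return plan  # constraint violated: keep the feasible original

# Two three-year schedules (asset names are invented for illustration).
sched_a = [{"road_A": 40.0}, {"sewer_B": 30.0}, {"road_C": 20.0}]
sched_b = [{"road_A": 10.0}, {"sewer_B": 60.0}, {"road_C": 50.0}]
child = crossover_annual(sched_a, sched_b)

year1 = {"road_A": 40.0, "sewer_B": 30.0}
mutated = mutate_annual(year1, "road_A", 80.0)  # 80 + 30 > 100, so rejected
```

Because every swapped unit is a complete, budget-checked annual plan, offspring stay feasible instead of being discarded.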
Fard and Yuan achieved some impressive results. While conventional genetic algorithms were still chugging away after 11 hours of computation and 1,000 generations, their modified algorithm achieved multiple feasible local minima, or solutions to the objective function, after only 20 generations and a few seconds of computing time. I can only hope that Toronto’s City Council notices these results; I would love to look forward to shorter commute times and less flooding, but I won’t hold my breath for our municipal politicians to follow the mathematics.
But nature is quantum
I’ve quoted Richard Feynman before: “Nature isn’t classical, dammit!” So, I was very interested to see that there was a presentation at IBM’s TechXchange conference in Las Vegas last October, on the topic of quantum genetic algorithms. It was delivered by a team of IBMers and consultants working on the analysis of seismic data for oil and gas exploration. Using the data to try to map out properties of geological features, especially offshore, is extremely complicated—so much so that exact solutions are impractical, and the team used optimization algorithms to find approximate solutions instead.
Having worked with classical genetic algorithms, they decided to experiment with pure quantum and hybrid quantum-classical genetic algorithms. In the quantum genetic algorithm, all the data were represented as qubits and the objective function was implemented as a quantum circuit. In the hybrid or “dual-unit” approach, only some of the data were implemented as qubits and the objective function was also hybrid, with some parts implemented classically and others as quantum circuits. The team ran on IBM’s go-to quantum workhorse, a Quantum System One with a 127-qubit “Eagle” processor.
The results showed by far the best improvement in optimization came from the hybrid approach. The pure quantum genetic algorithm performed about twice as well as the classical, but the hybrid algorithm yielded a surprising 170 times improvement in the fitness of the offspring, after only 100 generations. I suppose it goes to show that quantum computing, while very good at certain mathematical operations, often provides little to no advantage for others—so you have to choose very carefully where you will deploy it.
It's interesting to see how biology can provide a useful contribution to computer science in the form of genetic algorithms—and their use cases extend far beyond the two that I just recently happened to stumble across. It seems to me that they sit in a computational sweet spot at the intersection of mathematics, AI and quantum.
Mix all three together in just the right way, and evolution will become revolution.
[1] Which raises the question: why bother using AI document summaries at all? If you know the original well enough to be able to catch the errors, you don’t need the summary. And if you don’t know enough to catch the errors, you can’t trust the summary.
[2] The X-ray diffraction work of biophysicist Rosalind Franklin at King’s College London provided crucial evidence for the double helix structure that James Watson and Francis Crick published in 1953. Franklin died at the very young age of 37, and because Nobel prizes are not awarded posthumously, Watson, Crick and Maurice Wilkins got all the glory in 1962. Shamefully, Watson and Crick did not give Franklin the credit she richly deserved.
[3] Imagine a graph of the objective function, and then walking backward or forward along the graph. You can choose the size of step you take and the direction you go, and hope that you will always go downhill toward a good local minimum. If your steps are too short, you might never get there, but if your steps are too big, you might miss the local minimum and find yourself going back uphill again.
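A minimal sketch of this walk, on the one-variable objective f(x) = (x − 3)², whose minimum sits at x = 3. The step size plays exactly the role described above:

```python
def grad(x):
    """Derivative of the objective f(x) = (x - 3)**2."""
    return 2 * (x - 3)

x, step = 0.0, 0.1
for _ in range(100):
    x -= step * grad(x)  # walk downhill along the slope

# x is now very close to 3; a much larger step (say 1.1) would overshoot
# the minimum on every iteration and diverge uphill instead.
```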