Introduction

Evolutionary algorithms are a family of optimization algorithms based on the idea of biological evolution. The algorithm simulates the phenomenon that biological entities take generations to adapt to the environment. It is like a refining process that will help to converge to a (near-)optimal solution. For machine learning, evolutionary algorithms can be used for feature selection, hyperparameter tuning, and the evolution of a neural network architecture.

The typical steps of an evolutionary process includes: initialization, evaluation, selection, crossover, mutation, and replacement. For initialization, an initial population of agents is created, each adopts a candidate solution. A candidate solution is a possible strategy or possible solution for the problem, such as a set of model parameters, a set of selected features, or a network architecture. In the second step of evaluation, we evaluate the fitness of each agent if the population. The fitness score can be calculated depending on the purpose, it could be the performance of the model on a validation set. The selection process then happens in the style of survival of the fittest. Strategies with higher fitness have higher probability of being selected for reproduction. Then in the crossover step, offsprings are created by combining parent strategies, including averaging the weights of two neural networks and combining substructures of different networks. An important step is to add mutation to the selection pool. Some offspring will be chosen randomly to modify their traits. It could involve perturbing the weights of a neural network or randomly modify part of its architecture. After reproducing the offsprings, we replace some of the parent population with the new offsprings. The stopping condition is a number of generations or a certain level of performance.

The evolutionary algorithm is simple to understand, can explore a substantial part of the searching space, and can escape local optima. But for large problems, it can be computationally costly, requiring skillfull engineering of the tuning process.

Example

An example for the use of evolutionary algorithm is in automated antenna design, for projects of NASA Space Technology mission. Since the hand design practice requires domain knowledge and lots of work, researchers have been considering evolutionary design. The field grows with computer speed and ability to do electromagnetics simulators. For two NASA missions ST5 and TDRS-C, it takes three-four months to evovle the desired and operational antenna design. Usually it takes 5 months to design antenna manually. And there were thousands more design variations explored than with manual labor. The evolved designs are also more efficient, better data throughput, less power.

Screenshot 2023-06-22 at 14 33 07

Image: evolved antenna for ST5

Screenshot 2023-06-22 at 14 33 24

Image: evolved antenna for TDRS-C

Another example is the use of evolutionary algorithm in evolving neural network architecture in image classifier task. It evolves neural network architecture that surpass human design. The model is named AmoebaNet-A. It also achieves state of the art results on ImageNet, matched against models crafted by complex search methods. When matched against a RL algorithm, the evolutionary method gets results faster with the same hardware. In short, evolutionary method is a simple method that can produce high quality architectures for the architecture search task.

Before anything else, note that there are a whole range of different methods to search for architectures: cascade-correlation, boosting, hill-climbing, random search, grid search, etc. There are additional techniques to fine tune those methods, too, such as progressive complexity search stages, hypernets, accuracy prediction, warm starting and ensembling, parallelization, reward shaping and early stopping.

Screenshot 2023-06-22 at 15 55 01

AutoML-Zero

AutoML is a field that automates the discovery of complete machine learning algorithms using basic mathematical operations as building blocks. AutoML-Zero is a model that can evolve machine learning algorithms from scratch. Modern techniques emerge such as bilinear interaction, normalized gradient, and weight average. Usually, the search for architecture starts with expert-designed layers as building blocks, or isolate algorithmic aspects of the model. Those are still restrictive and sometimes doesn’t save time and effort. AutoML-Zero aims to address this by propose searching an entire ML algorithms with little restriction on form, using only simple mathematical operations as building blocks. This allows for fine grained search of the model, optimization procedure, initialization, and even non-neural network algorithms. The generated ML algorithm is considered a computer program with three components: Setup, Predict, and Learn, that perform initialization, prediction and learning. The parameters are on operation and memory addresses for each instruction. This reduces expert design, but make the space sparse and huge.

Specifically, the algorithms are represented as computer programs having virtual memory and separate address spaces for scalar, vector and matrix variables. Programs are sequences of instructions. Each instructions has an operation: read inputs from scalar address 0 and vector address 3, multiply, write the output to vector address 2. The algorithm (including setup, predict, learn) is evaluated on training and validation set.

Screenshot 2023-06-22 at 17 46 39

The evolution is then done in cycles. In the first cycle, the population has P algorithms, all empty. The algorithms are then improved through cycles. In each cycle, T < P algorithms are picked at random and the best performer is selected as the parent (this is called tournament selection). The parent is then copied and mutated to produce a child algorithm, while the oldest algorithm would be removed. The mutation has three cases: insert or remove at random an instrcution at a random location in a component, randomize all the instructions in a component, replace one of the arguments of an instruction with a random choice (for example, swap the output address, or change the value of a constant). The evolution is distributed across worker processes. Each process has its own P-sized population. Periodically, workers exchange randomly selected algorithm (migration).

It is found that evolutionary method is five times better than a random search. For complex task, random search no longer able to find a solution. But evolution method thrives, it doesn’t just discover the forward pass of a neural network, it also invents back propagation to learn the weights. The task is binary classification task on MNIST and CIFAR. After the experiments, it is proven that the framework can discover commonly used algorithms from scratch. Note that only basic high school mathematical concepts were used, complex machine learning concepts were purposely left out.

Conclusion

In conclusion, evolutionary algorithms bring a fascinating dimension to the field of machine learning. Inspired by biological evolution, these algorithms use principles such as mutation, crossover, and natural selection to explore and search within a problem space. This exploration allows for the creation of potentially innovative solutions that might not be discovered using traditional optimization methods.

In the context of machine learning, evolutionary algorithms have proved to be an effective strategy in various scenarios, ranging from hardware designing to neural network architecture design, and even to crafting entirely new machine learning algorithms as exemplified by AutoML-Zero.

Despite the challenges, the potential of evolutionary algorithms in advancing machine learning is immense. As computational resources continue to improve and more efficient implementations of evolutionary algorithms are developed, we can expect to see further innovative applications and results from this field. Therefore, understanding and incorporating evolutionary algorithms into the toolbox of machine learning techniques will undoubtedly prove invaluable for future breakthroughs in the field.