What is the future of genetic algorithms
Reinforcement learning: genetic algorithm
Algorithms in the reinforcement learning category learn independently by trying to maximize rewards or minimize penalties. The underlying principle is trial and error, combined with an evaluation that rewards good (goal-oriented) behavior and punishes bad behavior patterns. A reward means that the rewarded behavior will be tried out more frequently in the future; a punishment means that the punished behavior will be tried out less frequently.
The algorithm runs through a large number of iterations in which it combines proven behavior patterns and randomly tries out new ones. In this way it approaches the optimum step by step. The best-known representatives of this category are genetic algorithms, which are modeled on Charles Darwin's theory of evolution.
Reinforcement learning is used in minimization and maximization tasks. It is also used in learning processes that must react to changing environmental influences. For example, reinforcement learning could be used to teach a colony of robot ants how to move around optimally. Each robot ant would initially try to move forward with a random movement technique.
Success can be measured (fitness function): the distance covered. In the next generation, exceptionally successful locomotion techniques are combined with one another more often than average (recombination), and their characteristics are inherited, which means they will be used more frequently in the future. A generation is the set of all individuals that can reproduce with one another at one step of the temporal chain of reproduction.
In addition, a new, random movement feature is always tried out with a certain probability (mutation rate). This corresponds to mutation in evolution. At the end of each generation, the fitness function is evaluated again. As a result, the robot ants become better and better at moving over the course of many generations.
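The robot-ant example above can be sketched as a minimal genetic algorithm. Everything here is a toy assumption: the "movement parameters" are just numbers, and the fitness function is a made-up stand-in for the measured distance covered.

```python
import random

# Toy stand-in for the robot-ant example: each individual is a list of
# movement parameters, and the (hypothetical) fitness plays the role of
# the distance covered.
GENES = 8            # number of movement parameters per ant
POP_SIZE = 30
MUTATION_RATE = 0.1  # probability of a random change per gene
GENERATIONS = 50

def fitness(individual):
    # Hypothetical fitness: reward parameters close to an optimal gait
    # (here, all ones) that the algorithm itself does not know.
    return -sum((g - 1.0) ** 2 for g in individual)

def recombine(parent_a, parent_b):
    # Recombination: each gene is taken from one parent at random.
    return [random.choice(pair) for pair in zip(parent_a, parent_b)]

def mutate(individual):
    # Mutation: with a small probability, replace a gene with a new
    # random value.
    return [random.uniform(-2, 2) if random.random() < MUTATION_RATE else g
            for g in individual]

def evolve():
    population = [[random.uniform(-2, 2) for _ in range(GENES)]
                  for _ in range(POP_SIZE)]
    for _ in range(GENERATIONS):
        # Selection: the fitter half survives and reproduces.
        population.sort(key=fitness, reverse=True)
        survivors = population[:POP_SIZE // 2]
        children = [mutate(recombine(random.choice(survivors),
                                     random.choice(survivors)))
                    for _ in range(POP_SIZE - len(survivors))]
        population = survivors + children
    return max(population, key=fitness)

best = evolve()
```

Because the fitter half is always carried over unchanged, the best fitness in the population can never decrease from one generation to the next in this particular sketch.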
Reinforcement learning has the advantage that learning also takes changing environmental factors into account. If the terrain changes now and then, for example because it rains and the ground becomes muddy, the evolution of movement techniques will take this into account. That is why life on Earth was able to continue despite ice ages and dry periods: it adapted to the new environmental influences.
The three most important principles in genetic algorithms are recombination, mutation, and selection. In biology, recombination is the random mixing of 50 percent of each parent's genetic make-up during sexual reproduction and its transmission to the child. In genetic algorithms, recombination is correspondingly the mixing of the properties of the parent generation when they are passed on to the child generation. Mutations, in genetic algorithms, are random changes in the properties of single individuals.
Selection means that individuals with better genes have a higher chance of living long and producing many offspring. Selection is driven by external pressure: predators in nature, food shortages, epidemics, climatic challenges, and so on. In genetic algorithms, selection usually takes place through a mathematical evaluation function: the so-called fitness function. This function awards points (a score) that measure how well the goals are achieved. Alternatively, the function can calculate costs, and the goal is then to minimize those costs. Costs can be very different things here: distances, monetary costs, fuel consumption, failure probability of components, and so on.
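Cost-based selection can be illustrated with a small sketch (all names and distance values here are made up): routes are evaluated by their total distance, and a tournament selection picks the individual with the lowest cost.

```python
import random

# Hypothetical example of selection by minimizing cost instead of
# maximizing a score: the "cost" of a route is its total distance.
def route_cost(route, distances):
    # Sum the distances between consecutive stops on the route.
    return sum(distances[(route[i], route[i + 1])]
               for i in range(len(route) - 1))

def tournament_select(population, cost, k=3):
    # Pick k random individuals; the one with the lowest cost wins.
    contenders = random.sample(population, k)
    return min(contenders, key=cost)

# Toy distance table for four stops A-D (made-up values).
distances = {("A", "B"): 2, ("B", "C"): 5, ("C", "D"): 1,
             ("A", "C"): 4, ("B", "D"): 3, ("A", "D"): 9}
# Make the table symmetric.
distances.update({(b, a): d for (a, b), d in list(distances.items())})

population = [["A", "B", "C", "D"], ["A", "C", "B", "D"], ["A", "D", "C", "B"]]
best = tournament_select(population, lambda r: route_cost(r, distances),
                         k=len(population))
```

Maximizing a score and minimizing a cost are interchangeable here: negating the cost turns the same function into a fitness score.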
How the algorithm works is shown in Figure 4.
The execution of the algorithm ends either when the target is reached or a target score (calculated with the mathematical evaluation function) is exceeded, or after a number of generations specified by the user.
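These two termination conditions can be sketched as follows; the function names and the `step` callback (which would perform selection, recombination, and mutation) are hypothetical.

```python
def run_ga(population, fitness, target_score, max_generations, step):
    """Run until the target score is reached or the generation limit is hit."""
    for generation in range(max_generations):
        best = max(population, key=fitness)
        # First termination condition: the target score is reached or exceeded.
        if fitness(best) >= target_score:
            return best, generation
        # Otherwise produce the next generation.
        population = step(population)
    # Second termination condition: the user-specified generation limit.
    return max(population, key=fitness), max_generations

# Toy usage: individuals are numbers, fitness is the identity,
# and "evolution" just adds 1 to every individual.
result = run_ga([1, 5, 7], fitness=lambda x: x, target_score=9,
                max_generations=10, step=lambda pop: [x + 1 for x in pop])
```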
Due to the way it works, a genetic algorithm has two important properties. First, it does not guarantee an optimal result, but usually only an improvement from generation to generation. In exceptional cases a subsequent generation can even deteriorate; this often happens when the mutation rate is too high.
Second, the end result achieved is neither the only possible one nor always repeatable. If you run the algorithm a second time with the same initial population, the result is often different from the first run, because recombination and mutation are influenced by chance. A thought experiment: if one could clone the early Earth of billions of years ago and let evolution take place on both Earths, then, assuming Charles Darwin's theory of evolution really describes the development of life correctly and completely, it would be very likely that on the second Earth not humans but another, perhaps more or less intelligent, form of life would arise. It is therefore often advisable to repeat the run of the genetic algorithm a few times and compare the results.