Training an AI until it beats me in Trackmania - Summary

Summary

The speaker describes a three-year journey to create an Artificial Intelligence (AI) that can beat them in the racing game Trackmania. The AI is designed to improve over time through trial and error, using an artificial neural network and a method called Reinforcement Learning. The AI starts with zero prior knowledge, so its decisions are initially random. It is rewarded for every action it takes, depending on how good that action was: the faster the AI progresses along the track, the higher the reward. Over time, the AI learns to drive better entirely on its own.
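The reward idea described above can be illustrated with a minimal sketch. The source only says that faster progress along the track earns a higher reward; the function names and the choice to measure progress as distance gained per step are assumptions made for illustration, not the speaker's actual code.

```python
def progress_reward(prev_distance: float, new_distance: float) -> float:
    """Reward for one step: distance gained along the track (assumed metric)."""
    return new_distance - prev_distance


def episode_return(distances: list[float]) -> float:
    """Sum of step rewards over a run, given the distance recorded at each step."""
    return sum(progress_reward(a, b) for a, b in zip(distances, distances[1:]))


# Two hypothetical runs sampled over the same number of steps.
fast_run = [0.0, 3.0, 6.0, 9.0]
slow_run = [0.0, 1.0, 2.0, 3.0]

# The run that covers more track in the same time collects more reward.
print(episode_return(fast_run) > episode_return(slow_run))
```

Under this assumption, a car that stands still earns nothing and a car that drives backwards earns a negative reward, which is one simple way to make "how good an action was" measurable.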

The speaker had tried to build such an AI several times before, but only a new attempt started six months ago succeeded. The AI's first decent runs on a track were less successful than expected, and it often got stuck in a sub-optimal strategy. The AI also had trouble with braking and drifting. After several adjustments to the code, the AI started to improve and eventually managed to beat the speaker on a challenging map.

The speaker then tested the AI on another map where it had never trained before. The AI was less precise and made more mistakes on unseen tracks. The speaker then decided to retrain the AI with the brake available. After several more hours of training, the AI managed to drift more wisely and destroyed its previous record on the endurance map.

In the final test, the AI outpaced the speaker, proving that it was more precise, consistent, and faster. However, the speaker noted that the AI was not truly unbeatable and that many better players could easily drive a faster time. The speaker concluded that the AI has come a long way but there's still a lot to learn and improve.

Facts

1. The car in the racing game Trackmania is controlled by an Artificial Intelligence (AI) designed to improve over time through trial and error.
2. The AI uses an artificial neural network, a mathematical tool which models how a brain works.
3. The AI receives a few numbers every tenth of a second describing what's happening in the game and outputs new numbers specifying the action to perform.
4. The AI's neural network is tuned using a method called Reinforcement Learning.
5. The AI starts from scratch with zero prior knowledge and its decisions are initially quite random.
6. The AI is rewarded for every action it takes: the faster the AI progresses along the track, the higher the reward.
7. The AI learns the game by exploring it and gathering data from it, which is used to progressively tweak the neural network.
8. The AI's performance improves over time as it learns from its mistakes and successes.
9. The project to build an AI that could beat the human player spanned three years.
10. The AI was initially unable to brake, which limited its performance.
11. The AI was retrained with the brake enabled, which improved its speed and allowed it to drift, saving a few tenths of a second.
12. The AI was trained to master the neo-drift, a technique that triggers a drift even at relatively low speeds.
13. The AI was able to drive more wisely and consistently after learning to drift.
14. The AI beat the human player in the final test, proving more precise, consistent, and faster, although the speaker noted that many better players could still drive a faster time.
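Facts 2, 3, and 5 can be pictured with a small sketch of one decision step: a few numbers go into a neural network, and an action comes out, with some randomness early on. Everything specific here is an assumption for illustration: the four-number observation, the three-action set, the network size, and the epsilon-greedy rule for random exploration are hypothetical and are not taken from the speaker's implementation.

```python
import math
import random

# Hypothetical action set; the real game has more inputs (for example the brake).
ACTIONS = ["left", "straight", "right"]


def make_network(n_inputs, n_hidden, n_outputs, rng):
    """Create a tiny two-layer network with random weights (zero prior knowledge)."""
    def layer(rows, cols):
        return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]
    return {"w1": layer(n_hidden, n_inputs), "w2": layer(n_outputs, n_hidden)}


def forward(net, observation):
    """Map the observation numbers to one score per action."""
    hidden = [math.tanh(sum(w * x for w, x in zip(row, observation)))
              for row in net["w1"]]
    return [sum(w * h for w, h in zip(row, hidden)) for row in net["w2"]]


def choose_action(net, observation, epsilon, rng):
    """With probability epsilon act randomly, otherwise take the best-scored action."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    scores = forward(net, observation)
    return ACTIONS[scores.index(max(scores))]


rng = random.Random(0)
net = make_network(n_inputs=4, n_hidden=8, n_outputs=len(ACTIONS), rng=rng)

# One hypothetical observation, as would be received every tenth of a second.
observation = [0.5, -0.2, 0.1, 0.9]
action = choose_action(net, observation, epsilon=0.1, rng=rng)
print(action)
```

In a full training loop, the data gathered from many such steps would be used to adjust the weights in `net`, which is the "progressive tweaking" that fact 7 refers to; that update step is omitted here.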