Below you can see a simple game: ten lady beetles on a racetrack 🏎. Each of the ladybugs starts with 1000 points. It crawls from the top left and has to reach the finish at the bottom left. If it finishes, it wins 5 points and is placed at the start line again. If it runs into a wall, it loses 1 point. The small number next to each icon shows the current score. The best beetle is marked with a star.
At the beginning they are plain stupid: they run in circles or hit the walls 🤕. After about 15 minutes (depending on your computer's speed) they learn what to do thanks to the incentives that were set. They will try to avoid running into walls (because this causes a penalty of one point), and they will try to reach the finish line. The surprising thing is that I am not giving instructions such as "If you're at the right edge, turn down and run back to the left". All I provide is the rewards, information about each beetle's location, and a choice of 2 actions: turn left and turn right. The rest they learn by themselves!
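To make the incentives concrete, here is a minimal Python sketch of the scoring rules described above. It is not the game's actual code: the names `reward_for` and `apply_events`, the event labels, and the zero reward for an ordinary step are all assumptions made for illustration.

```python
# Hypothetical sketch of the scoring rules; not the game's actual code.
START_SCORE = 1000    # every beetle starts with 1000 points
FINISH_REWARD = 5     # reaching the finish line
WALL_PENALTY = -1     # running into a wall
ACTIONS = ("turn_left", "turn_right")  # the only two choices a beetle has


def reward_for(event: str) -> int:
    """Return the reward for one step; `event` is an assumed label."""
    if event == "finish":
        return FINISH_REWARD
    if event == "wall":
        return WALL_PENALTY
    return 0  # assumption: an ordinary step neither earns nor costs points


def apply_events(events: list[str], score: int = START_SCORE) -> int:
    """Add up the rewards of a sequence of events onto a score."""
    for event in events:
        score += reward_for(event)
    return score


# Two wall hits, one ordinary step, one finish: 1000 - 1 - 1 + 0 + 5
print(apply_events(["wall", "wall", "move", "finish"]))  # 1003
```

Nothing in this sketch tells a beetle where to go. It only says what is good and what is bad, and the learning algorithm has to figure out the route on its own.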
If you don't have the patience to keep the same browser window open until they are trained, you can load already trained brains. Click on "Inject trained brains". The brain of one particularly smart beetle is then injected into all 10 beetles. You'll immediately notice them moving in a smarter way. Note: these bugs are still quite dumb. At least they don't run in circles anymore, and they don't run into the walls all the time. A full-time AI specialist could certainly tune the learning algorithm and the so-called hyperparameters to make them smarter 🤓
Moving average of the reward
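A moving average smooths out the step-by-step noise in the rewards so that a learning trend becomes visible. As a rough sketch of how such a value could be computed (the function name `moving_average` and the window size of 3 are illustrative assumptions):

```python
from collections import deque


def moving_average(rewards, window=3):
    """Yield the mean of the last `window` rewards after each new reward."""
    recent = deque(maxlen=window)  # older values drop out automatically
    for r in rewards:
        recent.append(r)
        yield sum(recent) / len(recent)


# Early wall hits pull the average down, later finishes push it up.
print(list(moving_average([-1, -1, 0, 5, 5], window=3)))
```

If the beetles are learning, this curve should drift upwards over time.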
This kind of machine learning is called "Reinforcement Learning". The variant used above is Deep Q-Learning.
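At the core of Q-learning is one update rule: after each step, the estimated value of the chosen action is nudged toward the reward received plus the discounted value of the best next action. Deep Q-Learning uses a neural network to estimate these values. The sketch below swaps that network for a plain lookup table to keep things short, so it shows tabular Q-learning rather than the deep variant. The state label `"A"` and the values of `ALPHA` and `GAMMA` are assumptions for illustration.

```python
from collections import defaultdict

ACTIONS = ("turn_left", "turn_right")
ALPHA = 0.1   # learning rate (assumed value)
GAMMA = 0.9   # discount factor (assumed value)

# Q[state][action] -> estimated long-term reward, starting at 0
Q = defaultdict(lambda: {a: 0.0 for a in ACTIONS})


def q_update(state, action, reward, next_state):
    """Apply one tabular Q-learning update and return the new estimate."""
    best_next = max(Q[next_state].values())
    target = reward + GAMMA * best_next
    Q[state][action] += ALPHA * (target - Q[state][action])
    return Q[state][action]


# A beetle in the (hypothetical) cell "A" turns left and hits a wall.
q_update("A", "turn_left", -1, "A")
print(Q["A"])
```

After this single update, turning left in cell "A" looks slightly worse than turning right. Repeat that many thousands of times and the beetle ends up preferring the turns that keep it away from walls.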