Artificial intelligence methods are often perceived as extremely complicated and understandable only by computer science graduates. The idea of an electronic device that, by itself, learns how to behave still seems very futuristic. In fact, we have many “smart” devices around us, but many of them are no smarter than a motion-based light switch. Yet, our world is full of intelligent agents: rats, cats, dogs, cockroaches and humans. The secret behind their behavior lies in millions of years of evolution and something that we all know: trial and error, reward and punishment. 

Let’s imagine a rat in a specially designed cage that has a button and a food dispenser. When the button is pressed, a portion of food is supplied. A rat placed in such a cage for the first time will use its senses to explore its new environment more or less randomly. At some point, the rat will press the button and receive a reward – some food will be supplied. Over time this situation will repeat and finally the rat will learn that pressing the button releases the food. From then on, the rat will exploit its knowledge each time it feels hungry. 

Now, consider a robotic vacuum cleaner – like one of those currently available on the market. Such a vacuum cleaner is able to autonomously move around the room and clean the carpet or the floor. It is smart enough to avoid collisions with various objects; it may also automatically return for recharging, or be used by a cat as a taxi. Let us, however, assume that the vacuum cleaner comes with no predefined “intelligent” behavior at all – it just moves randomly, hitting everything around it. Would it be possible for the vacuum cleaner to learn by itself how to behave? Could it correct its behavior just by knowing that it has done something wrong? In fact it could – using reinforcement learning.
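To give a flavor of what “learning from rewards” means in practice, here is a minimal sketch of tabular Q-learning on a hypothetical toy problem (not the vacuum cleaner itself): an agent in a one-dimensional corridor of five cells that must discover, purely by trial and error, that moving right leads to a reward at the last cell. All names and parameter values below are illustrative assumptions, not taken from the article.

```python
import random

# Hypothetical toy environment: a 1-D corridor of 5 cells.
# The agent starts at cell 0; reaching cell 4 yields a reward of 1.
N_STATES = 5
ACTIONS = [0, 1]                 # 0: move left, 1: move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1  # learning rate, discount, exploration

# Q-table: one value estimate per (state, action) pair, all zero at first.
Q = [[0.0, 0.0] for _ in range(N_STATES)]

def step(state, action):
    """Move left or right; reward 1 only when the goal cell is reached."""
    next_state = max(0, state - 1) if action == 0 else min(N_STATES - 1, state + 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

random.seed(0)
for episode in range(200):
    state = 0
    while state != N_STATES - 1:
        # Epsilon-greedy: mostly exploit the best-known action, sometimes explore.
        if random.random() < EPSILON:
            action = random.choice(ACTIONS)
        else:
            action = 0 if Q[state][0] > Q[state][1] else 1
        next_state, reward = step(state, action)
        # Q-learning update: nudge the estimate toward
        # the observed reward plus the discounted best future value.
        Q[state][action] += ALPHA * (reward + GAMMA * max(Q[next_state]) - Q[state][action])
        state = next_state

# After training, the greedy policy in every non-goal cell should be "right".
policy = ["right" if Q[s][1] >= Q[s][0] else "left" for s in range(N_STATES - 1)]
print(policy)
```

The agent is never told that “right” is correct; like the rat with the button, it stumbles on the reward by random exploration and gradually reinforces the actions that led to it.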

Read the whole article here.