At the 2017 SXSW Festival, Intel gave a public demonstration of an image recognition system that could identify, almost instantly, the objects placed in front of its camera, even when there were 20 objects at a time. A full-scale object recognition system is being designed for customers such as makers of autonomous cars, so that the driving system can differentiate between, say, a bus and a tram instead of just seeing two big objects on the road. Intel has also published a Deep Learning Software Development Kit to bring object recognition to applications.
There seems to be an endless number of applications for such an AI ability, especially as we become more and more dependent on visual information. Security systems could prioritize threats based on whether an intruder is an animal or a person. Home artificial intelligence systems could welcome people entering the house and tune the lights, temperature and music to a particular family member. Visually impaired people could navigate more easily, hearing who and what is in front of them as they walk.
Understanding what you hear is as important for AI as sight. The company Audio Analytic is building a library of sounds for AI systems in a sound-proofed hangar at an RAF airbase north of Cambridge in the UK. Smashing windows and making dogs bark all day must be an unusual job, but this is how we teach AI to tell whether a glass fell off the table or a window has been broken. Today, everyday noises are just background to most machines. "What we're working on is a new field of AI that we call artificial audio intelligence," says Chris Mitchell of Audio Analytic. "We want to create a taxonomy of all sounds, and that is a huge undertaking," says Mitchell. So far, the company's software can identify breaking windows, crying babies and smoke alarms. At the 2017 Consumer Electronics Show in Las Vegas, they added barking dogs to their repertoire.
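Audio Analytic's actual classifier is proprietary and not described here, but the underlying idea of sorting sounds into classes can be sketched with a toy example. The snippet below (plain Python; the labels, the choice of zero-crossing rate as the sole feature, and the stand-in tones are all invented for illustration) uses a steady high-pitched tone as a proxy for a smoke alarm and a low-pitched one as a proxy for a bark, then assigns an unknown sound to the nearest class centroid.

```python
import math

def zero_crossing_rate(samples):
    # Fraction of adjacent sample pairs that change sign; a crude
    # stand-in for "how high-pitched is this sound".
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    return crossings / (len(samples) - 1)

def make_tone(freq_hz, duration_s=0.5, rate=8000):
    # Synthetic sine wave, used here instead of real recordings.
    n = int(duration_s * rate)
    return [math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]

def nearest_centroid(feature, centroids):
    # Pick the label whose stored feature value is closest.
    return min(centroids, key=lambda label: abs(feature - centroids[label]))

# Hypothetical classes: a 3 kHz tone stands in for a smoke alarm,
# a 300 Hz tone for a low, bark-like sound.
centroids = {
    "smoke alarm": zero_crossing_rate(make_tone(3000)),
    "bark-like": zero_crossing_rate(make_tone(300)),
}

unknown = make_tone(2900)  # close in pitch to the alarm tone
print(nearest_centroid(zero_crossing_rate(unknown), centroids))
```

A real system would replace the single hand-picked feature with learned representations of spectrograms, and the two centroids with a trained model covering a whole taxonomy of sounds, but the classify-by-similarity structure is the same.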
There is a new wave of companies training machine learning systems to spot patterns in sounds. Uberchord, based in Berlin, is developing an AI that can help people learn to play guitar. It listens to you strum and tells you when you have your fingering wrong.
Audio Analytic's work showed that something that seems obvious to us, like the lexicon of meanings in a child's cry, turned out to be much harder for an AI to learn than, say, recognizing the barking of different breeds of dogs. Another sound they want to teach their system to look out for is the change in pitch and intonation of aggressive human shouts: somebody threatening violence, say. This doesn't vary much with language or culture, says Mitchell. Distinctive changes in vocal sounds come when adrenalin floods the body and affects the voice box. Audio Analytic has had to put this one on hold, however: they found that the sounds of chickens and chainsaws in a neighbourhood would also trigger their aggression detector.
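Detecting those pitch changes presupposes estimating pitch in the first place, and a textbook way to do that is autocorrelation: find the time lag at which the signal most resembles a shifted copy of itself. The sketch below (plain Python on synthetic sine waves; the 120 Hz "calm" and 260 Hz "raised" voice figures are invented for illustration, not Audio Analytic's thresholds) searches lags corresponding to roughly 50-500 Hz and reports the best-matching frequency.

```python
import math

def estimate_pitch(samples, rate):
    # Autocorrelation pitch estimate: the lag with the strongest
    # self-similarity corresponds to one period of the fundamental.
    n = len(samples)
    best_lag, best_corr = 0, 0.0
    for lag in range(rate // 500, rate // 50):  # search roughly 50-500 Hz
        corr = sum(samples[i] * samples[i + lag] for i in range(n - lag))
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    return rate / best_lag

rate = 8000
calm = [math.sin(2 * math.pi * 120 * i / rate) for i in range(2000)]
raised = [math.sin(2 * math.pi * 260 * i / rate) for i in range(2000)]
print(estimate_pitch(calm, rate), estimate_pitch(raised, rate))
```

A detector would track this estimate over successive short windows and flag a sudden sustained jump in pitch; the chickens-and-chainsaws problem arises because other sounds can produce a similar pitch signature.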
We are teaching AI to see and hear, and very soon AI will probably prove it can do both better than we do, spotting several objects on the road at the same time while we can focus on only one at a time. Just recall the autofocus function in our cameras, which soon proved far more accurate than manual focus and the human eye.
Sources and media: