MIT scientists have developed a new artificial intelligence system that can take still images and generate short videos to simulate what happens next, similar to how humans can visually imagine how a scene will evolve.
Humans intuitively understand how the world works, which makes it far easier for a person than for a machine to envision how a scene will play out.
However, objects in a still image could move and interact in a multitude of different ways, making it very hard for machines to accomplish this feat.
The new deep-learning system produced videos realistic enough to fool human viewers 20 per cent of the time when compared with real footage.
Researchers at the Massachusetts Institute of Technology (MIT) in the US pitted two neural networks against each other, with one trying to distinguish real videos from machine-generated ones, and the other trying to create videos that were realistic enough to trick the first system.
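The two-network setup described above is the core idea of adversarial training: a discriminator learns to tell real data from generated data, while a generator learns to produce data that fools the discriminator. As an illustration only (this is not the MIT model, and all names here are invented for the sketch), the same tug-of-war can be shown on toy one-dimensional data, where "real" samples cluster near 3.0 and the generator's only parameter is the mean of its output:

```python
import random
import math

# Toy sketch of adversarial training (illustrative, not the MIT system):
# "real" data are scalars near REAL_MEAN; the generator emits scalars
# near a learnable mean g_mu; the discriminator is a tiny logistic
# classifier D(x) = sigmoid(w*x + b) estimating P(x is real).

random.seed(0)

REAL_MEAN = 3.0

def real_sample():
    return REAL_MEAN + random.gauss(0, 0.1)

def fake_sample(g_mu):
    return g_mu + random.gauss(0, 0.1)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

w, b = 0.0, 0.0   # discriminator parameters
g_mu = 0.0        # generator starts far from the real distribution
lr = 0.05

for step in range(2000):
    x_real, x_fake = real_sample(), fake_sample(g_mu)

    # Discriminator update: push D(real) toward 1 and D(fake) toward 0
    # (gradient ascent on the binary cross-entropy objective).
    p_real, p_fake = sigmoid(w * x_real + b), sigmoid(w * x_fake + b)
    w += lr * ((1 - p_real) * x_real - p_fake * x_fake)
    b += lr * ((1 - p_real) - p_fake)

    # Generator update: push D(fake) toward 1, i.e. fool the discriminator.
    # Gradient of log D(fake) w.r.t. g_mu, since d(x_fake)/d(g_mu) = 1.
    x_fake = fake_sample(g_mu)
    p_fake = sigmoid(w * x_fake + b)
    g_mu += lr * (1 - p_fake) * w

# After training, g_mu has drifted from 0 toward REAL_MEAN: the generator
# has learned to mimic the real data well enough to confuse D.
print(g_mu)
```

The video model in the study operates on the same principle, only with deep convolutional networks generating whole frame sequences instead of a single scalar.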
When the researchers asked workers on Amazon’s Mechanical Turk crowd-sourcing platform to pick which videos were real, they picked the machine-generated videos over genuine ones 20 per cent of the time, ‘Live Science’ reported.
The approach could eventually help robots and self-driving cars navigate dynamic environments and interact with humans, or let Facebook automatically tag videos with labels describing what is happening, researchers said.
“Our algorithm can generate a reasonably realistic video of what it thinks the future will look like, which shows that it understands at some level what is happening in the present,” said Carl Vondrick, a PhD student in MIT’s Computer Science and Artificial Intelligence Laboratory, who led the research.
Publish date: December 12, 2016 9:21 am