OpenAI unveils AI that instantly generates dazzling videos

In April, a New York startup called Runway AI unveiled technology that allows people to generate videos, like a cow at a birthday party or a dog chatting on a smartphone, simply by typing a sentence into a box on a computer screen.

The four-second videos were blurry, jerky, distorted and disturbing. But they were a clear sign that artificial intelligence technologies would deliver increasingly compelling videos in the coming months and years.

Just ten months later, San Francisco startup OpenAI unveiled a similar system that creates videos that look like they were lifted from a Hollywood movie. One demonstration included short videos — created in minutes — of woolly mammoths trotting through a snowy meadow, a monster staring at a melting candle and a Tokyo street scene apparently filmed by a camera flying around the city.

OpenAI, the company behind the ChatGPT chatbot and the still image generator DALL-E, is one of many companies racing to improve these types of instant video generators, including startups like Runway and tech giants like Google and Meta, the owner of Facebook and Instagram. The technology could speed up the work of seasoned filmmakers while replacing less experienced digital artists entirely.

It could also become a quick and cheap way to create disinformation online, making it even harder to determine what is real on the Internet.

“I'm absolutely terrified that this kind of thing will affect a closely contested election,” said Oren Etzioni, a professor at the University of Washington who specializes in artificial intelligence. He is also the founder of TrueMedia, a nonprofit organization that works to identify disinformation in online political campaigns.

OpenAI calls its new system Sora, after the Japanese word for sky. The team behind the technology, including researchers Tim Brooks and Bill Peebles, chose the name because it “evokes the idea of limitless creative potential.”

In an interview, they also said that the company is not yet releasing Sora to the public because it is still working to understand the dangers of the system. Instead, OpenAI will share the technology with a small group of academics and other outside researchers who will “red team” it — a term for searching for ways it can be misused.

“The intention here is to provide a taste of what's on the horizon, so people can see the possibilities of this technology – and we can get feedback,” said Dr. Brooks.

OpenAI already tags videos produced by the system with watermarks that identify them as being generated by AI. But the company acknowledges that these can be removed. They can also be difficult to spot. (The New York Times added “Generated by AI” watermarks to the videos accompanying this story.)

The system is an example of generative AI, which can create text, images and sounds on the fly. Like other generative AI technologies, OpenAI's system learns by analyzing digital data – in this case, videos and captions that describe what those videos contain.

OpenAI declined to say how many videos the system learned from or where they came from, other than to say the training included both publicly available videos and those licensed by copyright holders. The company says little about the data used to train its technologies, most likely because it wants to maintain a competitive edge – and has been sued several times for using copyrighted material.

(The New York Times sued OpenAI and its partner Microsoft in December for copyright infringement of news content related to AI systems.)

Sora creates videos in response to short descriptions, such as “a beautifully rendered paper world of a coral reef, full of colorful fish and sea creatures.” While the videos can be impressive, they are not always perfect and may contain strange and illogical images. For example, the system recently generated a video of someone eating a cookie, but the cookie never shrank.

DALL-E, Midjourney and other still image generators have improved so rapidly in recent years that they now produce images that are almost indistinguishable from photographs. This has made it harder to identify misinformation online, and many digital artists complain that it has become harder for them to find work.

“We all laughed in 2022 when Midjourney first came out and said, ‘Oh, that's cute,'” said Reid Southen, a film concept artist in Michigan. “Now people are losing their jobs because of Midjourney.”
