Odyssey’s AI model transforms video into interactive worlds

Image: Screenshot of virtual TVs.

London-based AI lab Odyssey has launched a research preview of a model that transforms video into interactive worlds. Having initially focused on world models for film and game production, the Odyssey team may have stumbled onto a completely new entertainment medium.

The interactive video generated by Odyssey’s AI model responds to inputs in real time. You can interact with it using your keyboard, phone, or controller, and eventually even voice commands. The folks at Odyssey are billing it as an “early version of the Holodeck.”

The underlying AI can generate realistic-looking video frames every 40 milliseconds. That means when you press a button or make a gesture, the video responds almost instantly—creating the illusion that you’re actually influencing this digital world.
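To put the 40-millisecond figure in more familiar terms, it can be converted into a frame rate. The short Python sketch below does only that arithmetic; the function name is made up for illustration and has nothing to do with Odyssey’s own code.

```python
# Frame interval quoted in the article, in milliseconds.
FRAME_INTERVAL_MS = 40


def frames_per_second(interval_ms: float) -> float:
    """Convert a per-frame interval in milliseconds to frames per second."""
    return 1000.0 / interval_ms


fps = frames_per_second(FRAME_INTERVAL_MS)
print(f"{fps:.0f} frames per second")  # 25 frames per second
```

At 25 frames per second, the output sits in the same range as conventional broadcast video.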

“The experience today feels like exploring a glitchy dream—raw, unstable, but undeniably new,” according to Odyssey. We’re not talking about polished, AAA-game quality visuals here, at least not yet.

Not your standard video tech

Let’s get a bit technical for a moment. What makes this AI-generated interactive video tech different from, say, a standard video game or CGI? It all comes down to something Odyssey calls a “world model.”

Unlike traditional video models that generate entire clips in one go, world models work frame by frame to predict what should come next based on the current state and any user inputs. It’s similar to how large language models predict the next word in a sequence, but far more demanding because each prediction is a high-resolution video frame rather than a single word.

“A world model is, at its core, an action-conditioned dynamics model,” as Odyssey puts it. Each time you interact, the model takes the current state, your action, and the history of what’s happened, then generates the next video frame accordingly.
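That loop of state, action, and history can be sketched in a few lines of Python. This is a toy illustration under heavy assumptions, not Odyssey’s implementation: the “frame” is just a short list of numbers, `toy_dynamics` is a hypothetical stand-in for the learned model, and the action names are invented.

```python
from collections import deque
from typing import Deque, List

Frame = List[float]  # stand-in for an image; a real frame would be a pixel array


def toy_dynamics(state: Frame, action: str, history: Deque[Frame]) -> Frame:
    """Hypothetical stand-in for a learned model.

    A real world model would condition on `history` as well; this toy version
    only shifts the current state according to the action.
    """
    step = {"left": -1.0, "right": 1.0}.get(action, 0.0)
    return [value + step for value in state]


def rollout(initial: Frame, actions: List[str], context: int = 4) -> List[Frame]:
    """Generate one frame per action, feeding each output back in as the next state."""
    history: Deque[Frame] = deque(maxlen=context)  # keeps only the most recent frames
    state = initial
    frames: List[Frame] = []
    for action in actions:
        history.append(state)
        state = toy_dynamics(state, action, history)
        frames.append(state)
    return frames


frames = rollout([0.0, 0.0], ["right", "right", "left", "none"])
print(frames[-1])  # [1.0, 1.0]
```

The key point is the feedback: every generated frame becomes the input for the next one, which is what lets the video react to each new action.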

The result is something that feels more organic and unpredictable than a traditional game. There’s no pre-programmed logic saying “if a player does X, then Y happens”—instead, the AI is making its best guess at what should happen next based on what it’s learned from watching countless videos.

Odyssey tackles historic challenges with AI-generated video

Building something like this isn’t exactly a walk in the park. One of the biggest hurdles with AI-generated interactive video is keeping it stable over time. When you’re generating each frame based on previous ones, small errors can compound quickly (a phenomenon AI researchers call “drift”).
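A rough way to see why drift matters is to track how a tiny per-frame error grows when each frame inherits the error of the one before it. The Python sketch below uses a made-up recurrence and made-up numbers purely for illustration; only the 25 frames per second is derived from the 40-millisecond figure, and none of it is a measurement of Odyssey’s model.

```python
def accumulated_error(per_frame_error: float, growth: float, frames: int) -> float:
    """Toy recurrence: each frame carries over the previous error, scaled by
    `growth`, and adds a small new error of its own."""
    error = 0.0
    for _ in range(frames):
        error = error * growth + per_frame_error
    return error


FPS = 25  # one frame every 40 milliseconds

# Illustrative values only: 0.1% new error per frame, 1% amplification per frame.
one_second = accumulated_error(0.001, 1.01, FPS)
one_minute = accumulated_error(0.001, 1.01, FPS * 60)
print(one_second < one_minute)  # True
```

Even with a very small error per frame, a minute of generation involves 1,500 steps, so anything that amplifies errors from frame to frame adds up fast.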

To tackle this, Odyssey has used what they term a “narrow distribution model”—essentially pre-training their AI on general video footage, then fine-tuning it on a smaller set of environments. This trade-off means less variety but better stability so everything doesn’t become a bizarre mess.

The company says they’re already making “fast progress” on their next-gen model, which apparently shows “a richer range of pixels, dynamics, and actions.”

Running all this fancy AI tech in real time isn’t cheap. Currently, the infrastructure powering this experience costs between £0.80-£1.60 ($1-2) per user-hour, relying on clusters of H100 GPUs scattered across the US and EU.
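For a sense of scale, the per-user-hour range can be multiplied out for a hypothetical audience. In the Python sketch below, the cost range is the one quoted above, while the audience size and session length are invented for the example.

```python
from typing import Tuple

# Range quoted in the article, in pounds per user-hour.
COST_PER_USER_HOUR_GBP = (0.80, 1.60)


def session_cost(users: int, hours: float) -> Tuple[float, float]:
    """Return the (low, high) infrastructure cost in pounds for a group of users."""
    low, high = COST_PER_USER_HOUR_GBP
    user_hours = users * hours
    return (user_hours * low, user_hours * high)


# Hypothetical scenario: 1,000 users each spending half an hour in the preview.
low, high = session_cost(users=1000, hours=0.5)
print(f"£{low:.2f} to £{high:.2f}")  # £400.00 to £800.00
```

The cost scales linearly with total user-hours in this sketch, which is a simplification; real infrastructure costs would also depend on how efficiently the GPUs are shared.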

That might sound expensive for streaming video, but it’s remarkably cheap compared to producing traditional game or film content. And Odyssey expects these costs to tumble further as models become more efficient.

Interactive video: The next storytelling medium?

Throughout history, new technologies have given birth to new forms of storytelling—from cave paintings to books, photography, radio, film, and video games. Odyssey believes AI-generated interactive video is the next step in this evolution.

If they’re right, we might be looking at the prototype of something that will transform entertainment, education, advertising, and more. Imagine training videos where you can practice the skills being taught, or travel experiences where you can explore destinations from your sofa.

The research preview available now is obviously just a small step towards this vision and more of a proof of concept than a finished product. However, it’s an intriguing glimpse at what might be possible when AI-generated worlds become interactive playgrounds rather than just passive experiences.

You can give the research preview a try here.
