Runway Gen-4 solves AI video’s biggest problem: character consistency across scenes

Runway AI Inc. launched its most advanced AI video generation model today, entering the next phase of competition to create tools that could transform film production. The new Gen-4 system introduces character and scene consistency across multiple shots — a capability that has evaded most AI video generators until now.
The New York-based startup, backed by Google, Nvidia, and Salesforce, is releasing Gen-4 to all paid subscribers and enterprise customers, with additional features planned for later this week. Users can generate five- and ten-second clips at 720p resolution.
The release comes just days after OpenAI’s image generation feature created a cultural phenomenon, with millions of users requesting Studio Ghibli-style images through ChatGPT. The viral trend became so popular it temporarily crashed OpenAI’s servers, with CEO Sam Altman tweeting that “our GPUs are melting” due to unprecedented demand. The Ghibli-style images also sparked heated debates about copyright, with many questioning whether AI companies can legally mimic distinctive artistic styles.
Visual continuity: The missing piece in AI filmmaking until now
Character and scene consistency — maintaining the same visual elements across multiple shots and angles — has been the Achilles’ heel of AI video generation. When a character’s face subtly changes between cuts or a background element disappears without explanation, the artificial nature of the content becomes immediately apparent to viewers.
The challenge stems from how these models work at a fundamental level. Previous AI generators treated each frame as a separate creative task, with only loose connections between them. Imagine asking a room full of artists to each draw one frame of a film without seeing what came before or after — the result would be visually disjointed.
Runway’s Gen-4 appears to have tackled this problem by creating what amounts to a persistent memory of visual elements. Once a character, object, or environment is established, the system can render it from different angles while maintaining its core attributes. This isn’t just a technical improvement; it’s the difference between creating interesting visual snippets and telling actual stories.
"Using visual references, combined with instructions, Gen-4 allows you to create new images and videos with consistent styles, subjects, locations and more. Allowing for continuity and control within your stories. To test the model's narrative capabilities, we have put together…" — Runway (@runwayml), March 31, 2025
According to Runway’s documentation, Gen-4 allows users to provide reference images of subjects and describe the composition they want, with the AI generating consistent outputs from different angles. The company claims the model can render videos with realistic motion while maintaining subject, object, and style consistency.
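For developers, the reference-plus-prompt workflow Runway describes maps naturally onto a submit-and-poll pattern. The sketch below is purely illustrative and is not Runway's documented API: the endpoint URL, field names (reference_image, prompt, duration, resolution), and response shape are all assumptions made for the example, standing in for whatever the real service exposes.

```python
# Hypothetical sketch of a reference-plus-prompt video generation call.
# Endpoint, field names, and response shape are illustrative assumptions,
# not Runway's actual API -- consult the official documentation.
import base64
import time

import requests

API_BASE = "https://api.example-video-service.com/v1"  # placeholder endpoint
API_KEY = "YOUR_API_KEY"


def generate_clip(reference_image_path: str, prompt: str, duration_s: int = 5) -> str:
    """Submit a reference image plus a text prompt, poll until the clip is
    ready, and return the URL of the generated video."""
    with open(reference_image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("ascii")

    # Submit the generation job.
    resp = requests.post(
        f"{API_BASE}/video_generations",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={
            "reference_image": image_b64,  # establishes the subject to keep consistent
            "prompt": prompt,              # describes composition, angle, and motion
            "duration": duration_s,        # e.g. 5- or 10-second clips
            "resolution": "720p",
        },
        timeout=30,
    )
    resp.raise_for_status()
    job_id = resp.json()["id"]

    # Poll until the job completes.
    while True:
        status = requests.get(
            f"{API_BASE}/video_generations/{job_id}",
            headers={"Authorization": f"Bearer {API_KEY}"},
            timeout=30,
        ).json()
        if status["status"] == "succeeded":
            return status["video_url"]
        if status["status"] == "failed":
            raise RuntimeError(status.get("error", "generation failed"))
        time.sleep(5)


# Example: reusing the same reference image across prompts is what keeps the
# subject consistent from shot to shot.
# url = generate_clip("hero_reference.png",
#                     "the same explorer, low-angle shot, walking through jungle mist")
```

The key idea the workflow implies is that consistency comes from the reference image rather than the prompt: the same subject reference is reused across shots while only the described composition changes.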
To showcase the model’s capabilities, Runway released several short films created entirely with Gen-4. One film, “New York is a Zoo,” demonstrates the model’s visual effects by placing realistic animals in cinematic New York settings. Another, titled “The Retrieval,” follows explorers searching for a mysterious flower and was produced in less than a week.
From facial animation to world models: Runway’s AI filmmaking evolution
Gen-4 builds on Runway’s previous tools. In October, the company released Act-One, a feature that allows filmmakers to capture facial expressions from smartphone video and transfer them to AI-generated characters. The following month, Runway added advanced 3D-like camera controls to its Gen-3 Alpha Turbo model, enabling users to zoom in and out of scenes while preserving character forms.
This trajectory reveals Runway’s strategic vision. While competitors focus on creating ever more realistic single images or clips, Runway has been assembling the components of a complete digital production pipeline. The approach feels more akin to how actual filmmakers work — addressing problems of performance, coverage, and visual continuity as interconnected challenges rather than isolated technical hurdles.
The evolution from facial animation tools to consistent world models suggests Runway understands that AI-assisted filmmaking needs to follow the logic of traditional production to be truly useful. It’s the difference between creating a tech demo and building tools professionals can actually incorporate into their workflows.
AI video’s billion-dollar battle heats up
The financial implications are substantial for Runway, which is reportedly raising a new funding round that would value the company at $4 billion. According to financial reports, the startup aims to reach $300 million in annualized revenue this year following the launch of new products and an API for its video-generating models.
Runway has pursued Hollywood partnerships, securing a deal with Lionsgate to create a custom AI video generation model based on the studio’s catalog of more than 20,000 titles. The company has also established the Hundred Film Fund, offering filmmakers up to $1 million to produce movies using AI.
“We believe that the best stories are yet to be told, but that traditional funding mechanisms often overlook new and emerging visions within the larger industry ecosystem,” Runway explains on its fund’s website.
However, the technology raises concerns for film industry professionals. A 2024 study commissioned by the Animation Guild found that 75% of film production companies that have adopted AI have reduced, consolidated, or eliminated jobs. The study projects that more than 100,000 U.S. entertainment jobs will be affected by generative AI by 2026.
Copyright questions follow AI’s creative explosion
Like other AI companies, Runway faces legal scrutiny over its training data. The company is currently defending itself in a lawsuit brought by artists who allege their copyrighted work was used to train AI models without permission. Runway has cited the fair use doctrine as its defense, though courts have yet to definitively rule on this application of copyright law.
The copyright debate intensified last week with OpenAI’s Studio Ghibli feature, which allowed users to generate images in the distinctive style of Hayao Miyazaki’s animation studio without explicit permission. OpenAI says it refuses to generate images in the style of individual living artists while permitting broader studio styles; Runway, for its part, has not publicly detailed its policies on style mimicry.
This distinction feels increasingly arbitrary as AI models become more sophisticated. The line between learning from broad artistic traditions and copying specific creators’ styles has blurred to near invisibility. When an AI can perfectly mimic the visual language that took Miyazaki decades to develop, does it matter whether we’re asking it to copy the studio or the artist himself?
When questioned about training data sources, Runway has declined to provide specifics, citing competitive concerns. This opacity has become standard practice among AI developers but remains a point of contention for creators.
As marketing agencies, educational content creators, and corporate communications teams explore how tools like Gen-4 could streamline video production, the question shifts from technical capabilities to creative application.
For filmmakers, the technology represents both opportunity and disruption. Independent creators gain access to visual effects capabilities previously available only to major studios, while traditional VFX and animation professionals face an uncertain future.
The uncomfortable truth is that technical limitations have never been what prevents most people from making compelling films. The ability to maintain visual continuity won’t suddenly create a generation of storytelling geniuses. What it might do, however, is remove enough friction from the process that more people can experiment with visual narrative without needing specialized training or expensive equipment.
Perhaps the most profound aspect of Gen-4 isn’t what it can create, but what it suggests about our relationship with visual media going forward. We’re entering an era where the bottleneck in production isn’t technical skill or budget, but imagination and purpose. In a world where anyone can create any image they can describe, the important question becomes: what’s worth showing?
As we enter an era where creating a film requires little more than a reference image and a prompt, the most pressing question isn’t whether AI can make compelling videos, but whether we can find something meaningful to say when the tools to say anything are at our fingertips.