🟠 Important AI Summary 2026-06-13 01:30 (JST) · Source: Runway

Gen-2 generates video directly from text, with feature-length film production as the long-term goal.

Runway Research | Scale, Speed and Stepping Stones: The path to Gen-2

Original: Runway Research | Scale, Speed and Stepping Stones: The path to Gen-2

Importance: The new video generation technology is expected to have a major impact.

Summary

Anastasis Germanidis, CTO and co-founder of Runway, discusses the development of Gen-2, a text-to-video system that generates video directly from a text prompt, without structural conditioning. The work focuses on achieving high fidelity and temporal stability in the generated video. The long-term goal is to generate a two-hour film, which he says calls for broad systems that support storytelling and creativity.

Key Points

Gen-2 directly generates video from text
Focus on high fidelity and temporal stability
Aiming to generate a two-hour film
Predicts motion without structural conditioning
Broad systems for video generation needed

Developer Notes

Runway's Gen-2 addresses temporal consistency issues using a latent diffusion architecture for video generation. Gen-1 relied on an input video for structural conditioning; Gen-2 removes that requirement, and a recent update lets it generate video from an arbitrary starting frame. The model aims to gain a deep understanding of the visual world through the next-frame prediction task.
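As a rough illustration of what a next-frame prediction task looks like, the sketch below splits a toy video into (context frames, next frame) pairs and scores a trivial predictor against the true next frame. It is not Runway's implementation: the summary does not describe Gen-2's training code, and every name here (`next_frame_pairs`, `mse`, `copy_last_frame`) is hypothetical. Frames are plain lists of numbers rather than real images or latents.

```python
from typing import List, Sequence, Tuple

# A toy "frame": a flat list of pixel intensities (an assumption for
# illustration; real systems work on image tensors or latents).
Frame = List[float]


def next_frame_pairs(
    video: Sequence[Frame], context: int = 1
) -> List[Tuple[List[Frame], Frame]]:
    """Split a video into (context frames, next frame) training pairs."""
    if context < 1:
        raise ValueError("context must be at least 1")
    pairs = []
    for t in range(context, len(video)):
        # The `context` frames before t are the input; frame t is the target.
        pairs.append((list(video[t - context:t]), video[t]))
    return pairs


def mse(predicted: Frame, target: Frame) -> float:
    """Mean squared error between two frames of equal length."""
    if len(predicted) != len(target):
        raise ValueError("frames must have the same length")
    return sum((p - q) ** 2 for p, q in zip(predicted, target)) / len(target)


def copy_last_frame(context_frames: List[Frame]) -> Frame:
    """Hypothetical baseline predictor: assume nothing moves."""
    return list(context_frames[-1])


if __name__ == "__main__":
    video = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
    for context_frames, target in next_frame_pairs(video, context=1):
        print(mse(copy_last_frame(context_frames), target))
```

A learned model would replace `copy_last_frame`; the point of the sketch is only that the training signal comes from the video itself, with no separate structural input.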

Tags: Model, New feature · Audience: General users · Audience: Developers

Source: https://runwayml.com/research/scale-speed-and-stepping-stones-the-path-to-gen-2

Outlet: Runway

This article is an AI-generated summary (OpenAI GPT-4o-mini) of publicly available information from Anthropic, OpenAI, Google, Meta, Mistral, DeepSeek, Sakana, and other vendors. The original source URL is always provided in accordance with fair-use citation requirements. Summaries are AI-generated and may contain mistranslations or misinterpretations. Always verify details with the original source.