Explaining OpenAI Sora’s Spacetime Patches: The Key Ingredient

Author
Created
Feb 23, 2024 03:52 AM
Tags
technology
Type
Article
Date
Feb 15, 2024
Content
How Sora’s Unique Approach Transforms Video Generation

In the world of generative models, we have seen a number of approaches, from GANs to auto-regressive and diffusion models, each with its own strengths and limitations. Sora introduces a paradigm shift with a new modelling technique and the flexibility to handle a broad range of durations, aspect ratios, and resolutions. Sora combines diffusion and transformer architectures into a diffusion transformer model, enabling features such as:
• Text-to-video: As we have seen
• Image-to-video: Bringing life to still images
• Video-to-video: Changing the style of a video to something else
• Extending video in time: Forwards and backwards
• Creating seamless loops: Tiled videos that seem like they never end
• Image generation: A still image is a movie of one frame (up to 2048 x 2048)
• Generating video in any format: From 1920 x 1080 to 1080 x 1920 and everything in between
• Simulating virtual worlds: Like Minecraft and other video games
• Creating videos: Up to 1 minute in length with multiple shots
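The flexibility across durations, aspect ratios, and resolutions comes from the "spacetime patch" representation named in the title: a video is cut into small blocks spanning a few frames and a small spatial region, and each block becomes one transformer token. A minimal NumPy sketch of that patching step is below; the patch sizes and tensor layout are illustrative assumptions, not Sora's actual values.

```python
import numpy as np

def extract_spacetime_patches(video, pt=2, ph=16, pw=16):
    """Split a video of shape (T, H, W, C) into flattened spacetime patches.

    Each patch spans `pt` frames and a `ph` x `pw` spatial region, so the
    transformer sees one token per patch regardless of the clip's duration,
    aspect ratio, or resolution (assuming dimensions divide evenly).
    """
    T, H, W, C = video.shape
    assert T % pt == 0 and H % ph == 0 and W % pw == 0
    # Carve the video into a grid of (pt, ph, pw) blocks...
    v = video.reshape(T // pt, pt, H // ph, ph, W // pw, pw, C)
    # ...group the grid axes together, then flatten each block into a token.
    v = v.transpose(0, 2, 4, 1, 3, 5, 6)   # (nT, nH, nW, pt, ph, pw, C)
    return v.reshape(-1, pt * ph * pw * C)  # (num_patches, patch_dim)

# A 16-frame 64x64 RGB clip becomes a sequence of 8*4*4 = 128 patch tokens,
# each flattened to 2*16*16*3 = 1536 values.
video = np.zeros((16, 64, 64, 3), dtype=np.float32)
patches = extract_spacetime_patches(video)
print(patches.shape)  # (128, 1536)
```

Because the token count simply follows from the input dimensions, the same model can consume a short vertical phone clip or a long widescreen video without resizing everything to one fixed shape.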
media
https://medium.com/towards-data-science/explaining-openai-soras-spacetime-patches-the-key-ingredient-e14e0703ec5b