Glossary term
Multimodal AI
A generative AI capability that produces video clips from natural-language descriptions.
OpenAI Sora (2024) generates photorealistic videos of up to 60 seconds from text prompts; demonstrations included a dolly shot of Tokyo at night and a mammoth walking through snow, rendered with temporal consistency.
Runway Gen-3 Alpha is used by advertising agencies to generate B-roll footage, product demonstrations, and visual effects shots, reportedly reducing production budgets for 30-second TV commercials by 40%.
Google Lumiere uses a Space-Time U-Net diffusion model for text-to-video generation and video editing; demonstrations included stylising existing videos and generating motion-consistent animated portraits from single images.