- Google DeepMind has developed a video-to-audio model that generates sound matched to input video footage.
- The model encodes the video into a compressed representation, then iteratively refines random noise into audio relevant to the input footage, a diffusion-style process (see the sketch after this list).
- DeepMind trained the model on a mix of video and audio datasets, augmented with AI-generated annotations, so it learns to associate visual events with specific sounds.
- The model can generate audio with or without a text prompt, but it still faces challenges: audio quality degrades when the input video is low quality.
- DeepMind sees the model as a complement to video generation models, enabling entirely AI-generated videos complete with soundtracks and dialogue.
- Other companies, including OpenAI (Sora) and Kuaishou (Kling), are also developing video generation models.
- Runway unveiled its Gen-3 Alpha model, trained on videos and images paired with descriptive captions, enabling immersive transitions and controlled camera movements.
- Pika secured $80 million in funding to enhance its AI video generation and editing platform, which offers features such as fine-grained editing and sound effects.
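For intuition, here is a minimal, hypothetical sketch of the diffusion-style process described above: compress the video into a conditioning vector, then iteratively refine random noise into an audio waveform. All module names, shapes, and the crude denoising schedule are illustrative assumptions, not DeepMind's actual architecture (which also supports optional text-prompt conditioning, omitted here for brevity).

```python
# Hypothetical sketch of video-conditioned audio diffusion. Not DeepMind's code.
import torch
import torch.nn as nn

class VideoEncoder(nn.Module):
    """Compresses a video clip into a single conditioning vector (assumed design)."""
    def __init__(self, dim=128):
        super().__init__()
        self.proj = nn.Linear(3 * 16 * 16, dim)  # toy per-frame projection

    def forward(self, video):
        # video: (batch, frames, 3, 16, 16) at a toy resolution
        b, f = video.shape[:2]
        frame_feats = self.proj(video.reshape(b, f, -1))  # (batch, frames, dim)
        return frame_feats.mean(dim=1)                    # pool over time

class Denoiser(nn.Module):
    """Predicts the noise in a noisy audio sample, given video conditioning."""
    def __init__(self, audio_len=8000, dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(audio_len + dim + 1, 256), nn.ReLU(),
            nn.Linear(256, audio_len),
        )

    def forward(self, noisy_audio, cond, t):
        t_feat = t.expand(noisy_audio.shape[0], 1)  # broadcast timestep
        return self.net(torch.cat([noisy_audio, cond, t_feat], dim=-1))

@torch.no_grad()
def generate_audio(video, steps=50, audio_len=8000):
    """Refine pure Gaussian noise into audio conditioned on the video."""
    encoder, denoiser = VideoEncoder(), Denoiser(audio_len)
    cond = encoder(video)
    audio = torch.randn(video.shape[0], audio_len)  # start from random noise
    for i in reversed(range(steps)):
        t = torch.tensor([[i / steps]])
        predicted_noise = denoiser(audio, cond, t)
        audio = audio - predicted_noise / steps  # crude denoising update
    return audio

# Usage: an 8-frame toy clip, yielding 8000 audio samples (1 s at 8 kHz).
clip = torch.randn(1, 8, 3, 16, 16)
waveform = generate_audio(clip)
print(waveform.shape)  # torch.Size([1, 8000])
```

In a real system the denoiser would be a much larger network operating on a compressed audio representation, and the update rule would follow a proper diffusion sampler; the loop above only illustrates the refine-noise-under-video-conditioning idea.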
Generative AI is expanding into video generation, with DeepMind, OpenAI, Kuaishou, Runway, and Pika all developing advanced models. These models aim to produce AI-generated videos with soundtracks and immersive camera work, with the potential to reshape media production. Progress is rapid, though challenges such as audio quality and visual artifacts persist.
Original article: https://www.theregister.com/2024/06/18/google_deepmind_video/