The capabilities of multimodal AI | Gemini Demo

Published On Dec 6, 2023

Our natively multimodal AI model, Gemini, is capable of reasoning across text, images, audio, video, and code. Here are our favorite moments with Gemini. Learn more and try the model: https://deepmind.google/gemini

Explore Gemini: https://goo.gle/how-its-made-gemini

For the purposes of this demo, latency has been reduced and Gemini outputs have been shortened for brevity.

Subscribe to our Channel: / google
Tweet with us on X: / google
Follow us on Instagram: / google
Join us on Facebook: / google

0:00 Intro
0:19 Multimodal Dialogue
1:32 Multilinguality
2:04 Game Creation
2:31 Visual Puzzles
3:17 Making Connections
3:39 Image & Text Generation
4:06 Logic & Spatial Reasoning
4:55 Translating Visuals
5:27 Cultural Understanding
