The video, which showcases the capabilities of the AI in a back-and-forth setting, was edited to make the AI appear more competent, according to Google.
Here’s what happened:
The video starts with a person asking the AI questions such as “Will this duck float?” and “Can you guess the country based on this map?” The AI responds accurately to these prompts, identifying the material of the duck, tracking the location of a hidden ball during a magic trick, and playing a game of “guess the country” based on a world map.
However, the AI’s responses in the video were not entirely authentic. In reality, the AI was fed a series of still images and text prompts for each scene in the video.
For example, when the person holds up a rubber duck, the AI correctly identifies the material and predicts whether it will float, but it does so based on a text prompt explaining the properties of rubber.
Similarly, during the magic trick, the AI’s responses were determined by pre-recorded images showing the sequence of events.
Despite the edited nature of the video, Google maintained that the prompts and outputs shown in it were real.
In a further attempt to showcase Gemini’s capabilities, the AI was also made to respond to voice commands and video, although these were also pre-recorded.
Finally, the world map scene was used to demonstrate another AI feature, where the AI was asked to come up with a game idea based on clues from the map.
In light of this information, it’s important to understand that the video was edited to make the AI’s responses appear more natural and seamless, but it was not a real-time demonstration of the AI’s capabilities.
Even so, the AI did demonstrate impressive capabilities, such as identifying materials from still images, predicting outcomes based on text prompts, and responding to pre-recorded voice and video inputs. This suggests that, with proper training and resources, AI can exhibit impressive capabilities, even in highly complex tasks like understanding and responding to human speech and gestures.