Just a day after OpenAI presented its innovative GPT-4o, capable of understanding and discussing what is happening in a video, Google responded with an equally interesting announcement. During the Google I/O conference keynote in Mountain View, California, Demis Hassabis, CEO of Google DeepMind, presented Project Astra, a research prototype that promises similar capabilities in video understanding.
Project Astra: the new frontier of AI research
During the demonstration, Project Astra displayed its extraordinary abilities. The model was able to identify objects that produce sounds, compose creative alliterations, explain code on a monitor and locate misplaced objects. The AI also revealed its potential in wearable devices such as smart glasses, where it can analyze diagrams, suggest improvements and generate witty responses to visual prompts.
How does Project Astra work?
By continuously processing and encoding video frames and voice input, Astra creates a timeline of events and stores information for quick recall. This ability to track and remember events in real time opens up new possibilities for AI assistance, making interactions more fluid and contextually relevant. For example, as seen in the video, if you show Astra a stylized sketch of Albert Einstein, it will be able to recognize it and provide you with all the relevant information, writing it down on a piece of paper.
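The behavior described above, encoding a stream of observations into a timeline and recalling them later (for instance, to answer "where did I leave my glasses?"), can be sketched in a few lines. This is purely illustrative: Google has not published Astra's architecture, and every name and structure below is a hypothetical simplification of the idea.

```python
from collections import deque
from dataclasses import dataclass

# Hypothetical sketch of the described behavior: timestamped events are
# appended to a bounded rolling buffer as frames are processed, and the
# assistant can scan backwards to recall the most recent sighting.

@dataclass
class Event:
    timestamp: float  # seconds into the session
    label: str        # object detected in the frame, e.g. "glasses"
    location: str     # where it was seen, e.g. "on the desk"

class Timeline:
    def __init__(self, max_events: int = 1000):
        # Bounded memory: the oldest events fall off as new ones arrive.
        self.events: deque = deque(maxlen=max_events)

    def observe(self, timestamp: float, label: str, location: str) -> None:
        self.events.append(Event(timestamp, label, location))

    def last_seen(self, label: str):
        # Scan backwards so the most recent sighting wins.
        for event in reversed(self.events):
            if event.label == label:
                return event
        return None

timeline = Timeline()
timeline.observe(0.0, "glasses", "on the desk")
timeline.observe(3.5, "apple", "in the fruit bowl")
timeline.observe(7.2, "glasses", "next to the keyboard")

hit = timeline.last_seen("glasses")
print(hit.location)  # → next to the keyboard
```

The key design point is the bounded buffer: real-time assistance requires remembering recent context without storing the entire video stream, which is why the demo's "where did I leave X?" recall is so notable.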
Future of Project Astra: Towards more advanced AI assistants
While Project Astra is still in an early stage with no specific launch plans, Google has hinted that some of its capabilities might be integrated into products within the year. Google CEO Sundar Pichai described this development as an attempt to create an AI agent with real "agency," one that can "think ahead, reason and plan on your own". This vision aims to transform interaction with artificial intelligence, making it not only more intuitive but also proactive. While OpenAI has already impressed with GPT-4o, Google shows it is ready to compete with equally revolutionary innovations.