Google has officially launched its upgraded AI model, Gemini 2.0, delivering enhanced performance and multimodal capabilities while paving the way for an era of AI agents. In addition, Trillium, Google's sixth-generation Tensor Processing Unit (TPU) first introduced at the I/O Developer Conference in May, is now in production supporting Gemini 2.0's training and inference workloads.
AI agents: Redefining productivity and automation
Google CEO Sundar Pichai highlighted the company's shift toward developing agentic AI models. These models are designed to understand the surrounding environment, plan tasks across multiple steps, and take action under user supervision.
According to reports from TechCrunch and VentureBeat, Google has unveiled three AI agent prototypes built on the Gemini 2.0 architecture: Project Astra, Project Mariner, and Jules, each tailored for distinct applications ranging from daily tasks to complex programming and web navigation.
Project Astra: General-purpose AI for seamless conversations
Project Astra can support multilingual conversations, access and integrate data from Google apps and tools like Search and Maps, and retain conversation history for seamless continuity. For example, users can share a screenshot of a reading list, activate the camera, and let Astra "see" nearby books to recommend the most suitable one as a gift.
Credit: Screenshot from Google
Project Mariner: Web navigation reinvented
Project Mariner, designed for developers and enterprise users, is a Chrome extension powered by Gemini 2.0, enabling automated web navigation. By capturing screenshots and processing them in the cloud, the AI agent can interpret and execute tasks, such as online shopping based on a list.
Jaclyn Konzelmann, Director of Product Management at Google Labs, demonstrated how users can prompt Mariner to locate products, add them to carts, and complete purchases. While Mariner achieved an 83.5% task success rate on the WebVoyager benchmark, it still operates with a noticeable latency of about five seconds between cursor movements.
Konzelmann emphasized that AI-driven web navigation represents a paradigm shift in user experience, potentially reducing the need to visit websites directly—a development that may significantly impact online publishers and retailers.
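The observe-plan-act loop described above (capture a screenshot, have a cloud model interpret it, execute the resulting action) can be sketched in a few lines. This is a hypothetical illustration of the general pattern, not Mariner's actual implementation; all function and field names here (`capture_screenshot`, `plan_next_action`, `apply_action`) are illustrative stand-ins, and the model call is replaced by a trivial rule that works through a shopping list.

```python
# Hypothetical sketch of a screenshot-driven browsing agent loop.
# Not Google's code: every name here is an illustrative stand-in.

from dataclasses import dataclass


@dataclass
class Action:
    kind: str          # e.g. "click" or "done"
    target: str = ""   # element the action applies to


def goal_items(goal: str) -> list:
    """Parse a comma-separated shopping list."""
    return [s.strip() for s in goal.split(",")]


def capture_screenshot(page_state: dict) -> str:
    # Stand-in for grabbing the rendered page; returns a text summary here.
    return f"page showing: {', '.join(page_state['visible'])}"


def plan_next_action(screenshot: str, goal: str, history: list) -> Action:
    # Stand-in for the cloud model call that interprets the screenshot.
    # Trivial rule: "click" each list item not yet handled, then stop.
    done_targets = [a.target for a in history]
    for item in goal_items(goal):
        if item not in done_targets:
            return Action("click", target=item)
    return Action("done")


def apply_action(page_state: dict, action: Action) -> None:
    # Stand-in for dispatching a real click/keystroke into the browser.
    page_state["cart"].append(action.target)


def run_agent(goal: str, page_state: dict, max_steps: int = 10) -> list:
    """Observe -> plan -> act until the planner says it is done."""
    history = []
    for _ in range(max_steps):
        shot = capture_screenshot(page_state)           # observe
        action = plan_next_action(shot, goal, history)  # plan
        if action.kind == "done":
            break
        apply_action(page_state, action)                # act
        history.append(action)
    return page_state["cart"]
```

For example, `run_agent("flour, eggs", {"visible": ["flour", "eggs", "milk"], "cart": []})` fills the cart with the two listed items and then halts. The ~5-second latency reported for Mariner would correspond to the round trip inside each loop iteration, where the screenshot is uploaded and interpreted in the cloud.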
Credit: Screenshot from Google
Jules: A coding companion for developers
Jules is designed to assist developers by analyzing codebases, suggesting repair plans, and executing fixes across multiple files. It also integrates directly with platforms such as GitHub, making it similar in spirit to Microsoft's GitHub Copilot.
Currently, Jules is available only to a select group of trusted testers, with broader access anticipated in early 2025.
Trillium: The engine behind Gemini 2.0
Behind the scenes, Trillium has been instrumental in powering the training and inference of Gemini 2.0, as Pichai noted. Compared to its predecessor, the chip delivers a 4.7x performance boost, double the HBM capacity and bandwidth, and a 67% improvement in energy efficiency.
Google has deployed over 100,000 Trillium chips in a single system, connected by its Jupiter network architecture. This configuration achieves 13 petabytes per second of aggregate data transfer, allowing hundreds of thousands of accelerators to work simultaneously on a single training task.
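As a rough sense of the scale these two quoted figures imply, dividing the aggregate bandwidth by the chip count gives the average network bandwidth available per chip. This is a back-of-envelope sketch using only the numbers above; the per-chip interpretation is our assumption, not an official Google figure.

```python
# Back-of-envelope: average network bandwidth per chip, derived only
# from the two quoted figures (~100,000 chips, 13 PB/s aggregate).
# The per-chip reading is an assumption for illustration.

chips = 100_000
aggregate_bytes_per_s = 13e15  # 13 petabytes per second

per_chip = aggregate_bytes_per_s / chips
print(f"{per_chip / 1e9:.0f} GB/s per chip")  # prints "130 GB/s per chip"
```

In other words, the Jupiter fabric as described would leave each accelerator with on the order of 130 GB/s of network bandwidth even at full deployment.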
Mark Lohmeyer, VP of Compute and AI Infrastructure at Google Cloud, highlighted that testing Llama 2 70B with Trillium showed a clear correlation between performance gains and the number of chips used.