Robots Atlas
Preview
Apr 14, 2026
Hosted UI, Cloud

Gemini Robotics 1.5

Multimodal, Robotics FM, VLA

Vision-Language-Action (VLA) model by Google DeepMind that converts visual inputs and language instructions into motor commands for robots.

Technical specification

Context window: 0K
Tools: No
Fine-tuning: No
Weights access: Closed
Last updated: May 2, 2026

Modalities

Input: Text, Image
Output: Text, Action

Capabilities

6 capabilities (category in parentheses):

Reasoning (Reasoning)
Multi-step reasoning (Reasoning)
Planning (Planning)
Image understanding (Vision)
Multimodal understanding (Multimodality)
Multilingual (Language)

Applications