Gemini 3.1 Flash-Lite
LLM · Multimodal · Reasoning · Tool-Using
Gemini 3.1 Flash-Lite is the most cost-efficient thinking model in Google DeepMind's Gemini 3 series, designed for high throughput and low latency while retaining reasoning quality.
Available on
Technical specification
Context window
Max output
License
Tools
Fine-tuning
Weights access
Knowledge cutoff
Last updated: May 1, 2026
Modalities
Input
Text
Image
Audio
Video
Documents
Output
Text
Code
Capabilities (13)
Reasoning★
Reasoning
Multi-step reasoning★
Reasoning
Long context★
Reasoning
Multimodal understanding★
Multimodality
Coding★
Coding
Function Calling
Planning
Structured output★
Structured gen.
Audio understanding
Audio
Image understanding★
Vision
Video Understanding
Other
Chart understanding
Vision
Multilingual★
Language
Streaming output
Reasoning
Architecture and technologies
Core Architecture
Form / Family (2)
Training Techniques (2)
Applications (3)
Public · USD · per 1M tokens
Public pricing is available through the Gemini API. Price per 1M tokens: input $0.25, output $1.50. This is the lowest price of any thinking model in the Gemini 3 series.
Price per 1M tokens · USD
Standard
Input $0.25 per 1M tokens, output $1.50 per 1M tokens. Speed: 363 tokens/s.
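To make the Standard rates concrete, here is a minimal sketch of the arithmetic for one request. The prices and the 363 tokens/s figure are taken from this listing; the function names are hypothetical, and treating the speed as a constant is an assumption. The sketch ignores anything the listing does not state, such as discounts or how thinking tokens are counted.

```python
# Prices copied from the listing (USD per 1M tokens, Standard tier).
INPUT_USD_PER_MILLION = 0.25
OUTPUT_USD_PER_MILLION = 1.50

# Listed output speed. Assumption: treated as a constant rate here;
# real throughput will vary with load and request shape.
OUTPUT_TOKENS_PER_SECOND = 363


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD at the listed rates."""
    total_micro_usd = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total_micro_usd / 1_000_000


def estimate_generation_seconds(output_tokens: int) -> float:
    """Estimate how long the output takes to generate at the listed speed."""
    return output_tokens / OUTPUT_TOKENS_PER_SECOND


if __name__ == "__main__":
    cost = estimate_cost_usd(input_tokens=200_000, output_tokens=10_000)
    seconds = estimate_generation_seconds(output_tokens=10_000)
    print(f"cost: ${cost:.4f}, generation time: {seconds:.1f} s")
    # prints "cost: $0.0650, generation time: 27.5 s"
```

In this example the 200,000 input tokens cost $0.05 and the 10,000 output tokens cost $0.015. Output is six times more expensive per token than input, so for output-heavy workloads the output term dominates the bill.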
Security and enterprise
The model card is publicly available.
Official Privacy Center
Updated: May 1, 2026
