Gemini 2.5 Flash-Lite
Fastest and most budget-friendly multimodal model in the Gemini 2.5 family.
Context window
1,048,576 tokens
Max output
65,536 tokens
Input price
Per 1M input tokens
Output price
Per 1M output tokens
Modalities
Input: Text, Image, Video, Audio; output: Text
API surface
2 supported endpoints
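The two token limits above can be checked before a request is sent. The following is a minimal sketch, assuming the context window bounds the input and the max output bounds the requested completion independently; the function name and the idea of a pre-computed token count are illustrative and not part of any official SDK.

```python
# Limits copied from the summary above.
CONTEXT_WINDOW_TOKENS = 1_048_576
MAX_OUTPUT_TOKENS = 65_536


def check_token_budget(input_tokens: int, requested_output_tokens: int) -> list[str]:
    """Return a list of problems; an empty list means the request fits.

    Assumption: the two limits are enforced separately. If the provider
    counts output against the context window, tighten the first check.
    """
    problems = []
    if input_tokens > CONTEXT_WINDOW_TOKENS:
        problems.append(
            f"input of {input_tokens} tokens exceeds the "
            f"{CONTEXT_WINDOW_TOKENS}-token context window"
        )
    if requested_output_tokens > MAX_OUTPUT_TOKENS:
        problems.append(
            f"requested output of {requested_output_tokens} tokens exceeds the "
            f"{MAX_OUTPUT_TOKENS}-token output limit"
        )
    return problems
```

Calling `check_token_budget(900_000, 8_192)` returns an empty list, while a requested output of 100,000 tokens is reported as over the limit.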
Overview
Where this model fits best
Use this section to quickly decide whether the model belongs in chat, coding, reasoning, embedding, rerank, vision, audio, or agent workflows.
Use cases
What this model should be considered for
Best fit
Consider this model when speed and cost are the priority and the workload takes text, image, video, or audio input and needs text output. The registry entry collects its pricing, capabilities, and operational limits in one place.
Capabilities
What you can actually do with it
Feature flags, API endpoints, and tool support are separated so integration constraints are easy to scan.
Capabilities
High-level features exposed by the model runtime.
Endpoints
APIs and surfaces this model can be called through.
Pricing
Pricing and billing signals
Known prices are shown per 1M tokens. Missing official prices are marked as not published instead of being treated as free.
Standard API
Input
Output
Default request pricing when using the primary endpoint.
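Per-1M-token pricing translates into a request cost as sketched below. This is an illustrative helper, not registry code: the function name is hypothetical, and the prices in the usage note are placeholders because this page does not show the actual figures. An unpublished price is modeled as `None` so that it is never silently treated as free.

```python
from typing import Optional

TOKENS_PER_PRICE_UNIT = 1_000_000  # prices are quoted per 1M tokens


def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_1m: Optional[float],
    output_price_per_1m: Optional[float],
) -> Optional[float]:
    """Estimate the cost of one request in the price's currency.

    Returns None when either price is not published, so callers must
    handle the unknown case explicitly instead of assuming zero.
    """
    if input_price_per_1m is None or output_price_per_1m is None:
        return None
    input_cost = input_tokens / TOKENS_PER_PRICE_UNIT * input_price_per_1m
    output_cost = output_tokens / TOKENS_PER_PRICE_UNIT * output_price_per_1m
    return input_cost + output_cost
```

With placeholder prices of 0.10 per 1M input tokens and 0.40 per 1M output tokens, a request with 2,000,000 input tokens and 500,000 output tokens costs 0.20 + 0.20 = 0.40. Passing `None` for either price returns `None`.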
Controls
Runtime knobs worth knowing
Supported request parameters help developers understand sampling, output limits, reasoning controls, and structured output behavior.
Temperature
0 to 2
Top P
0 to 1
Presence Penalty
-2 to 2
Frequency Penalty
-2 to 2
Response Format
text, json_object
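The ranges above can be enforced client-side before a request is sent. The sketch below is one way to do that; the snake_case parameter names and the treatment of the response format as a plain string are assumptions, since this page lists only display labels, and the actual request schema may differ by endpoint.

```python
from typing import Any

# Ranges copied from the list above; the key names are assumed.
NUMERIC_RANGES = {
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 1.0),
    "presence_penalty": (-2.0, 2.0),
    "frequency_penalty": (-2.0, 2.0),
}
RESPONSE_FORMATS = {"text", "json_object"}


def validate_params(params: dict[str, Any]) -> list[str]:
    """Return human-readable errors; an empty list means all values are in range."""
    errors = []
    for name, (low, high) in NUMERIC_RANGES.items():
        if name not in params:
            continue  # omitted parameters fall back to provider defaults
        value = params[name]
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            errors.append(f"{name} must be a number")
        elif not low <= value <= high:
            errors.append(f"{name} must be between {low} and {high}")
    fmt = params.get("response_format")
    if fmt is not None and (not isinstance(fmt, str) or fmt not in RESPONSE_FORMATS):
        errors.append("response_format must be one of: json_object, text")
    return errors
```

For example, `validate_params({"temperature": 2.5})` reports that the temperature is out of range, while `{"temperature": 0.7, "top_p": 0.9, "response_format": "json_object"}` passes with no errors.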
Specifications
Technical reference
Canonical identifiers, family, modalities, token limits, training information, and update metadata.
Model ID
gemini-2.5-flash-lite
Use this exact identifier in API calls and SDK configuration.
Provider
Family
Access type
Input modalities
Output modalities
Max output tokens