Google: Gemini 2.5 Flash Lite Preview 06-17
Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers higher throughput, faster token generation, and better performance on common benchmarks than earlier Flash models. By default, "thinking" (the model's internal reasoning pass) is disabled to prioritize speed, but developers can enable it via the reasoning API parameter to selectively trade speed and cost for higher-quality responses.
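As a rough illustration, the sketch below enables thinking for a single request using the official google-genai Python SDK; the model ID and the thinking_budget value are assumptions for this preview release and may differ in your environment.

```python
# Minimal sketch: enable "thinking" for one request via the
# google-genai Python SDK (pip install google-genai).
# Model ID and thinking_budget are assumptions for the 06-17 preview.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

response = client.models.generate_content(
    model="gemini-2.5-flash-lite-preview-06-17",  # assumed preview model ID
    contents="Outline a 3-step plan to deduplicate a 10M-row CSV.",
    config=types.GenerateContentConfig(
        # 0 disables thinking (the default for Flash-Lite);
        # a positive token budget enables it for this request.
        thinking_config=types.ThinkingConfig(thinking_budget=1024),
    ),
)
print(response.text)
```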
Model Information
Pricing Information
Supported Parameters
Common Use Cases
Multimodal Tasks
- Image analysis and description
- Visual question answering
- Document understanding
- Content moderation
General Applications
- Chatbots and virtual assistants
- Educational content creation
- Research and analysis
- Automation and workflow
Frequently Asked Questions
What is the context length of this model?
This model has a context length of 1 million tokens, which means it can process and attend to up to about 1 million tokens of text in a single conversation or request.
How much does it cost to use this model?
Prompt tokens cost $0.10 per 1M tokens and completion tokens cost $0.40 per 1M tokens.
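For a back-of-the-envelope estimate, the snippet below (a simple sketch, not an official billing tool) converts token counts into dollars using the rates above; for example, 100,000 prompt tokens plus 10,000 completion tokens comes to about $0.014.

```python
# Sketch: estimate per-request cost from token counts using the
# per-1M-token rates listed above.
PROMPT_RATE = 0.10 / 1_000_000      # USD per prompt token
COMPLETION_RATE = 0.40 / 1_000_000  # USD per completion token

def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Return the estimated USD cost for a single request."""
    return prompt_tokens * PROMPT_RATE + completion_tokens * COMPLETION_RATE

# 100k prompt tokens + 10k completion tokens ≈ $0.014
print(f"${estimate_cost(100_000, 10_000):.4f}")
```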
What modalities does this model support?
This model supports multimodal input, accepting text, image, file, and audio as input and producing text as output.
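As a rough sketch of multimodal input with the google-genai Python SDK (the file name and prompt are placeholders, and the model ID is assumed for this preview):

```python
# Sketch: send an image plus a text prompt and receive a text answer.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

with open("invoice.png", "rb") as f:  # placeholder local file
    image_bytes = f.read()

response = client.models.generate_content(
    model="gemini-2.5-flash-lite-preview-06-17",  # assumed preview model ID
    contents=[
        types.Part.from_bytes(data=image_bytes, mime_type="image/png"),
        "Describe this document and list any line items it contains.",
    ],
)
print(response.text)  # output modality is text only
```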
When was this model created?
This model was created on June 17, 2025.