google logo
Google: Gemini 2.5 Flash Lite Preview 09-2025

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better performance across common benchmarks compared to earlier Flash models. By default, "thinking" (i.e. multi-pass reasoning) is disabled to prioritize speed, but developers can enable it via the Reasoning API parameter to selectively trade off cost for intelligence.

Context Length:1.0M tokens
Pricing:$0.10M
Created:September 25, 2025

Model Information

1.0M
September 25, 2025
Gemini
text+image->text
fileimagetextaudio
text

Pricing Information

Prompt
$0.10M
per 1M tokens
Completion
$0.40M
per 1M tokens

Supported Parameters

include_reasoningmax_tokensreasoningresponse_formatseedstopstructured_outputstemperaturetool_choicetoolstop_p

Common Use Cases

Multimodal Tasks

  • • Image analysis and description
  • • Visual question answering
  • • Document understanding
  • • Content moderation

Image Generation

  • • Creative artwork generation
  • • Product visualization
  • • Marketing materials
  • • Concept art and design

General Applications

  • • Chatbots and virtual assistants
  • • Educational content creation
  • • Research and analysis
  • • Automation and workflow

Frequently Asked Questions

What is the context length of this model?

This model has a context length of 1.0M tokens, which means it can process and remember up to 1.0M tokens of text in a single conversation or request.

How much does it cost to use this model?

Prompt tokens cost $0.10M/1M tokens and completion tokens cost $0.40M/1M tokens.

What modalities does this model support?

This model supports text+image->text modality, accepting file and image and text and audioas input and producing text as output.

When was this model created?

This model was created on September 25, 2025.

Google: Gemini 2.5 Flash Lite Preview 09-2025 - AI Model Details & Pricing | Zenspeed | Zenspeed