Google: Gemini 2.5 Flash Lite Preview 06-17
Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers higher throughput, faster token generation, and better performance on common benchmarks than earlier Flash models. By default, "thinking" (the model's internal reasoning pass) is disabled to prioritize speed, but developers can enable it via the reasoning API parameter to selectively trade speed and cost for higher-quality responses.
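As a rough illustration, the sketch below enables thinking for a single request using the official google-genai Python SDK; the model ID and the thinking_budget value are assumptions for this preview release and may differ in your environment.

```python
# Minimal sketch: enable "thinking" for one request via the
# google-genai Python SDK (pip install google-genai).
# Model ID and thinking_budget are assumptions for the 06-17 preview.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

response = client.models.generate_content(
    model="gemini-2.5-flash-lite-preview-06-17",  # assumed preview model ID
    contents="Outline a 3-step plan to deduplicate a 10M-row CSV.",
    config=types.GenerateContentConfig(
        # 0 disables thinking (the default for Flash-Lite);
        # a positive token budget enables it for this request.
        thinking_config=types.ThinkingConfig(thinking_budget=1024),
    ),
)
print(response.text)
```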
Model Information
Pricing Information
Supported Parameters
Common Use Cases
Multimodal Tasks
- Image analysis and description
- Visual question answering
- Document understanding
- Content moderation
General Applications
- Chatbots and virtual assistants
- Educational content creation
- Research and analysis
- Automation and workflow
Frequently Asked Questions
What is the context length of this model?
This model has a context length of 1 million tokens, which means it can process and attend to up to about 1 million tokens of text in a single conversation or request.
How much does it cost to use this model?
Prompt tokens cost $0.10 per 1M tokens and completion tokens cost $0.40 per 1M tokens.
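For a back-of-the-envelope estimate, the snippet below (a simple sketch, not an official billing tool) converts token counts into dollars using the rates above; for example, 100,000 prompt tokens plus 10,000 completion tokens comes to about $0.014.

```python
# Sketch: estimate per-request cost from token counts using the
# per-1M-token rates listed above.
PROMPT_RATE = 0.10 / 1_000_000      # USD per prompt token
COMPLETION_RATE = 0.40 / 1_000_000  # USD per completion token

def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Return the estimated USD cost for a single request."""
    return prompt_tokens * PROMPT_RATE + completion_tokens * COMPLETION_RATE

# 100k prompt tokens + 10k completion tokens ≈ $0.014
print(f"${estimate_cost(100_000, 10_000):.4f}")
```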
What modalities does this model support?
This model supports multimodal input, accepting text, image, file, and audio as input and producing text as output.
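As a rough sketch of multimodal input with the google-genai Python SDK (the file name and prompt are placeholders, and the model ID is assumed for this preview):

```python
# Sketch: send an image plus a text prompt and receive a text answer.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

with open("invoice.png", "rb") as f:  # placeholder local file
    image_bytes = f.read()

response = client.models.generate_content(
    model="gemini-2.5-flash-lite-preview-06-17",  # assumed preview model ID
    contents=[
        types.Part.from_bytes(data=image_bytes, mime_type="image/png"),
        "Describe this document and list any line items it contains.",
    ],
)
print(response.text)  # output modality is text only
```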
When was this model created?
This model was created on June 17, 2025.