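Google Models
Explore all 6 of Google's models, including detailed pricing, pros and cons, and developer recommendations.
Models: 6
Lowest input price: $0.10 per 1M tokens
Maximum context: 1M tokens
Quality tiers: 3
Quick recommendations
Best value: Gemini 2.0 Flash ($0.10 per 1M input tokens)
Best quality: Gemini 3.1 Pro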
Gemini 3.1 Pro
Flagship · Advanced reasoning, coding
Use case: The strongest Google model for complex reasoning and professional coding tasks.
Key upgrades
- MMLU: 89.2%, a significant reasoning improvement over Gemini 2.5 Pro
- 1M context with 65K output: same context, better quality
- 50% batch discount ($1/$6 per 1M input/output tokens): cheapest batch flagship available
- Context caching: $0.50/M, a 75% saving on repeated prefixes
- Paid-only preview: no free tier, but production-grade reliability
Input price: $2.00 per 1M tokens
Output price: $12.00 per 1M tokens
Cached input: $0.50 per 1M tokens
Batch input: $1.00 per 1M tokens
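As a rough illustration of how these per-1M-token prices translate into a per-request cost, the sketch below bills uncached input, cached input, and output tokens at the Gemini 3.1 Pro rates listed above. The function name `request_cost` and the example token counts are hypothetical, and real billing details (cache storage fees, long-context surcharges) are not modeled.

```python
# Listed Gemini 3.1 Pro prices in USD per 1M tokens.
INPUT_PRICE = 2.00
OUTPUT_PRICE = 12.00
CACHED_INPUT_PRICE = 0.50


def request_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one request.

    `cached_tokens` is the part of `input_tokens` served from the context
    cache (a hypothetical parameter; exact billing rules are an assumption).
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached = input_tokens - cached_tokens
    total = (
        uncached * INPUT_PRICE
        + cached_tokens * CACHED_INPUT_PRICE
        + output_tokens * OUTPUT_PRICE
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Example: 100K-token prompt and 2K-token answer, with and without an 80K cached prefix.
    print(f"no cache:   ${request_cost(100_000, 2_000):.4f}")
    print(f"with cache: ${request_cost(100_000, 2_000, cached_tokens=80_000):.4f}")
```

In this example the cached prefix cuts the estimate from about $0.224 to about $0.104 per request, which is why caching matters most for long, repeated prompts.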
Context window: 1M
Maximum output: 65,536 tokens
Knowledge cutoff: 2025-06
Features: Vision, Function calling, Fine-tuning, JSON mode
Pros
- 1M context with 65K output
- Significantly improved reasoning over 3.0
- 50% batch discount available
Cons
- No free tier (paid-only preview)
- Input price doubles for prompts above 200K tokens
- Still maturing
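The long-context surcharge in the list above can change the cost of large prompts noticeably. The sketch below is a minimal illustration that assumes the doubled rate applies to the entire prompt once it exceeds 200,000 tokens; the source does not say whether only the excess tokens are surcharged, and `input_cost` is a hypothetical helper.

```python
BASE_INPUT_PRICE = 2.00            # USD per 1M input tokens (listed Gemini 3.1 Pro price)
LONG_CONTEXT_THRESHOLD = 200_000   # tokens, from the "above 200K" note
LONG_CONTEXT_MULTIPLIER = 2.0      # "2x" from the same note


def input_cost(prompt_tokens: int) -> float:
    """Estimate input cost in USD.

    Assumption: the whole prompt is billed at the doubled rate once it
    exceeds the threshold.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    rate = BASE_INPUT_PRICE
    if prompt_tokens > LONG_CONTEXT_THRESHOLD:
        rate *= LONG_CONTEXT_MULTIPLIER
    return prompt_tokens * rate / 1_000_000


if __name__ == "__main__":
    for tokens in (150_000, 200_000, 500_000):
        print(f"{tokens:>7} tokens: ${input_cost(tokens):.2f}")
```

Under this assumption a 500K-token prompt costs about $2.00 in input rather than $1.00, so splitting or trimming prompts to stay under the threshold may be worth considering.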
Performance
Output speed: ~55 tok/s
Rate limit: 5,000 RPM
Multimodal capabilities: Image input, Image output, Audio input, Audio output
Benchmarks
MMLU: 89.2%
SWE-bench Verified: 68.0%
GPQA: 74.5%
Gemini 3.5 Flash
Mid-tier · Fast, intelligent generation
Use case: Best balance of speed and intelligence for production apps needing multimodal capability.
Key upgrades
- Frontier intelligence at a mid-tier price: matches Gemini 2.5 Pro quality
- 1M context + 65K output at $1.50/$9: 25% cheaper than the flagship Gemini 3.1 Pro ($2.00/$12.00)
- Free tier available with rate limits: best free-tier model from Google
- Context caching: $0.375/M, a 75% saving on cached input
- Multimodal (text + vision) at flash speed: sub-1s typical latency
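Input price: $1.50 per 1M tokens
Output price: $9.00 per 1M tokens
Cached input: $0.375 per 1M tokens
Batch input: $0.75 per 1M tokens
Context window: 1M
Maximum output: 65,536 tokens
Knowledge cutoff: 2025-06
Features: Vision, Function calling, Fine-tuning, JSON mode, Free tier
Pros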
- Frontier intelligence at mid-tier price
- 1M context + 65K output
- Free tier with rate limits
Cons
- More expensive than 3.1 Flash-Lite
- Output cost at $9/M is significant at scale
- No fine-tuning
Performance
Output speed: ~95 tok/s
Rate limit: 10,000 RPM
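To put the output speed in context, a back-of-the-envelope estimate divides the response length by the listed tokens per second. The sketch below ignores time to first token and network overhead, and `generation_seconds` is a hypothetical helper rather than anything provided by Google.

```python
OUTPUT_SPEED_TOK_PER_S = 95   # approximate listed output speed for Gemini 3.5 Flash
MAX_OUTPUT_TOKENS = 65_536    # listed maximum output


def generation_seconds(output_tokens: int, tok_per_s: float = OUTPUT_SPEED_TOK_PER_S) -> float:
    """Rough streaming time for a response, excluding time to first token."""
    if tok_per_s <= 0:
        raise ValueError("tok_per_s must be positive")
    return output_tokens / tok_per_s


if __name__ == "__main__":
    for n in (500, 4_000, MAX_OUTPUT_TOKENS):
        print(f"{n:>6} tokens: ~{generation_seconds(n):.0f} s")
```

A short 500-token answer streams in a few seconds at this rate, while a full 65,536-token response would take on the order of ten minutes, so maximum-length outputs are better suited to background jobs than interactive use.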
Multimodal capabilities: Image input, Image output, Audio input, Audio output
Benchmarks
MMLU: 86.5%
HumanEval: 85.0%
MATH: 73.0%
Gemini 3.1 Flash-Lite
Lite · High-volume, low-latency tasks
Use case: Best budget model for high-volume classification, extraction, and lightweight generation.
Key upgrades
- 1M context at $0.25/M input: 6x cheaper than Gemini 3.5 Flash ($1.50/M)
- 65K max output: same as the Pro tier at lite pricing
- Context caching: $0.0625/M, a 75% saving on cached input
- Batch API: $0.125/$0.75, a 50% saving for async processing
- Free tier with rate limits: prototyping at zero cost
Input price: $0.25 per 1M tokens
Output price: $1.50 per 1M tokens
Cached input: $0.0625 per 1M tokens
Batch input: $0.125 per 1M tokens
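For high-volume workloads, the batch prices quoted above ($0.125 input, $0.75 output per 1M tokens) can be compared against the standard rates. The sketch below does that for a made-up classification job; the document counts and token sizes are hypothetical, and `workload_cost` is an illustrative helper.

```python
# USD per 1M tokens for Gemini 3.1 Flash-Lite, as listed above.
STANDARD = {"input": 0.25, "output": 1.50}
BATCH = {"input": 0.125, "output": 0.75}


def workload_cost(prices: dict, input_tokens: int, output_tokens: int) -> float:
    """Total USD cost of a workload at the given per-1M-token prices."""
    return (input_tokens * prices["input"] + output_tokens * prices["output"]) / 1_000_000


if __name__ == "__main__":
    # Hypothetical job: 1M documents, about 800 input and 50 output tokens each.
    docs, in_tok, out_tok = 1_000_000, 800, 50
    standard = workload_cost(STANDARD, docs * in_tok, docs * out_tok)
    batch = workload_cost(BATCH, docs * in_tok, docs * out_tok)
    print(f"standard: ${standard:,.2f}")
    print(f"batch:    ${batch:,.2f}")
    print(f"saving:   {100 * (1 - batch / standard):.0f}%")
```

Because both batch rates are half the standard ones, the saving is 50% regardless of the input/output mix; the trade-off is that batch results arrive asynchronously.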
Context window: 1M
Maximum output: 65,536 tokens
Knowledge cutoff: 2025-04
Features: Vision, Function calling, Fine-tuning, JSON mode, Free tier
Pros
- 1M context at $0.25/M input, strong value
- 65K max output
- 75% context caching savings
Cons
- Quality below 3.5 Flash for complex tasks
- No fine-tuning
- Newer model, less battle-tested
Performance
Output speed: ~130 tok/s
Rate limit: 15,000 RPM
Multimodal capabilities: Image input, Image output, Audio input, Audio output
Benchmarks
MMLU: 79.0%
HumanEval: 74.5%
Gemini 2.5 Pro
Flagship · Complex reasoning, long context
Use case: Best-value flagship model, with 1M context at lower cost than GPT-4.1 or Claude Sonnet.
Key upgrades
- Cheapest flagship input: $1.25/M, roughly 40% cheaper than GPT-4.1
- 1M context window with 65K output: largest output among flagships
- Context caching: $0.315/M, about a 75% saving on repeated prefixes
- Free tier available: only flagship model with free access
- MMLU: 85.4%, competitive with GPT-4.1 at a lower input price
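Input price: $1.25 per 1M tokens
Output price: $10.00 per 1M tokens
Cached input: $0.315 per 1M tokens
Batch input: not available
Context window: 1M
Maximum output: 65,536 tokens
Knowledge cutoff: 2025-01
Features: Vision, Function calling, Fine-tuning, JSON mode, Free tier
Pros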
- 1M context window
- Cheapest flagship input at $1.25
- 65K max output
Cons
- No batch API
- No fine-tuning
- Higher output cost
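Performance
Output speed: ~50 tok/s
Rate limit: 5,000 RPM
Multimodal capabilities: Image input, Image output, Audio input, Audio output
Benchmarks
MMLU: 85.4%
SWE-bench Verified: 63.8%
MATH: 72.0%
Agents using this model: 2
Gemini 2.5 Flash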
Mid-tier · Fast, efficient generation
Use case: Strong value for production apps, with 1M context at $0.15 per 1M input tokens.
Key upgrades
- Same 1M context as Pro at about 1/8th the input price ($0.15 vs $1.25/M)
- 65K max output: identical to the Pro tier for generation capacity
- Context caching: $0.0375/M, a 75% saving on cached prefixes
- Free tier with generous rate limits: production-ready at zero cost
- Sub-second latency, suited to real-time apps
Input price: $0.15 per 1M tokens
Output price: $0.60 per 1M tokens
Cached input: $0.0375 per 1M tokens
Batch input: not available
Context window: 1M
Maximum output: 65,536 tokens
Knowledge cutoff: 2025-01
Features: Vision, Function calling, Fine-tuning, JSON mode, Free tier
Pros
- Same 1M context as Pro at about 1/8th the input price
- 65K max output
- 75% cached input savings
Cons
- No batch API
- Quality below Pro for hard tasks
- No fine-tuning
Performance
Output speed: ~100 tok/s
Rate limit: 10,000 RPM
Multimodal capabilities: Image input, Image output, Audio input, Audio output
Benchmarks
MMLU: 83.0%
HumanEval: 82.5%
MATH: 68.0%
Gemini 2.0 Flash
Lite · Ultra-fast, low cost
Use case: Best budget option for high-volume extraction, classification, and fine-tuned models.
Key upgrades
- Cheapest model with 1M context: $0.10/M input, lowest in market
- Only lite model with fine-tuning + vision: full feature set
- Context caching: $0.025/M, a 75% saving on repeated prefixes
- 8K max output is low: upgrade to 2.5 Flash for 65K output
- Fine-tuning support: best for domain-specific classification models
Input price: $0.10 per 1M tokens
Output price: $0.40 per 1M tokens
Cached input: $0.025 per 1M tokens
Batch input: not available
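The saving implied by a cached input price can be checked as 1 minus the ratio of cached price to standard input price. The sketch below applies that ratio to the input and cached input prices listed for each model on this page, using the unrounded caching figures from the feature bullets; `cache_savings_pct` is a hypothetical helper.

```python
# (input price, cached input price) in USD per 1M tokens, as listed for each model.
PRICES = {
    "Gemini 3.1 Pro": (2.00, 0.50),
    "Gemini 3.5 Flash": (1.50, 0.375),
    "Gemini 3.1 Flash-Lite": (0.25, 0.0625),
    "Gemini 2.5 Pro": (1.25, 0.315),
    "Gemini 2.5 Flash": (0.15, 0.0375),
    "Gemini 2.0 Flash": (0.10, 0.025),
}


def cache_savings_pct(input_price: float, cached_price: float) -> float:
    """Percentage saved per cached token relative to the standard input price."""
    if input_price <= 0:
        raise ValueError("input_price must be positive")
    return (1 - cached_price / input_price) * 100


if __name__ == "__main__":
    for name, (input_price, cached_price) in PRICES.items():
        print(f"{name:<22} {cache_savings_pct(input_price, cached_price):.1f}%")
```

With the listed prices, every model comes out at roughly 75%, so caching does not favor one model over another on a percentage basis; the absolute saving per token is simply larger on the more expensive models.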
Context window: 1M
Maximum output: 8,192 tokens
Knowledge cutoff: 2024-08
Features: Vision, Function calling, Fine-tuning, JSON mode, Free tier
Pros
- Cheapest model in market with 1M context
- Only lite model with fine-tuning + vision
- Full feature set
Cons
- 8K max output is low
- Older knowledge cutoff
- Quality below 2.5 Flash
Performance
Output speed: ~140 tok/s
Rate limit: 15,000 RPM
Multimodal capabilities: Image input, Image output, Audio input, Audio output
Benchmarks
MMLU: 78.5%
HumanEval: 75.0%
MATH: 62.0%
Side-by-side comparison
| Model | Tier | Input ($/1M tokens) | Output ($/1M tokens) | Context |
|---|---|---|---|---|
| Gemini 3.1 Pro | Flagship | $2.00 | $12.00 | 1M |
| Gemini 3.5 Flash | Mid-tier | $1.50 | $9.00 | 1M |
| Gemini 3.1 Flash-Lite | Lite | $0.25 | $1.50 | 1M |
| Gemini 2.5 Pro | Flagship | $1.25 | $10.00 | 1M |
| Gemini 2.5 Flash | Mid-tier | $0.15 | $0.60 | 1M |
| Gemini 2.0 Flash | Lite | $0.10 | $0.40 | 1M |
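The comparison table can also be treated as data. The sketch below is one possible way to pick the cheapest model in each tier for a given request shape; it looks at price only, ignores quality, caching, and batch discounts, and the names `Model`, `cost`, and `cheapest_per_tier` are hypothetical.

```python
from typing import Dict, NamedTuple


class Model(NamedTuple):
    name: str
    tier: str
    input_price: float   # USD per 1M input tokens
    output_price: float  # USD per 1M output tokens


# Rows copied from the comparison table above.
MODELS = [
    Model("Gemini 3.1 Pro", "Flagship", 2.00, 12.00),
    Model("Gemini 3.5 Flash", "Mid-tier", 1.50, 9.00),
    Model("Gemini 3.1 Flash-Lite", "Lite", 0.25, 1.50),
    Model("Gemini 2.5 Pro", "Flagship", 1.25, 10.00),
    Model("Gemini 2.5 Flash", "Mid-tier", 0.15, 0.60),
    Model("Gemini 2.0 Flash", "Lite", 0.10, 0.40),
]


def cost(model: Model, input_tokens: int, output_tokens: int) -> float:
    """USD cost of one request at the model's listed standard prices."""
    return (input_tokens * model.input_price + output_tokens * model.output_price) / 1_000_000


def cheapest_per_tier(input_tokens: int, output_tokens: int) -> Dict[str, Model]:
    """Return the lowest-cost model in each tier for the given token counts."""
    best: Dict[str, Model] = {}
    for model in MODELS:
        current = best.get(model.tier)
        if current is None or cost(model, input_tokens, output_tokens) < cost(
            current, input_tokens, output_tokens
        ):
            best[model.tier] = model
    return best


if __name__ == "__main__":
    for tier, model in cheapest_per_tier(10_000, 1_000).items():
        print(f"{tier:<9} {model.name:<22} ${cost(model, 10_000, 1_000):.4f}")
```

For a 10K-input, 1K-output request, the older 2.x models are the cheapest in every tier, so choosing a 3.x model is a decision about quality rather than price.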