OpenAI Models
Explore all 12 OpenAI models, with detailed pricing, pros and cons, and recommendations for developers.
Quick recommendations
GPT-5.5
Flagship: Agentic coding, complex workloads
When to use: Best for complex coding agents, computer use, and professional workloads where accuracy matters most.
Key improvements
- ◆SWE-bench Verified: 82.7% (+6.5pp vs GPT-5.4's 76.2%)
- ◆Terminal-Bench: 82.7% — new SOTA for agentic coding
- ◆1M context window retained, 33% fewer tokens vs GPT-5.4 at same latency
- ◆GeneBench scientific reasoning: 28.5% (+3.2pp over GPT-5.4)
- ◆Price: $5/$30 — 2x GPT-5.4, but cached input 90% cheaper at $0.50/M
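The effect of the cached-input discount on a bill can be sketched with simple arithmetic. The prices are the GPT-5.5 figures quoted in the list above ($5/M input, $0.50/M cached input, $30/M output). The function name, the token counts, and the assumption that cached and uncached input tokens are each billed at their own flat rate are illustrative, not taken from OpenAI's billing documentation.

```python
PER_MILLION = 1_000_000


def gpt55_request_cost(input_tokens, cached_tokens, output_tokens,
                       input_price=5.00, cached_price=0.50, output_price=30.00):
    """Estimate the USD cost of one request (prices are per 1M tokens).

    Assumption: cached input tokens are billed at cached_price and the
    remaining input tokens at input_price.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached_tokens = input_tokens - cached_tokens
    total = (uncached_tokens * input_price
             + cached_tokens * cached_price
             + output_tokens * output_price)
    return total / PER_MILLION


if __name__ == "__main__":
    # Hypothetical workload: 100K-token prompt, 2K-token answer.
    cold = gpt55_request_cost(100_000, 0, 2_000)
    warm = gpt55_request_cost(100_000, 80_000, 2_000)
    print(f"no cache hits:  ${cold:.2f}")  # $0.56
    print(f"80% cache hits: ${warm:.2f}")  # $0.20
```

With a large shared prefix, output tokens quickly become the dominant cost, so the 90% input discount matters most for long prompts with short answers.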
Pros
- State-of-the-art agentic coding (82.7% Terminal-Bench)
- 1M token context window
- Matches GPT-5.4 latency with fewer tokens
- Strong scientific research capability
Cons
- 2x more expensive than GPT-5.4
- No fine-tuning support yet
- Output costs $30/M tokens
GPT-5.5 Pro
Reasoning: Premium accuracy, parallel compute
When to use: Use when you need maximum accuracy for legal, medical, or scientific analysis and cost is secondary.
Key improvements
- ◆Parallel test-time compute: runs multiple reasoning paths, picks best
- ◆GeneBench (scientific reasoning): 33.2% — highest ever recorded
- ◆Dedicated GPU allocation for Pro/Business/Enterprise subscribers
- ◆Same 1M context + 128K output as GPT-5.5 base, but deeper reasoning
- ◆6x price premium ($30/$180) — justified only for mission-critical analysis
Pros
- Highest accuracy via parallel test-time compute
- 33.2% on GeneBench (scientific reasoning)
- Dedicated GPU allocation for Pro subscribers
Cons
- 6x more expensive than GPT-5.5 base
- No batch API support
- Only for Pro/Business/Enterprise tiers
GPT-5.4
Flagship: Coding, computer use, knowledge work
When to use: Excellent all-rounder for coding assistants, desktop automation, and enterprise knowledge work.
Key improvements
- ◆OSWorld: 75% — first AI to exceed human-level desktop automation
- ◆33% fewer factual errors vs GPT-5.2 (hallucination reduction)
- ◆Tool Search: dynamically discovers tools, cuts agent token usage by ~40%
- ◆Context: 128K → 1M tokens (8x increase from GPT-4o)
- ◆Max output: 16K → 128K tokens (8x increase), price $2.50/$15
Pros
- First AI to exceed human desktop performance (75% OSWorld)
- Unified coding + computer use + knowledge work
- 33% fewer factual errors than GPT-5.2
- Tool Search reduces token usage for agents
Cons
- More expensive than GPT-4.1
- No fine-tuning yet
- Input pricing doubles above 272K context
GPT-5.4 Mini
Mid-tier: Cost-effective coding & dev tasks
When to use: Best value for high-volume coding tasks, chatbots, and lighter development workloads.
Key improvements
- ◆SWE-bench Pro: 54.38% — only 3.3pp below GPT-5.4 Standard's 57.7%
- ◆About 3.3x cheaper than GPT-5.4 Standard ($0.75 vs $2.50/M input)
- ◆400K context window — sufficient for most coding tasks
- ◆Same 128K max output as flagship, 90% cached input savings
- ◆Released 12 days after Standard — fastest mini variant ever
Pros
- 54.38% SWE-bench Pro (close to Standard's 57.7%)
- About 3.3x cheaper than GPT-5.4 Standard ($0.75 vs $2.50/M input)
- 400K context window
- Free tier access available
Cons
- Lower reasoning quality than Standard
- Context window smaller than full 1M
- Released 12 days after Standard
GPT-5.4 Nano
Lite: Edge, embedded, mobile
When to use: Ideal for on-device assistants, mobile apps, and scenarios where network latency is a concern.
Key improvements
- ◆Designed for edge/mobile/IoT — optimized for on-device inference
- ◆128K max output despite 272K context — highest output/context ratio
- ◆Vision + function calling at $0.20/M input — previously flagship-only
- ◆Latency optimized for real-time mobile interactions
- ◆Context 272K vs 1M for Standard — trade-off for speed and cost
Pros
- Ultra-low cost for edge deployments
- Designed for mobile and IoT devices
- 128K max output despite small size
Cons
- Limited benchmarks available
- Smaller context window
- Quality gap for complex tasks
GPT-4o
Flagship: General purpose, multimodal
When to use: Best default choice for most apps requiring high-quality multimodal output.
Key improvements
- ◆First natively multimodal model: text + image + audio in one model
- ◆2x faster than GPT-4 Turbo at 50% lower cost ($2.50 vs $5/M input)
- ◆Native audio understanding — no separate Whisper pipeline needed
- ◆Fine-tuning support for domain adaptation (flagship exclusive)
- ◆128K context — superseded by GPT-4.1's 1M for long-document tasks
Pros
- Multimodal (text + image + audio)
- Excellent general-purpose quality
- Strong function calling & JSON mode
Cons
- Higher cost than mini/nano variants
- 128K context vs 1M+ for newer 4.1 models
Agents using this model: 3
GPT-4o Mini
Lite: Fast, cost-efficient tasks
When to use: Great for high-volume, cost-sensitive workloads like chatbots and summarization.
Key improvements
- ◆Replaced GPT-3.5 Turbo: 46% cheaper, significantly better quality
- ◆Same 128K context as GPT-4o at 1/17th the price
- ◆Scored 82% on MMLU — approaching GPT-4 level on knowledge tasks
- ◆Vision + function calling at $0.15/M — previously flagship-only features
- ◆Fine-tuning available — best cost/quality ratio for tuned models
Pros
- Extremely affordable
- Same 128K context as GPT-4o
- Supports vision + function calling
Cons
- Lower reasoning quality than flagship
- Not ideal for complex multi-step tasks
GPT-4.1
Flagship: Coding, instruction following
When to use: Top pick for coding assistants and long-document analysis.
Key improvements
- ◆Context: 128K → 1M tokens (8x jump from GPT-4o)
- ◆SWE-bench Verified: 54.6% — best coding score at launch
- ◆Max output: 32K tokens (2x GPT-4o's 16K)
- ◆Instruction following: 40% better on IFEval vs GPT-4o
- ◆Cheaper than GPT-4o ($2 vs $2.50/M input) with more capability
Pros
- 1M token context window
- Best-in-class coding ability
- Cheaper than GPT-4o with better context
Cons
- Slower than mini/nano for simple tasks
- Newer model, less battle-tested
Agents using this model: 81
GPT-4.1 Mini
Mid-tier: Balanced performance & cost
When to use: Sweet spot for production apps needing long context without flagship costs.
Key improvements
- ◆1M context at $0.40/M input — 5x cheaper than GPT-4.1 flagship
- ◆Same 1M context + 32K output as flagship for production use
- ◆Full multimodal support (vision, function calling, fine-tuning)
- ◆SWE-bench: 42.8% — strong coding at mid-tier pricing
- ◆Replaced GPT-4o Mini for long-context production workloads
Pros
- 1M context at mid-tier price
- Strong coding for the price
- Full multimodal support
Cons
- Quality gap vs flagship for complex reasoning
- Still more expensive than nano
GPT-4.1 Nano
Lite: Ultra-low latency, edge devices
When to use: Ideal for real-time classification, extraction, and high-throughput pipelines.
Key improvements
- ◆Cheapest model with 1M context: $0.10/M input (GPT-3.5 Turbo was $0.50)
- ◆Full feature set at lite tier: vision + function calling + fine-tuning
- ◆Context: 128K → 1M (8x increase over GPT-4o Mini's 128K)
- ◆Fastest response times in OpenAI lineup — sub-200ms typical latency
- ◆Cached input at $0.025/M: 75% savings versus the $0.10/M standard input rate for repeated prefixes
Pros
- Cheapest OpenAI model with 1M context
- Fastest response times
- Full feature set (vision, FC, fine-tuning)
Cons
- Noticeable quality drop for complex tasks
- Not suitable for deep reasoning
o3
Reasoning: Complex reasoning, math, science
When to use: Use for problems requiring step-by-step reasoning, math proofs, and scientific analysis.
Key improvements
- ◆ARC-AGI: 87.5% — near-human on novel abstract reasoning tasks
- ◆Max output: 100K tokens (4x o1's 25K) for deep chain-of-thought
- ◆Vision + function calling — new capabilities vs o1's text-only
- ◆Math Olympiad: Gold medal level on AIME 2024 (93.3%)
- ◆Price: $2/$8 — 1/3 the cost of o1 Pro at $200/M
Pros
- State-of-the-art reasoning ability
- 100K max output for long chain-of-thought
- Excellent at math and science
Cons
- Slower due to thinking time
- No fine-tuning support
- Higher cost for reasoning tokens
Agents using this model: 1
o4-mini
Reasoning: Reasoning at lower cost
When to use: Best value for reasoning workloads; a good fit for production where o3 is overkill.
Key improvements
- ◆45% cheaper than o3 ($1.10 vs $2/M input) for reasoning tasks
- ◆Same 100K max output as o3 — no compromise on reasoning depth
- ◆Faster inference than o3 with optimized thinking budget
- ◆Vision + function calling at reasoning tier — previously o3-only
- ◆Cached input: $0.275/M — 75% savings for repeated prompts
Pros
- 45% cheaper than o3 for reasoning ($1.10 vs $2/M input)
- Same 100K max output
- Faster than o3
Cons
- Less capable than o3 on hardest problems
- No fine-tuning
- Reasoning tokens add hidden cost
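The "hidden cost" of reasoning tokens can be made concrete with a rough estimate. The sketch below assumes that reasoning tokens are billed at the output rate even though they never appear in the response, and uses the o4-mini prices from this page ($1.10/M input, $4.40/M output). The function name and the token counts are hypothetical.

```python
def o4_mini_cost(input_tokens, visible_output_tokens, reasoning_tokens,
                 input_price=1.10, output_price=4.40):
    """Estimate the USD cost of one o4-mini request (prices per 1M tokens).

    Assumption: reasoning tokens are charged as output tokens.
    """
    billed_output = visible_output_tokens + reasoning_tokens
    total = input_tokens * input_price + billed_output * output_price
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 2K prompt, 500 visible tokens, 4K reasoning tokens.
    naive = o4_mini_cost(2_000, 500, 0)
    actual = o4_mini_cost(2_000, 500, 4_000)
    print(f"ignoring reasoning tokens: ${naive:.4f}")   # $0.0044
    print(f"including reasoning tokens: ${actual:.4f}")  # $0.0220
```

In this example the reasoning tokens make the request five times more expensive than the visible output alone would suggest, which is why budgets for reasoning models are better based on measured usage than on response length.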
Agents using this model: 1
Side-by-side comparison
| Model | Tier | Input ($/1M tokens) | Output ($/1M tokens) | Context (tokens) |
|---|---|---|---|---|
| GPT-5.5 | Flagship | $5.00 | $30.00 | 1M |
| GPT-5.5 Pro | Reasoning | $30.00 | $180.00 | 1M |
| GPT-5.4 | Flagship | $2.50 | $15.00 | 1M |
| GPT-5.4 Mini | Mid-tier | $0.75 | $4.50 | 400K |
| GPT-5.4 Nano | Lite | $0.20 | $1.25 | 272K |
| GPT-4o | Flagship | $2.50 | $10.00 | 128K |
| GPT-4o Mini | Lite | $0.15 | $0.60 | 128K |
| GPT-4.1 | Flagship | $2.00 | $8.00 | 1M |
| GPT-4.1 Mini | Mid-tier | $0.40 | $1.60 | 1M |
| GPT-4.1 Nano | Lite | $0.10 | $0.40 | 1M |
| o3 | Reasoning | $2.00 | $8.00 | 200K |
| o4-mini | Reasoning | $1.10 | $4.40 | 200K |
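As a rough way to use the table, the sketch below ranks models by estimated cost for a given workload and drops those whose context window is too small. The prices and context sizes are copied from the table. The `MODELS` dictionary, the `cheapest_models` helper, and the simplification that input plus output must fit inside the context window are assumptions made for illustration. It ignores cached-input discounts, maximum output limits, and the GPT-5.4 long-context surcharge, and it says nothing about quality.

```python
# name: (input $/1M tokens, output $/1M tokens, context window in tokens)
MODELS = {
    "GPT-5.5": (5.00, 30.00, 1_000_000),
    "GPT-5.5 Pro": (30.00, 180.00, 1_000_000),
    "GPT-5.4": (2.50, 15.00, 1_000_000),
    "GPT-5.4 Mini": (0.75, 4.50, 400_000),
    "GPT-5.4 Nano": (0.20, 1.25, 272_000),
    "GPT-4o": (2.50, 10.00, 128_000),
    "GPT-4o Mini": (0.15, 0.60, 128_000),
    "GPT-4.1": (2.00, 8.00, 1_000_000),
    "GPT-4.1 Mini": (0.40, 1.60, 1_000_000),
    "GPT-4.1 Nano": (0.10, 0.40, 1_000_000),
    "o3": (2.00, 8.00, 200_000),
    "o4-mini": (1.10, 4.40, 200_000),
}


def cheapest_models(input_tokens, output_tokens, top=3):
    """Return the `top` cheapest (name, cost_usd) pairs that fit the workload."""
    candidates = []
    for name, (input_price, output_price, context) in MODELS.items():
        # Simplifying assumption: prompt and response share the context window.
        if input_tokens + output_tokens > context:
            continue
        cost = (input_tokens * input_price
                + output_tokens * output_price) / 1_000_000
        candidates.append((cost, name))
    return [(name, round(cost, 4)) for cost, name in sorted(candidates)[:top]]


if __name__ == "__main__":
    # Hypothetical long-document job: 300K-token prompt, 2K-token answer.
    for name, cost in cheapest_models(300_000, 2_000):
        print(f"{name}: ${cost}")
```

For the 300K-token example, only the models with a 400K or 1M context window remain, and GPT-4.1 Nano comes out cheapest on price alone.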