OpenAI Models

Explore all 12 models from OpenAI with detailed pricing, pros & cons, and developer recommendations.

Models: 12
Lowest Input: $0.100 per 1M tokens
Max Context: 1M
Quality Tiers: 4

Quick Recommendations

Best Value: GPT-4.1 Nano ($0.100/1M)
Best Quality: GPT-5.5
Best for Reasoning: GPT-5.5 Pro

GPT-5.5

Flagship

Agentic coding, complex workloads

Official Pricing

When to use: Best for complex coding agents, computer use, and professional workloads where accuracy matters most.

Upgrade Highlights

  • SWE-bench Verified: 82.7% (+6.5pp vs GPT-5.4's 76.2%)
  • Terminal-Bench: 82.7% — new SOTA for agentic coding
  • 1M context window retained, 33% fewer tokens vs GPT-5.4 at same latency
  • GeneBench scientific reasoning: 28.5% (+3.2pp over GPT-5.4)
  • Price: $5/$30 — 2x GPT-5.4, but cached input 90% cheaper at $0.50/M

Input Price: $5.00 per 1M tokens
Output Price: $30.00 per 1M tokens
Cached Input: $0.500 per 1M tokens
Batch Input: $2.50 per 1M tokens
Context Window: 1M
Max Output: 128,000 tokens
Knowledge Cutoff: 2025-06
Vision, Function Calling, Fine-tuning, JSON Mode, Free Tier

Pros

  • State-of-the-art agentic coding (82.7% Terminal-Bench)
  • 1M token context window
  • Matches GPT-5.4 latency with fewer tokens
  • Strong scientific research capability

Cons

  • 2x more expensive than GPT-5.4
  • No fine-tuning support yet
  • Output costs $30/M tokens
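
To see how the GPT-5.5 prices listed above combine into a per-request cost, here is a minimal sketch. It assumes cached input tokens are billed at the cached rate instead of the full input rate; the function name and the token counts are hypothetical and only serve the illustration.

```python
# Prices in USD per 1M tokens, copied from the GPT-5.5 listing above.
INPUT_PRICE = 5.00
CACHED_INPUT_PRICE = 0.50
OUTPUT_PRICE = 30.00


def request_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one request.

    Assumption: cached_tokens is the part of input_tokens served from a
    prompt cache and billed at the cached rate instead of the full rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached = input_tokens - cached_tokens
    total = (
        uncached * INPUT_PRICE
        + cached_tokens * CACHED_INPUT_PRICE
        + output_tokens * OUTPUT_PRICE
    )
    return total / 1_000_000


# Hypothetical request: 100K input tokens (80K of them cached), 5K output tokens.
cold = request_cost(100_000, 5_000)
warm = request_cost(100_000, 5_000, cached_tokens=80_000)
print(f"no cache: ${cold:.2f}, 80% cached: ${warm:.2f}")
# prints "no cache: $0.65, 80% cached: $0.29"
```

With these example numbers, output tokens account for $0.15 of either total, so caching only reduces the input share of the bill.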

Performance

Output Speed: ~85 tok/s
Rate Limit: 5,000 RPM

Multimodal

Image Input, Image Output, Audio Input, Audio Output

Benchmarks

MMLU: 90.0%
SWE-bench Verified: 82.7%
Terminal-Bench: 82.7%
GeneBench: 28.5%

GPT-5.5 Pro

Reasoning

Premium accuracy, parallel compute

Official Pricing

When to use: Use when you need maximum accuracy for legal, medical, or scientific analysis and cost is secondary.

Upgrade Highlights

  • Parallel test-time compute: runs multiple reasoning paths, picks best
  • GeneBench (scientific reasoning): 33.2% — highest ever recorded
  • Dedicated GPU allocation for Pro/Business/Enterprise subscribers
  • Same 1M context + 128K output as GPT-5.5 base, but deeper reasoning
  • 6x price premium ($30/$180) — justified only for mission-critical analysis

Input Price: $30.00 per 1M tokens
Output Price: $180.00 per 1M tokens
Cached Input: not listed
Batch Input: not listed
Context Window: 1M
Max Output: 128,000 tokens
Knowledge Cutoff: 2025-06
Vision, Function Calling, Fine-tuning, JSON Mode

Pros

  • Highest accuracy via parallel test-time compute
  • 33.2% on GeneBench (scientific reasoning)
  • Dedicated GPU allocation for Pro subscribers

Cons

  • 6x more expensive than GPT-5.5 base
  • No batch API support
  • Only for Pro/Business/Enterprise tiers
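
The listing describes parallel test-time compute as running several reasoning paths and picking the best. How GPT-5.5 Pro does this internally is not documented here, so the following is only a generic best-of-n sketch of that idea. The generator and scorer are toy stand-ins, and every name in the block is hypothetical.

```python
import random
from typing import Callable


def best_of_n(
    generate: Callable[[random.Random], str],
    score: Callable[[str], float],
    n: int,
    seed: int = 0,
) -> str:
    """Sample n candidate answers and return the highest-scoring one."""
    if n < 1:
        raise ValueError("n must be at least 1")
    rng = random.Random(seed)
    candidates = [generate(rng) for _ in range(n)]
    return max(candidates, key=score)


# Toy stand-ins: each "reasoning path" guesses the square root of 144,
# and the scorer rewards guesses whose square is closest to 144.
def toy_generate(rng: random.Random) -> str:
    return str(rng.randint(10, 14))


def toy_score(answer: str) -> float:
    return -abs(int(answer) ** 2 - 144)


print(best_of_n(toy_generate, toy_score, n=8))
```

The trade-off matches the card above: generating n candidates costs roughly n times the tokens of a single pass, which is one plausible reason for the price premium.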

Performance

Output Speed: ~40 tok/s
Rate Limit: 3,000 RPM

Multimodal

Image Input, Image Output, Audio Input, Audio Output

Benchmarks

MMLU: 90.5%
GeneBench: 33.2%
SWE-bench Verified: 79.0%

GPT-5.4

Flagship

Coding, computer use, knowledge work

Official Pricing

When to use: Excellent all-rounder for coding assistants, desktop automation, and enterprise knowledge work.

Upgrade Highlights

  • OSWorld: 75% — first AI to exceed human-level desktop automation
  • 33% fewer factual errors vs GPT-5.2 (hallucination reduction)
  • Tool Search: dynamically discovers tools, cuts agent token usage by ~40%
  • Context: 128K → 1M tokens (8x increase from GPT-4o)
  • Max output: 16K → 128K tokens (8x increase), price $2.50/$15

Input Price: $2.50 per 1M tokens
Output Price: $15.00 per 1M tokens
Cached Input: $0.250 per 1M tokens
Batch Input: $1.25 per 1M tokens
Context Window: 1M
Max Output: 128,000 tokens
Knowledge Cutoff: 2025-03
Vision, Function Calling, Fine-tuning, JSON Mode, Free Tier

Pros

  • First AI to exceed human desktop performance (75% OSWorld)
  • Unified coding + computer use + knowledge work
  • 33% fewer factual errors than GPT-5.2
  • Tool Search reduces token usage for agents

Cons

  • More expensive than GPT-4.1
  • No fine-tuning yet
  • Input pricing doubles above 272K context
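
The last con above, input pricing that doubles past 272K tokens of context, is easy to overlook when budgeting long prompts. The sketch below assumes the doubled rate applies to the entire input once a prompt exceeds the threshold. The listing does not say whether only the excess tokens are affected, so treat that rule and the function name as assumptions.

```python
BASE_INPUT_PRICE = 2.50           # USD per 1M input tokens, from the listing
LONG_CONTEXT_THRESHOLD = 272_000  # tokens, from the listing
LONG_CONTEXT_MULTIPLIER = 2       # "doubles", per the listing


def input_cost(input_tokens: int) -> float:
    """Estimate the input cost in USD for one GPT-5.4 request.

    Assumption: once the prompt exceeds the threshold, every input token
    in that request is billed at the doubled rate.
    """
    rate = BASE_INPUT_PRICE
    if input_tokens > LONG_CONTEXT_THRESHOLD:
        rate *= LONG_CONTEXT_MULTIPLIER
    return input_tokens * rate / 1_000_000


for tokens in (200_000, 272_000, 300_000):
    print(f"{tokens:>7} tokens: ${input_cost(tokens):.2f}")
# prints $0.50, $0.68 and $1.50 respectively
```

Under this assumption, going from 272K to 300K input tokens more than doubles the input cost, so trimming a prompt to stay under the threshold can pay off.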

Performance

Output Speed: ~90 tok/s
Rate Limit: 5,000 RPM

Multimodal

Image Input, Image Output, Audio Input, Audio Output

Benchmarks

MMLU: 89.5%
SWE-bench Verified: 76.2%
OSWorld: 75.0%
IFEval: 87.0%

GPT-5.4 Mini

Mid-tier

Cost-effective coding & dev tasks

Official Pricing

When to use: Best value for high-volume coding tasks, chatbots, and lighter development workloads.

Upgrade Highlights

  • SWE-bench Pro: 54.38% — only 3.3pp below GPT-5.4 Standard's 57.7%
  • About 3.3x cheaper than GPT-5.4 Standard ($0.75 vs $2.50/M input)
  • 400K context window — sufficient for most coding tasks
  • Same 128K max output as flagship, 90% cached input savings
  • Released 12 days after Standard — fastest mini variant ever

Input Price: $0.750 per 1M tokens
Output Price: $4.50 per 1M tokens
Cached Input: $0.075 per 1M tokens
Batch Input: $0.375 per 1M tokens
Context Window: 400K
Max Output: 128,000 tokens
Knowledge Cutoff: 2025-03
Vision, Function Calling, Fine-tuning, JSON Mode, Free Tier

Pros

  • 54.38% SWE-bench Pro (close to Standard's 57.7%)
  • About 3.3x cheaper than GPT-5.4 Standard
  • 400K context window
  • Free tier access available

Cons

  • Lower reasoning quality than Standard
  • Context window smaller than full 1M
  • Released 12 days after Standard

Performance

Output Speed: ~120 tok/s
Rate Limit: 10,000 RPM

Multimodal

Image Input, Image Output, Audio Input, Audio Output

Benchmarks

SWE-bench Pro: 54.38%
MMLU: 86.0%
HumanEval: 88.2%

GPT-5.4 Nano

Lite

Edge, embedded, mobile

Official Pricing

When to use: Ideal for on-device assistants, mobile apps, and scenarios where network latency is a concern.

Upgrade Highlights

  • Designed for edge/mobile/IoT — optimized for on-device inference
  • 128K max output on a 272K context (output/context ratio of about 0.47)
  • Vision + function calling at $0.20/M input — previously flagship-only
  • Latency optimized for real-time mobile interactions
  • Context 272K vs 1M for Standard — trade-off for speed and cost

Input Price: $0.200 per 1M tokens
Output Price: $1.25 per 1M tokens
Cached Input: not listed
Batch Input: not listed
Context Window: 272K
Max Output: 128,000 tokens
Knowledge Cutoff: 2025-03
Vision, Function Calling, Fine-tuning, JSON Mode, Free Tier

Pros

  • Ultra-low cost for edge deployments
  • Designed for mobile and IoT devices
  • 128K max output despite small size

Cons

  • Limited benchmarks available
  • Smaller context window
  • Quality gap for complex tasks

Performance

Output Speed: ~150 tok/s
Rate Limit: 15,000 RPM

Multimodal

Image Input, Image Output, Audio Input, Audio Output

Benchmarks

MMLU: 78.5%
HumanEval: 76.0%

GPT-4o

Flagship

General purpose, multimodal

Official Pricing

When to use: Best default choice for most apps requiring high-quality multimodal output.

Upgrade Highlights

  • First natively multimodal model: text + image + audio in one model
  • 2x faster than GPT-4 Turbo at 50% lower cost at launch ($5 vs $10/M input; now $2.50/M)
  • Native audio understanding — no separate Whisper pipeline needed
  • Fine-tuning support for domain adaptation
  • 128K context — superseded by GPT-4.1's 1M for long-document tasks

Input Price: $2.50 per 1M tokens
Output Price: $10.00 per 1M tokens
Cached Input: $1.25 per 1M tokens
Batch Input: $1.25 per 1M tokens
Context Window: 128K
Max Output: 16,384 tokens
Knowledge Cutoff: 2023-10
Vision, Function Calling, Fine-tuning, JSON Mode, Free Tier

Pros

  • Multimodal (text + image + audio)
  • Excellent general-purpose quality
  • Strong function calling & JSON mode

Cons

  • Higher cost than mini/nano variants
  • 128K context vs 1M+ for newer 4.1 models

Performance

Output Speed: ~75 tok/s
Rate Limit: 5,000 RPM

Multimodal

Image Input, Image Output, Audio Input, Audio Output

Benchmarks

MMLU: 88.7%
HumanEval: 90.2%
MATH: 76.6%
IFEval: 80.4%

Agents Using This Model: 3

GPT-4o Mini

Lite

Fast, cost-efficient tasks

Official Pricing

When to use: Great for high-volume, cost-sensitive workloads like chatbots and summarization.

Upgrade Highlights

  • Replaced GPT-3.5 Turbo: 70% cheaper on input ($0.15 vs $0.50/M), significantly better quality
  • Same 128K context as GPT-4o at 1/17th the price
  • Scored 82% on MMLU — approaching GPT-4 level on knowledge tasks
  • Vision + function calling at $0.15/M — previously flagship-only features
  • Fine-tuning available — best cost/quality ratio for tuned models

Input Price: $0.150 per 1M tokens
Output Price: $0.600 per 1M tokens
Cached Input: $0.075 per 1M tokens
Batch Input: $0.075 per 1M tokens
Context Window: 128K
Max Output: 16,384 tokens
Knowledge Cutoff: 2023-10
Vision, Function Calling, Fine-tuning, JSON Mode, Free Tier

Pros

  • Extremely affordable
  • Same 128K context as GPT-4o
  • Supports vision + function calling

Cons

  • Lower reasoning quality than flagship
  • Not ideal for complex multi-step tasks

Performance

Output Speed: ~130 tok/s
Rate Limit: 10,000 RPM

Multimodal

Image Input, Image Output, Audio Input, Audio Output

Benchmarks

MMLU: 82.0%
HumanEval: 87.2%
MATH: 67.5%

GPT-4.1

Flagship

Coding, instruction following

Official Pricing

When to use: Top pick for coding assistants and long-document analysis.

Upgrade Highlights

  • Context: 128K → 1M tokens (8x jump from GPT-4o)
  • SWE-bench Verified: 54.6% — best coding score at launch
  • Max output: 32K tokens (2x GPT-4o's 16K)
  • Instruction following: 84.0% on IFEval vs GPT-4o's 80.4% (+3.6pp)
  • Cheaper than GPT-4o ($2 vs $2.50/M input) with more capability

Input Price: $2.00 per 1M tokens
Output Price: $8.00 per 1M tokens
Cached Input: $0.500 per 1M tokens
Batch Input: $1.00 per 1M tokens
Context Window: 1M
Max Output: 32,768 tokens
Knowledge Cutoff: 2024-04
Vision, Function Calling, Fine-tuning, JSON Mode, Free Tier

Pros

  • 1M token context window
  • Best-in-class coding ability
  • Cheaper than GPT-4o with better context

Cons

  • Slower than mini/nano for simple tasks
  • Newer model, less battle-tested
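
For long-document work, the 1M context and 32,768-token output limit listed for GPT-4.1 together bound how large a prompt can be. The sketch below assumes prompt and output tokens share one context window and treats "1M" as a round 1,000,000. Both are assumptions, and the helper names are hypothetical.

```python
CONTEXT_WINDOW = 1_000_000  # "1M" from the listing, taken as a round figure
MAX_OUTPUT = 32_768         # max output tokens from the listing


def max_prompt_tokens(reserved_output: int = MAX_OUTPUT) -> int:
    """Largest prompt that still leaves room for reserved_output tokens.

    Assumption: prompt and output tokens share a single context window.
    """
    if not 0 <= reserved_output <= MAX_OUTPUT:
        raise ValueError("reserved_output must be between 0 and MAX_OUTPUT")
    return CONTEXT_WINDOW - reserved_output


def fits(prompt_tokens: int, reserved_output: int = MAX_OUTPUT) -> bool:
    """Return True if the prompt fits alongside the reserved output budget."""
    return prompt_tokens <= max_prompt_tokens(reserved_output)


print(max_prompt_tokens())                      # 967232
print(fits(980_000))                            # False
print(fits(980_000, reserved_output=8_000))     # True
```

Reserving a smaller output budget frees space for the document, which can matter when the task is a short summary or extraction rather than long-form generation.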

Performance

Output Speed: ~70 tok/s
Rate Limit: 5,000 RPM

Multimodal

Image Input, Image Output, Audio Input, Audio Output

Benchmarks

MMLU: 88.3%
SWE-bench Verified: 54.6%
IFEval: 84.0%
HumanEval: 92.0%

GPT-4.1 Mini

Mid-tier

Balanced performance & cost

Official Pricing

When to use: Sweet spot for production apps needing long context without flagship costs.

Upgrade Highlights

  • 1M context at $0.40/M input — 5x cheaper than GPT-4.1 flagship
  • Same 1M context + 32K output as flagship for production use
  • Full feature set (vision, function calling, fine-tuning)
  • SWE-bench: 42.8% — strong coding at mid-tier pricing
  • Replaced GPT-4o Mini for long-context production workloads

Input Price: $0.400 per 1M tokens
Output Price: $1.60 per 1M tokens
Cached Input: $0.100 per 1M tokens
Batch Input: $0.200 per 1M tokens
Context Window: 1M
Max Output: 32,768 tokens
Knowledge Cutoff: 2024-04
Vision, Function Calling, Fine-tuning, JSON Mode, Free Tier

Pros

  • 1M context at mid-tier price
  • Strong coding for the price
  • Full feature set (vision, function calling, fine-tuning)

Cons

  • Quality gap vs flagship for complex reasoning
  • Still more expensive than nano

Performance

Output Speed: ~110 tok/s
Rate Limit: 10,000 RPM

Multimodal

Image Input, Image Output, Audio Input, Audio Output

Benchmarks

MMLU: 84.5%
SWE-bench Verified: 42.8%
HumanEval: 86.5%

GPT-4.1 Nano

Lite

Ultra-low latency, edge devices

Official Pricing

When to use: Ideal for real-time classification, extraction, and high-throughput pipelines.

Upgrade Highlights

  • Cheapest model with 1M context: $0.10/M input (GPT-3.5 Turbo was $0.50)
  • Full feature set at lite tier: vision + function calling + fine-tuning
  • Context: 128K → 1M (8x increase over GPT-4o Mini's 128K)
  • Fastest response times in OpenAI lineup — sub-200ms typical latency
  • Cached input at $0.025/M: 75% savings for repeated prefixes

Input Price: $0.100 per 1M tokens
Output Price: $0.400 per 1M tokens
Cached Input: $0.025 per 1M tokens
Batch Input: $0.050 per 1M tokens
Context Window: 1M
Max Output: 32,768 tokens
Knowledge Cutoff: 2024-04
Vision, Function Calling, Fine-tuning, JSON Mode, Free Tier

Pros

  • Cheapest OpenAI model with 1M context
  • Fastest response times
  • Full feature set (vision, FC, fine-tuning)

Cons

  • Noticeable quality drop for complex tasks
  • Not suitable for deep reasoning

Performance

Output Speed: ~160 tok/s
Rate Limit: 15,000 RPM

Multimodal

Image Input, Image Output, Audio Input, Audio Output

Benchmarks

MMLU: 76.0%
HumanEval: 72.8%

o3

Reasoning

Complex reasoning, math, science

Official Pricing

When to use: Use for problems requiring step-by-step reasoning, math proofs, and scientific analysis.

Upgrade Highlights

  • ARC-AGI: 87.5% — near-human on novel abstract reasoning tasks
  • Max output: 100K tokens (4x o1's 25K) for deep chain-of-thought
  • Vision + function calling — new capabilities vs o1's text-only
  • Math Olympiad: Gold medal level on AIME 2024 (93.3%)
  • Price: $2/$8 per 1M input/output tokens, well below o1 Pro pricing

Input Price: $2.00 per 1M tokens
Output Price: $8.00 per 1M tokens
Cached Input: $0.500 per 1M tokens
Batch Input: $1.00 per 1M tokens
Context Window: 200K
Max Output: 100,000 tokens
Knowledge Cutoff: 2024-05
Vision, Function Calling, Fine-tuning, JSON Mode, Free Tier

Pros

  • State-of-the-art reasoning ability
  • 100K max output for long chain-of-thought
  • Excellent at math and science

Cons

  • Slower due to thinking time
  • No fine-tuning support
  • Higher cost for reasoning tokens
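
To put "slower due to thinking time" in concrete terms, the approximate output speed listed for o3 (~25 tok/s) can be turned into a rough generation time. This sketch assumes a steady token rate and ignores time to first token and network overhead; the function name is hypothetical.

```python
OUTPUT_SPEED_TOK_PER_S = 25  # "~25 tok/s" from the listing (approximate)


def generation_seconds(output_tokens: int, tok_per_s: float = OUTPUT_SPEED_TOK_PER_S) -> float:
    """Rough time in seconds to generate output_tokens at a steady rate."""
    if tok_per_s <= 0:
        raise ValueError("tok_per_s must be positive")
    return output_tokens / tok_per_s


for tokens in (1_000, 10_000, 100_000):
    secs = generation_seconds(tokens)
    print(f"{tokens:>7} tokens: {secs:,.0f} s ({secs / 60:.1f} min)")
# 1,000 tokens take about 40 s; the full 100,000-token output limit
# would take about 4,000 s, a little over an hour.
```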

Performance

Output Speed: ~25 tok/s
Rate Limit: 3,000 RPM

Multimodal

Image Input, Image Output, Audio Input, Audio Output

Benchmarks

ARC-AGI: 87.5%
AIME 2024: 93.3%
MATH: 96.7%
GPQA: 83.3%

Agents Using This Model: 1

o4-mini

Reasoning

Reasoning at lower cost

Official Pricing

When to use: Best value for reasoning workloads — great for production where o3 is overkill.

Upgrade Highlights

  • 45% cheaper than o3 ($1.10 vs $2/M input) for reasoning tasks
  • Same 100K max output as o3 — no compromise on reasoning depth
  • Faster inference than o3 with optimized thinking budget
  • Vision + function calling at reasoning tier — previously o3-only
  • Cached input: $0.275/M — 75% savings for repeated prompts

Input Price: $1.10 per 1M tokens
Output Price: $4.40 per 1M tokens
Cached Input: $0.275 per 1M tokens
Batch Input: $0.550 per 1M tokens
Context Window: 200K
Max Output: 100,000 tokens
Knowledge Cutoff: 2024-05
Vision, Function Calling, Fine-tuning, JSON Mode, Free Tier

Pros

  • 45% cheaper than o3 for reasoning
  • Same 100K max output
  • Faster than o3

Cons

  • Less capable than o3 on hardest problems
  • No fine-tuning
  • Reasoning tokens add hidden cost
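
The "hidden cost" of reasoning tokens can be made concrete with the o4-mini prices listed above. The sketch assumes reasoning tokens are billed at the output rate even though they do not appear in the visible answer; the function name and token counts are hypothetical.

```python
INPUT_PRICE = 1.10   # USD per 1M input tokens, from the listing
OUTPUT_PRICE = 4.40  # USD per 1M output tokens, from the listing


def cost_with_reasoning(input_tokens: int, visible_output_tokens: int, reasoning_tokens: int) -> float:
    """Estimate request cost in USD when reasoning tokens are billed as output.

    Assumption: reasoning tokens are charged at the same rate as visible
    output tokens.
    """
    billed_output = visible_output_tokens + reasoning_tokens
    return (input_tokens * INPUT_PRICE + billed_output * OUTPUT_PRICE) / 1_000_000


# Hypothetical request: 2,000 input tokens, a 500-token visible answer,
# and 4,000 reasoning tokens spent before answering.
without = cost_with_reasoning(2_000, 500, 0)
with_reasoning = cost_with_reasoning(2_000, 500, 4_000)
print(f"visible only: ${without:.4f}, with reasoning: ${with_reasoning:.4f}")
# prints "visible only: $0.0044, with reasoning: $0.0220"
```

In this example the reasoning tokens raise the cost of the request fivefold, so budgets based only on visible answer length will underestimate spend.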

Performance

Output Speed: ~35 tok/s
Rate Limit: 5,000 RPM

Multimodal

Image Input, Image Output, Audio Input, Audio Output

Benchmarks

AIME 2024: 90.0%
MATH: 92.5%
GPQA: 78.1%

Agents Using This Model: 1

Side-by-Side Comparison

Model          Tier        Input ($/1M)   Output ($/1M)   Context
GPT-5.5        Flagship    $5.00          $30.00          1M
GPT-5.5 Pro    Reasoning   $30.00         $180.00         1M
GPT-5.4        Flagship    $2.50          $15.00          1M
GPT-5.4 Mini   Mid-tier    $0.750         $4.50           400K
GPT-5.4 Nano   Lite        $0.200         $1.25           272K
GPT-4o         Flagship    $2.50          $10.00          128K
GPT-4o Mini    Lite        $0.150         $0.600          128K
GPT-4.1        Flagship    $2.00          $8.00           1M
GPT-4.1 Mini   Mid-tier    $0.400         $1.60           1M
GPT-4.1 Nano   Lite        $0.100         $0.400          1M
o3             Reasoning   $2.00          $8.00           200K
o4-mini        Reasoning   $1.10          $4.40           200K
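
The comparison table can also be queried programmatically. The sketch below hard-codes the rows above and picks the cheapest model that meets a context requirement for a given token mix. The `Model` type and `cheapest_model` function are hypothetical, and context sizes such as "1M" are taken as round numbers.

```python
from typing import NamedTuple, Optional


class Model(NamedTuple):
    name: str
    tier: str
    input_price: float   # USD per 1M input tokens
    output_price: float  # USD per 1M output tokens
    context_tokens: int  # context window, rounded as in the table


# Rows copied from the comparison table above.
MODELS = [
    Model("GPT-5.5", "Flagship", 5.00, 30.00, 1_000_000),
    Model("GPT-5.5 Pro", "Reasoning", 30.00, 180.00, 1_000_000),
    Model("GPT-5.4", "Flagship", 2.50, 15.00, 1_000_000),
    Model("GPT-5.4 Mini", "Mid-tier", 0.75, 4.50, 400_000),
    Model("GPT-5.4 Nano", "Lite", 0.20, 1.25, 272_000),
    Model("GPT-4o", "Flagship", 2.50, 10.00, 128_000),
    Model("GPT-4o Mini", "Lite", 0.15, 0.60, 128_000),
    Model("GPT-4.1", "Flagship", 2.00, 8.00, 1_000_000),
    Model("GPT-4.1 Mini", "Mid-tier", 0.40, 1.60, 1_000_000),
    Model("GPT-4.1 Nano", "Lite", 0.10, 0.40, 1_000_000),
    Model("o3", "Reasoning", 2.00, 8.00, 200_000),
    Model("o4-mini", "Reasoning", 1.10, 4.40, 200_000),
]


def cheapest_model(
    min_context: int,
    input_tokens: int,
    output_tokens: int,
    tier: Optional[str] = None,
) -> Optional[Model]:
    """Return the lowest-cost model with enough context, or None if none fit."""
    candidates = [
        m for m in MODELS
        if m.context_tokens >= min_context and (tier is None or m.tier == tier)
    ]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda m: input_tokens * m.input_price + output_tokens * m.output_price,
    )


print(cheapest_model(500_000, 400_000, 2_000).name)                    # GPT-4.1 Nano
print(cheapest_model(100_000, 50_000, 5_000, tier="Reasoning").name)   # o4-mini
```

Price and context are only two of the criteria on this page. Quality tier, benchmarks, and rate limits would need to be weighed separately.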