OpenAI Models

Explore all 12 models from OpenAI with detailed pricing, pros & cons, and developer recommendations.

Models: 12

Lowest Input: $0.100 per 1M tokens

Max Context: 1.0M tokens

Quality Tiers: Flagship, Reasoning, Mid-tier, Lite

Quick Recommendations

Best Value: GPT-4.1 Nano ($0.100/1M)

Best Quality: GPT-5.5

Best for Reasoning: GPT-5.5 Pro

GPT-5.5

Flagship

Agentic coding, complex workloads

Official Pricing

When to use: Best for complex coding agents, computer use, and professional workloads where accuracy matters most.

Upgrade Highlights

◆SWE-bench Verified: 82.7% (+6.5pp vs GPT-5.4's 76.2%)
◆Terminal-Bench: 82.7% — new SOTA for agentic coding
◆1M context window retained, 33% fewer tokens vs GPT-5.4 at same latency
◆GeneBench scientific reasoning: 28.5% (+3.2pp over GPT-5.4)
◆Price: $5/$30 per 1M input/output tokens (2x GPT-5.4); cached input is $0.50/M, 90% below the standard input rate
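
To make the pricing above concrete, the following is a minimal sketch that estimates the cost of a single request at the listed GPT-5.5 rates ($5 input, $30 output, $0.50 cached input per 1M tokens). The function name and the example token counts are illustrative, and the sketch assumes cached tokens are billed at the cached rate in place of the standard input rate.

```python
def request_cost(input_tokens, output_tokens, cached_tokens=0,
                 input_rate=5.00, output_rate=30.00, cached_rate=0.50):
    """Estimate the USD cost of one request; rates are USD per 1M tokens."""
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    uncached = input_tokens - cached_tokens
    total = (uncached * input_rate
             + cached_tokens * cached_rate
             + output_tokens * output_rate)
    return total / 1_000_000


# Hypothetical workload: a 200K-token prompt with a 4K-token answer.
cold = request_cost(200_000, 4_000)                        # nothing cached
warm = request_cost(200_000, 4_000, cached_tokens=180_000)  # 90% of prompt cached
print(f"cold: ${cold:.2f}, warm: ${warm:.2f}")
```

With these example numbers the cold request comes to $1.12 and the warm one to $0.31, so prompt reuse matters more than the headline input price for long, repetitive prompts.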

Input Price: $5.00 per 1M tokens

Output Price: $30.00 per 1M tokens

Cached Input: $0.500 per 1M tokens

Batch Input: $2.50 per 1M tokens

Context Window: 1M

Max Output: 128,000 tokens

Knowledge Cutoff: 2025-06

Vision, Function Calling, Fine-tuning, JSON Mode, Free Tier

Pros

State-of-the-art agentic coding (82.7% Terminal-Bench)
1M token context window
Matches GPT-5.4 latency with fewer tokens
Strong scientific research capability

Cons

2x more expensive than GPT-5.4
No fine-tuning support yet
Output costs $30/M tokens

Performance

Output Speed: ~85 tok/s

Rate Limit: 5,000 RPM

Multimodal

Image Input, Image Output, Audio Input, Audio Output

Benchmarks

MMLU: 90.0%

SWE-bench Verified: 82.7%

Terminal-Bench: 82.7%

GeneBench: 28.5%
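
The output speed listed above gives a rough way to budget wall-clock time for long generations. The sketch below divides a token count by the listed ~85 tok/s. The function name is illustrative, and it assumes steady throughput with no queueing or time-to-first-token delay, which real requests will not match exactly.

```python
def generation_seconds(output_tokens, tokens_per_second):
    """Rough streaming time in seconds, assuming constant throughput."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


# Listed GPT-5.5 figures: ~85 tok/s and a 128,000-token output cap.
for tokens in (1_000, 16_000, 128_000):
    seconds = generation_seconds(tokens, 85)
    print(f"{tokens:>7,} tokens: ~{seconds / 60:.1f} min")
```

Under these assumptions a full 128,000-token response takes roughly 25 minutes to stream, which is worth factoring into client timeouts.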

GPT-5.5 Pro

Reasoning

Premium accuracy, parallel compute

Official Pricing

When to use: Use when you need maximum accuracy for legal, medical, or scientific analysis and cost is secondary.

Upgrade Highlights

◆Parallel test-time compute: runs multiple reasoning paths, picks best
◆GeneBench (scientific reasoning): 33.2% — highest ever recorded
◆Dedicated GPU allocation for Pro/Business/Enterprise subscribers
◆Same 1M context + 128K output as GPT-5.5 base, but deeper reasoning
◆6x price premium ($30/$180) — justified only for mission-critical analysis
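
The "runs multiple reasoning paths, picks best" idea can be illustrated with a generic majority-vote sketch (often called self-consistency). This is not a description of how GPT-5.5 Pro is implemented, which this page does not document. `reasoning_path` is a hypothetical stand-in that simulates a noisy solver instead of calling a model, and the 70% accuracy figure is made up for the demonstration.

```python
import random
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def reasoning_path(path_id):
    """Hypothetical stand-in for one independent reasoning attempt."""
    rng = random.Random(path_id)  # per-path seed keeps the demo reproducible
    # Simulated solver: answers "42" about 70% of the time, else a near miss.
    return "42" if rng.random() < 0.7 else rng.choice(["41", "43"])


def best_of_n(n=15, workers=4):
    """Run n paths concurrently and keep the most common answer."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        answers = list(pool.map(reasoning_path, range(n)))
    answer, votes = Counter(answers).most_common(1)[0]
    return answer, votes / n


answer, share = best_of_n()
print(f"selected {answer!r} with {share:.0%} of the votes")
```

The trade-off is visible in the structure: every extra path multiplies token usage, which is consistent with the price premium quoted above.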

Input Price: $30.00 per 1M tokens

Output Price: $180.00 per 1M tokens

Cached Input: not listed

Batch Input: not listed

Context Window: 1M

Max Output: 128,000 tokens

Knowledge Cutoff: 2025-06

Vision, Function Calling, Fine-tuning, JSON Mode

Pros

Highest accuracy via parallel test-time compute
33.2% on GeneBench (scientific reasoning)
Dedicated GPU allocation for Pro subscribers

Cons

6x more expensive than GPT-5.5 base
No batch API support
Only for Pro/Business/Enterprise tiers

Performance

Output Speed: ~40 tok/s

Rate Limit: 3,000 RPM

Multimodal

Image Input, Image Output, Audio Input, Audio Output

Benchmarks

MMLU: 90.5%

GeneBench: 33.2%

SWE-bench Verified: 79.0%

GPT-5.4

Flagship

Coding, computer use, knowledge work

Official Pricing

When to use: Excellent all-rounder for coding assistants, desktop automation, and enterprise knowledge work.

Upgrade Highlights

◆OSWorld: 75% — first AI to exceed human-level desktop automation
◆33% fewer factual errors vs GPT-5.2 (hallucination reduction)
◆Tool Search: dynamically discovers tools, cuts agent token usage by ~40%
◆Context: 128K → 1M tokens (8x increase from GPT-4o)
◆Max output: 16K → 128K tokens (8x increase), price $2.50/$15

Input Price: $2.50 per 1M tokens

Output Price: $15.00 per 1M tokens

Cached Input: $0.250 per 1M tokens

Batch Input: $1.25 per 1M tokens

Context Window: 1M

Max Output: 128,000 tokens

Knowledge Cutoff: 2025-03

Vision, Function Calling, Fine-tuning, JSON Mode, Free Tier

Pros

First AI to exceed human desktop performance (75% OSWorld)
Unified coding + computer use + knowledge work
33% fewer factual errors than GPT-5.2
Tool Search reduces token usage for agents

Cons

More expensive than GPT-4.1
No fine-tuning yet
Input pricing doubles above 272K context
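
The long-context surcharge in the last item can be sketched as a small pricing function. The page does not say whether the doubled rate applies to the whole prompt or only to tokens beyond 272K. This sketch assumes the whole prompt is billed at the doubled rate once it crosses the threshold, and the function name is illustrative.

```python
def gpt54_input_cost(input_tokens, base_rate=2.50,
                     threshold=272_000, multiplier=2.0):
    """Input cost in USD, assuming the whole prompt is re-rated above the threshold."""
    rate = base_rate * multiplier if input_tokens > threshold else base_rate
    return input_tokens * rate / 1_000_000


for tokens in (200_000, 272_000, 300_000, 1_000_000):
    print(f"{tokens:>9,} input tokens: ${gpt54_input_cost(tokens):.2f}")
```

Under this assumption a 300,000-token prompt costs $1.50 instead of $0.75, so trimming a prompt to just under the threshold can halve the input bill.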

Performance

Output Speed: ~90 tok/s

Rate Limit: 5,000 RPM

Multimodal

Image Input, Image Output, Audio Input, Audio Output

Benchmarks

MMLU: 89.5%

SWE-bench Verified: 76.2%

OSWorld: 75.0%

IFEval: 87.0%

GPT-5.4 Mini

Mid-tier

Cost-effective coding & dev tasks

Official Pricing

When to use: Best value for high-volume coding tasks, chatbots, and lighter development workloads.

Upgrade Highlights

◆SWE-bench Pro: 54.38% — only 3.3pp below GPT-5.4 Standard's 57.7%
◆About 3.3x cheaper than GPT-5.4 Standard at list price ($0.75 vs $2.50/M input)
◆400K context window — sufficient for most coding tasks
◆Same 128K max output as flagship, 90% cached input savings
◆Released 12 days after Standard — fastest mini variant ever

Input Price: $0.750 per 1M tokens

Output Price: $4.50 per 1M tokens

Cached Input: $0.075 per 1M tokens

Batch Input: $0.375 per 1M tokens

Context Window: 400K

Max Output: 128,000 tokens

Knowledge Cutoff: 2025-03

Vision, Function Calling, Fine-tuning, JSON Mode, Free Tier

Pros

54.38% SWE-bench Pro (close to Standard's 57.7%)
About 3.3x cheaper than GPT-5.4 Standard at list price
400K context window
Free tier access available

Cons

Lower reasoning quality than Standard
Context window smaller than full 1M
Released 12 days after Standard

Performance

Output Speed: ~120 tok/s

Rate Limit: 10,000 RPM

Multimodal

Image Input, Image Output, Audio Input, Audio Output

Benchmarks

SWE-bench Pro: 54.38%

MMLU: 86.0%

HumanEval: 88.2%

GPT-5.4 Nano

Lite

Edge, embedded, mobile

Official Pricing

When to use: Ideal for on-device assistants, mobile apps, and scenarios where network latency is a concern.

Upgrade Highlights

◆Designed for edge/mobile/IoT — optimized for on-device inference
◆128K max output despite 272K context: highest output/context ratio in the GPT-5.4 family
◆Vision + function calling at $0.20/M input — previously flagship-only
◆Latency optimized for real-time mobile interactions
◆Context 272K vs 1M for Standard — trade-off for speed and cost

Input Price: $0.200 per 1M tokens

Output Price: $1.25 per 1M tokens

Cached Input: not listed

Batch Input: not listed

Context Window: 272K

Max Output: 128,000 tokens

Knowledge Cutoff: 2025-03

Vision, Function Calling, Fine-tuning, JSON Mode, Free Tier

Pros

Ultra-low cost for edge deployments
Designed for mobile and IoT devices
128K max output despite small size

Cons

Limited benchmarks available
Smaller context window
Quality gap for complex tasks

Performance

Output Speed: ~150 tok/s

Rate Limit: 15,000 RPM

Multimodal

Image Input, Image Output, Audio Input, Audio Output

Benchmarks

MMLU: 78.5%

HumanEval: 76.0%

GPT-4o

Flagship

General purpose, multimodal

Official Pricing

When to use: Best default choice for most apps requiring high-quality multimodal output.

Upgrade Highlights

◆First natively multimodal model: text + image + audio in one model
◆2x faster than GPT-4 Turbo at 50% lower cost at launch ($5 vs $10/M input); input has since dropped to $2.50/M
◆Native audio understanding — no separate Whisper pipeline needed
◆Fine-tuning support for domain adaptation
◆128K context — superseded by GPT-4.1's 1M for long-document tasks

Input Price: $2.50 per 1M tokens

Output Price: $10.00 per 1M tokens

Cached Input: $1.25 per 1M tokens

Batch Input: $1.25 per 1M tokens

Context Window: 128K

Max Output: 16,384 tokens

Knowledge Cutoff: 2023-10

Vision, Function Calling, Fine-tuning, JSON Mode, Free Tier

Pros

Multimodal (text + image + audio)
Excellent general-purpose quality
Strong function calling & JSON mode

Cons

Higher cost than mini/nano variants
128K context vs 1M+ for newer 4.1 models

Performance

Output Speed: ~75 tok/s

Rate Limit: 5,000 RPM

Multimodal

Image Input, Image Output, Audio Input, Audio Output

Benchmarks

MMLU: 88.7%

HumanEval: 90.2%

MATH: 76.6%

IFEval: 80.4%

Agents Using This Model

AutoGen SuperAGI OpenAI Agents SDK

GPT-4o Mini

Lite

Fast, cost-efficient tasks

Official Pricing

When to use: Great for high-volume, cost-sensitive workloads like chatbots and summarization.

Upgrade Highlights

◆Replaced GPT-3.5 Turbo: 70% cheaper on input ($0.15 vs $0.50/M), significantly better quality
◆Same 128K context as GPT-4o at 1/17th the price
◆Scored 82% on MMLU — approaching GPT-4 level on knowledge tasks
◆Vision + function calling at $0.15/M — previously flagship-only features
◆Fine-tuning available — best cost/quality ratio for tuned models

Input Price: $0.150 per 1M tokens

Output Price: $0.600 per 1M tokens

Cached Input: $0.075 per 1M tokens

Batch Input: $0.075 per 1M tokens

Context Window: 128K

Max Output: 16,384 tokens

Knowledge Cutoff: 2023-10

Vision, Function Calling, Fine-tuning, JSON Mode, Free Tier

Pros

Extremely affordable
Same 128K context as GPT-4o
Supports vision + function calling

Cons

Lower reasoning quality than flagship
Not ideal for complex multi-step tasks

Performance

Output Speed: ~130 tok/s

Rate Limit: 10,000 RPM

Multimodal

Image Input, Image Output, Audio Input, Audio Output

Benchmarks

MMLU: 82.0%

HumanEval: 87.2%

MATH: 67.5%

GPT-4.1

Flagship

Coding, instruction following

Official Pricing

When to use: Top pick for coding assistants and long-document analysis.

Upgrade Highlights

◆Context: 128K → 1M tokens (8x jump from GPT-4o)
◆SWE-bench Verified: 54.6% — best coding score at launch
◆Max output: 32K tokens (2x GPT-4o's 16K)
◆Instruction following: 84.0% on IFEval vs GPT-4o's 80.4% (+3.6pp)
◆Cheaper than GPT-4o ($2 vs $2.50/M input) with more capability

Input Price: $2.00 per 1M tokens

Output Price: $8.00 per 1M tokens

Cached Input: $0.500 per 1M tokens

Batch Input: $1.00 per 1M tokens

Context Window: 1.0M

Max Output: 32,768 tokens

Knowledge Cutoff: 2024-04

Vision, Function Calling, Fine-tuning, JSON Mode, Free Tier

Pros

1M token context window
Best-in-class coding ability
Cheaper than GPT-4o with better context

Cons

Slower than mini/nano for simple tasks
Newer model, less battle-tested

Performance

Output Speed: ~70 tok/s

Rate Limit: 5,000 RPM

Multimodal

Image Input, Image Output, Audio Input, Audio Output

Benchmarks

MMLU: 88.3%

SWE-bench Verified: 54.6%

IFEval: 84.0%

HumanEval: 92.0%

Agents Using This Model

GitHub Copilot Agent Cursor Agent Bolt.new SWE-Agent Perplexity Deep Research CrewAI LangGraph AutoGen SuperAGI OpenAI Agents SDK Lindy.ai Relevance AI OpenAI Operator Browser Use LaVague Intercom Fin Sierra AI Ada AI Drift (Salesloft) Replit Agent Codeium (Windsurf) Sourcegraph Cody v0 by Vercel Phind Aider Cline StackBlitz AI Qodo (CodiumAI) Elicit You.com AI Consensus NLP Iris.ai STORM by Stanford Microsoft Copilot Studio AgentGPT AutoGPT BabyAGI MetaGPT ChatDev Agency Swarm Flowise n8n AI Haystack Marvin AI DSPy Guidance AI Notion AI HubSpot AI Agent Salesforce Einstein Relay.app Activepieces Windmill Skyvern Agentforce Browser Playwright MCP Agent Zendesk AI Tidio AI Freshdesk Freddy AI Forethought Ultimate (Zendesk) Voiceflow Landbot AI Quickchat AI Kore.ai Parloa Julius AI Rows AI Akkio Hex AI Polymer AI Tableau AI (Agentforce) Levity AI Jasper AI Copy.ai Writesonic Frase Surfer SEO Anyword Peppertype.ai Simplified AI InVideo AI

GPT-4.1 Mini

Mid-tier

Balanced performance & cost

Official Pricing

When to use: Sweet spot for production apps needing long context without flagship costs.

Upgrade Highlights

◆1M context at $0.40/M input — 5x cheaper than GPT-4.1 flagship
◆Same 1M context + 32K output as flagship for production use
◆Full feature set (vision, function calling, fine-tuning)
◆SWE-bench Verified: 42.8%, strong coding at mid-tier pricing
◆Replaced GPT-4o Mini for long-context production workloads

Input Price: $0.400 per 1M tokens

Output Price: $1.60 per 1M tokens

Cached Input: $0.100 per 1M tokens

Batch Input: $0.200 per 1M tokens

Context Window: 1.0M

Max Output: 32,768 tokens

Knowledge Cutoff: 2024-04

Vision, Function Calling, Fine-tuning, JSON Mode, Free Tier

Pros

1M context at mid-tier price
Strong coding for the price
Full feature set (vision, function calling, fine-tuning)

Cons

Quality gap vs flagship for complex reasoning
Still more expensive than nano

Performance

Output Speed: ~110 tok/s

Rate Limit: 10,000 RPM

Multimodal

Image Input, Image Output, Audio Input, Audio Output

Benchmarks

MMLU: 84.5%

SWE-bench Verified: 42.8%

HumanEval: 86.5%

GPT-4.1 Nano

Lite

Ultra-low latency, edge devices

Official Pricing

When to use: Ideal for real-time classification, extraction, and high-throughput pipelines.

Upgrade Highlights

◆Cheapest model with 1M context: $0.10/M input (GPT-3.5 Turbo was $0.50)
◆Full feature set at lite tier: vision + function calling + fine-tuning
◆Context: 128K → 1M (8x increase over GPT-4o Mini's 128K)
◆Fastest response times in OpenAI lineup — sub-200ms typical latency
◆Cached input at $0.025/M: 75% savings on repeated prefixes ($0.025 vs $0.10/M input)
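
Cached-input savings follow directly from the ratio of the cached rate to the standard input rate. The sketch below applies that arithmetic to a few input and cached rates listed on this page. The function and dictionary names are illustrative.

```python
def cached_discount_pct(input_rate, cached_rate):
    """Percent saved on input tokens served from the prompt cache."""
    return (1 - cached_rate / input_rate) * 100


# (input $/1M, cached input $/1M) as listed on this page.
LISTED_RATES = {
    "GPT-5.5": (5.00, 0.50),
    "GPT-4o": (2.50, 1.25),
    "GPT-4.1 Nano": (0.100, 0.025),
    "o4-mini": (1.10, 0.275),
}

for model, (input_rate, cached_rate) in LISTED_RATES.items():
    pct = cached_discount_pct(input_rate, cached_rate)
    print(f"{model}: {pct:.0f}% off cached input")
```

For these rates the discounts work out to 90% for GPT-5.5, 50% for GPT-4o, and 75% for both GPT-4.1 Nano and o4-mini.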

Input Price: $0.100 per 1M tokens

Output Price: $0.400 per 1M tokens

Cached Input: $0.025 per 1M tokens

Batch Input: $0.050 per 1M tokens

Context Window: 1.0M

Max Output: 32,768 tokens

Knowledge Cutoff: 2024-04

Vision, Function Calling, Fine-tuning, JSON Mode, Free Tier

Pros

Cheapest OpenAI model with 1M context
Fastest response times
Full feature set (vision, FC, fine-tuning)

Cons

Noticeable quality drop for complex tasks
Not suitable for deep reasoning

Performance

Output Speed: ~160 tok/s

Rate Limit: 15,000 RPM

Multimodal

Image Input, Image Output, Audio Input, Audio Output

Benchmarks

MMLU: 76.0%

HumanEval: 72.8%

o3

Reasoning

Complex reasoning, math, science

Official Pricing

When to use: Use for problems requiring step-by-step reasoning, math proofs, and scientific analysis.

Upgrade Highlights

◆ARC-AGI: 87.5% — near-human on novel abstract reasoning tasks
◆Max output: 100K tokens (4x o1's 25K) for deep chain-of-thought
◆Vision + function calling — new capabilities vs o1's text-only
◆Math Olympiad: Gold medal level on AIME 2024 (93.3%)
◆Price: $2/$8 per 1M input/output tokens, a small fraction of o1-pro's API list price

Input Price: $2.00 per 1M tokens

Output Price: $8.00 per 1M tokens

Cached Input: $0.500 per 1M tokens

Batch Input: $1.00 per 1M tokens

Context Window: 200K

Max Output: 100,000 tokens

Knowledge Cutoff: 2024-05

Vision, Function Calling, Fine-tuning, JSON Mode, Free Tier

Pros

State-of-the-art reasoning ability
100K max output for long chain-of-thought
Excellent at math and science

Cons

Slower due to thinking time
No fine-tuning support
Higher cost for reasoning tokens

Performance

Output Speed: ~25 tok/s

Rate Limit: 3,000 RPM

Multimodal

Image Input, Image Output, Audio Input, Audio Output

Benchmarks

ARC-AGI: 87.5%

AIME 2024: 93.3%

MATH: 96.7%

GPQA: 83.3%

Agents Using This Model

OpenAI Deep Research

o4-mini

Reasoning

Reasoning at lower cost

Official Pricing

When to use: Best value for reasoning workloads — great for production where o3 is overkill.

Upgrade Highlights

◆45% cheaper than o3 ($1.10 vs $2/M input) for reasoning tasks
◆Same 100K max output as o3 — no compromise on reasoning depth
◆Faster inference than o3 with optimized thinking budget
◆Vision + function calling at reasoning tier — previously o3-only
◆Cached input: $0.275/M — 75% savings for repeated prompts

Input Price: $1.10 per 1M tokens

Output Price: $4.40 per 1M tokens

Cached Input: $0.275 per 1M tokens

Batch Input: $0.550 per 1M tokens

Context Window: 200K

Max Output: 100,000 tokens

Knowledge Cutoff: 2024-05

Vision, Function Calling, Fine-tuning, JSON Mode, Free Tier

Pros

45% cheaper than o3 for reasoning
Same 100K max output
Faster than o3

Cons

Less capable than o3 on hardest problems
No fine-tuning
Reasoning tokens add hidden cost
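
The "hidden cost" in the last item comes from reasoning tokens being billed as output even though they never appear in the response. The sketch below shows the effect at the listed o4-mini rates. The function name and the token counts are illustrative, and the reasoning-token count would in practice come from the usage data returned with each response.

```python
def o4_mini_cost(input_tokens, visible_output_tokens, reasoning_tokens,
                 input_rate=1.10, output_rate=4.40):
    """USD cost of one request when reasoning tokens are billed as output."""
    billed_output = visible_output_tokens + reasoning_tokens
    return (input_tokens * input_rate + billed_output * output_rate) / 1_000_000


# Hypothetical request: short prompt, short answer, long hidden reasoning.
naive = o4_mini_cost(1_000, 500, 0)        # what the visible text suggests
actual = o4_mini_cost(1_000, 500, 6_000)   # with 6,000 reasoning tokens
print(f"naive: ${naive:.4f}, actual: ${actual:.4f}, ratio: {actual / naive:.1f}x")
```

In this example the billed cost is $0.0297 against a naive estimate of $0.0033, about nine times higher, so budgets for reasoning models are better based on measured usage than on visible output length.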

Performance

Output Speed: ~35 tok/s

Rate Limit: 5,000 RPM

Multimodal

Image Input, Image Output, Audio Input, Audio Output

Benchmarks

AIME 2024: 90.0%

MATH: 92.5%

GPQA: 78.1%

Agents Using This Model

OpenAI Codex

Side-by-Side Comparison

Model	Tier	Input	Output	Cached	Context	Max Output
GPT-5.5	Flagship	$5.00	$30.00	$0.500	1M	128,000
GPT-5.5 Pro	Reasoning	$30.00	$180.00	—	1M	128,000
GPT-5.4	Flagship	$2.50	$15.00	$0.250	1M	128,000
GPT-5.4 Mini	Mid-tier	$0.750	$4.50	$0.075	400K	128,000
GPT-5.4 Nano	Lite	$0.200	$1.25	—	272K	128,000
GPT-4o	Flagship	$2.50	$10.00	$1.25	128K	16,384
GPT-4o Mini	Lite	$0.150	$0.600	$0.075	128K	16,384
GPT-4.1	Flagship	$2.00	$8.00	$0.500	1.0M	32,768
GPT-4.1 Mini	Mid-tier	$0.400	$1.60	$0.100	1.0M	32,768
GPT-4.1 Nano	Lite	$0.100	$0.400	$0.025	1.0M	32,768
o3	Reasoning	$2.00	$8.00	$0.500	200K	100,000
o4-mini	Reasoning	$1.10	$4.40	$0.275	200K	100,000
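
One way to use the comparison table is to filter by the context a workload needs and then sort by input price. The sketch below hard-codes the rows above and picks the cheapest fit. The helper names are illustrative, and it assumes "K" and "M" mean 1,000 and 1,000,000 tokens, which the page does not state explicitly.

```python
# (model, tier, input $/1M, output $/1M, context) copied from the table above.
MODELS = [
    ("GPT-5.5", "Flagship", 5.00, 30.00, "1M"),
    ("GPT-5.5 Pro", "Reasoning", 30.00, 180.00, "1M"),
    ("GPT-5.4", "Flagship", 2.50, 15.00, "1M"),
    ("GPT-5.4 Mini", "Mid-tier", 0.750, 4.50, "400K"),
    ("GPT-5.4 Nano", "Lite", 0.200, 1.25, "272K"),
    ("GPT-4o", "Flagship", 2.50, 10.00, "128K"),
    ("GPT-4o Mini", "Lite", 0.150, 0.600, "128K"),
    ("GPT-4.1", "Flagship", 2.00, 8.00, "1.0M"),
    ("GPT-4.1 Mini", "Mid-tier", 0.400, 1.60, "1.0M"),
    ("GPT-4.1 Nano", "Lite", 0.100, 0.400, "1.0M"),
    ("o3", "Reasoning", 2.00, 8.00, "200K"),
    ("o4-mini", "Reasoning", 1.10, 4.40, "200K"),
]


def parse_context(text):
    """Convert '400K' or '1.0M' to a token count (assumes decimal K and M)."""
    units = {"K": 1_000, "M": 1_000_000}
    return int(float(text[:-1]) * units[text[-1]])


def cheapest_model(min_context, tier=None):
    """Return the lowest-input-price row that fits min_context, or None."""
    candidates = [
        row for row in MODELS
        if parse_context(row[4]) >= min_context
        and (tier is None or row[1] == tier)
    ]
    return min(candidates, key=lambda row: row[2], default=None)


print(cheapest_model(500_000)[0])
print(cheapest_model(150_000, tier="Reasoning")[0])
```

For a 500K-token workload this selects GPT-4.1 Nano, and for a 150K-token reasoning workload it selects o4-mini. Input price is only one axis, so a real selection would also weigh output price and the benchmark figures listed per model.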