OpenAI Models
Explore all 12 OpenAI models, with pricing details, pros and cons, and developer recommendations.
Quick recommendations
GPT-5.5
Flagship: Agentic coding, complex workloads
When to use: Best for complex coding agents, computer use, and professional workloads where accuracy matters most.
Key updates
- SWE-bench Verified: 82.7% (+6.5pp vs GPT-5.4's 76.2%)
- Terminal-Bench: 82.7%, a new SOTA for agentic coding
- 1M context window retained; 33% fewer tokens than GPT-5.4 at the same latency
- GeneBench scientific reasoning: 28.5% (+3.2pp over GPT-5.4)
- Price: $5/$30 per million input/output tokens, 2x GPT-5.4; cached input is 90% cheaper at $0.50/M
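How much the cached-input discount matters depends on the share of prompt tokens served from cache. The following is a minimal sketch of that arithmetic, assuming the $5.00/M list price and $0.50/M cached price quoted above; the function name and the `hit_rate` parameter are illustrative and not part of any OpenAI API.

```python
def effective_input_price(base: float, cached: float, hit_rate: float) -> float:
    """Blend list and cached input prices (USD per 1M tokens) by cache hit rate.

    hit_rate is the fraction of input tokens billed at the cached price
    (a hypothetical parameter used only for this illustration).
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * cached + (1.0 - hit_rate) * base


# GPT-5.5 figures quoted above: $5.00/M list input, $0.50/M cached input.
for rate in (0.0, 0.5, 0.9):
    price = effective_input_price(5.00, 0.50, rate)
    print(f"{rate:.0%} of input cached: ${price:.2f} per 1M input tokens")
```

Under these assumptions, a workload with half of its input cached pays $2.75/M on input, which is close to GPT-5.4's list input price.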
Pros
- State-of-the-art agentic coding (82.7% Terminal-Bench)
- 1M token context window
- Matches GPT-5.4 latency with fewer tokens
- Strong scientific research capability
Cons
- 2x more expensive than GPT-5.4
- No fine-tuning support yet
- Output costs $30/M tokens
GPT-5.5 Pro
Reasoning: Premium accuracy, parallel compute
When to use: Use when you need maximum accuracy for legal, medical, or scientific analysis and cost is secondary.
Key updates
- Parallel test-time compute: runs multiple reasoning paths and picks the best
- GeneBench (scientific reasoning): 33.2%, the highest score recorded to date
- Dedicated GPU allocation for Pro/Business/Enterprise subscribers
- Same 1M context and 128K output as base GPT-5.5, with deeper reasoning
- 6x price premium over base GPT-5.5 ($30/$180 vs $5/$30), justified only for mission-critical analysis
Pros
- Highest accuracy via parallel test-time compute
- 33.2% on GeneBench (scientific reasoning)
- Dedicated GPU allocation for Pro subscribers
Cons
- 6x more expensive than GPT-5.5 base
- No batch API support
- Only for Pro/Business/Enterprise tiers
GPT-5.4
Flagship: Coding, computer use, knowledge work
When to use: Excellent all-rounder for coding assistants, desktop automation, and enterprise knowledge work.
Key updates
- OSWorld: 75%, the first AI to exceed human-level desktop automation
- 33% fewer factual errors than GPT-5.2 (hallucination reduction)
- Tool Search: dynamically discovers tools, cutting agent token usage by ~40%
- Context: 128K → 1M tokens (roughly 8x GPT-4o's window)
- Max output: 16K → 128K tokens (8x increase); price $2.50/$15 per million input/output tokens
Pros
- First AI to exceed human desktop performance (75% OSWorld)
- Unified coding + computer use + knowledge work
- 33% fewer factual errors than GPT-5.2
- Tool Search reduces token usage for agents
Cons
- More expensive than GPT-4.1
- No fine-tuning yet
- Input pricing doubles above 272K context
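The long-context surcharge changes input cost in a step rather than smoothly. The sketch below illustrates that step under two stated assumptions: that "272K" means 272,000 tokens, and that the doubled rate applies to the whole prompt once it crosses the threshold (the entry above does not say whether only the excess tokens are surcharged). The constant and function names are illustrative.

```python
LONG_CONTEXT_THRESHOLD = 272_000   # assumption: "272K" read as 272,000 tokens
BASE_INPUT_PRICE = 2.50            # USD per 1M input tokens, from the GPT-5.4 entry


def gpt54_input_cost(prompt_tokens: int) -> float:
    """Estimate GPT-5.4 input cost in USD for a single prompt.

    Assumes the doubled rate covers the entire prompt above the threshold;
    the real billing rule may differ.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    multiplier = 2 if prompt_tokens > LONG_CONTEXT_THRESHOLD else 1
    return prompt_tokens / 1_000_000 * BASE_INPUT_PRICE * multiplier


for tokens in (200_000, 272_000, 300_000):
    print(f"{tokens:>7} prompt tokens: ${gpt54_input_cost(tokens):.2f}")
```

If this reading is right, trimming a prompt to just under the threshold can cost less than half as much as sending one just over it, which is worth checking before relying on the full 1M window.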
GPT-5.4 Mini
Mid-tier: Cost-effective coding and dev tasks
When to use: Best value for high-volume coding tasks, chatbots, and lighter development workloads.
Key updates
- SWE-bench Pro: 54.38%, only 3.3pp below GPT-5.4 Standard's 57.7%
- About 3.3x cheaper than GPT-5.4 Standard at list price ($0.75 vs $2.50/M input); the gap reaches roughly 6x only against the doubled $5/M rate GPT-5.4 charges above 272K context
- 400K context window, sufficient for most coding tasks
- Same 128K max output as the flagship; 90% cached input savings
- Released 12 days after Standard, the shortest gap for any mini variant to date
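Price multiples like the ones above depend on which list price is taken as the baseline. The sketch below shows the arithmetic, assuming the $0.75/M and $2.50/M input prices from this page's comparison table. It treats $5.00/M as GPT-5.4's doubled rate above 272K context, which is an inference from the GPT-5.4 entry rather than a published price. The helper names are illustrative.

```python
def price_multiple(baseline: float, candidate: float) -> float:
    """Return how many times cheaper `candidate` is than `baseline`."""
    if baseline <= 0 or candidate <= 0:
        raise ValueError("prices must be positive")
    return baseline / candidate


def percent_cheaper(baseline: float, candidate: float) -> float:
    """Return the saving of `candidate` relative to `baseline`, in percent."""
    if baseline <= 0 or candidate < 0:
        raise ValueError("baseline must be positive and candidate non-negative")
    return (1.0 - candidate / baseline) * 100.0


MINI_INPUT = 0.75           # GPT-5.4 Mini, USD per 1M input tokens
STANDARD_INPUT = 2.50       # GPT-5.4 list price
STANDARD_LONG_INPUT = 5.00  # assumed doubled rate above 272K context

print(f"vs list price:   {price_multiple(STANDARD_INPUT, MINI_INPUT):.1f}x cheaper")
print(f"vs long-context: {price_multiple(STANDARD_LONG_INPUT, MINI_INPUT):.1f}x cheaper")
print(f"saving vs list:  {percent_cheaper(STANDARD_INPUT, MINI_INPUT):.0f}%")
```

The same two helpers can be used to check the other "Nx cheaper" and "N% cheaper" figures on this page against the table.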
Pros
- 54.38% SWE-bench Pro (close to Standard's 57.7%)
- About 3.3x cheaper than GPT-5.4 Standard at list price ($0.75 vs $2.50/M input)
- 400K context window
- Free tier access available
Cons
- Lower reasoning quality than Standard
- Context window smaller than full 1M
- Released 12 days after Standard
GPT-5.4 Nano
Lite: Edge, embedded, mobile
When to use: Ideal for on-device assistants, mobile apps, and scenarios where network latency is a concern.
Key updates
- Designed for edge/mobile/IoT and optimized for on-device inference
- 128K max output despite 272K context, the highest output-to-context ratio in the GPT-5.4 family
- Vision and function calling at $0.20/M input, previously flagship-only
- Latency optimized for real-time mobile interactions
- Context: 272K vs 1M for Standard, a trade-off for speed and cost
Pros
- Ultra-low cost for edge deployments
- Designed for mobile and IoT devices
- 128K max output despite small size
Cons
- Limited benchmarks available
- Smaller context window
- Quality gap for complex tasks
GPT-4o
Flagship: General purpose, multimodal
When to use: Best default choice for most apps requiring high-quality multimodal output.
Key updates
- First natively multimodal model: text, image, and audio in one model
- 2x faster than GPT-4 Turbo at 50% lower cost ($2.50 vs $5/M input)
- Native audio understanding, with no separate Whisper pipeline needed
- Fine-tuning support for domain adaptation (flagship exclusive)
- 128K context, superseded by GPT-4.1's 1M for long-document tasks
Pros
- Multimodal (text + image + audio)
- Excellent general-purpose quality
- Strong function calling & JSON mode
Cons
- Higher cost than mini/nano variants
- 128K context vs 1M+ for newer 4.1 models
Agents using this model: 3
GPT-4o Mini
Lite: Fast, cost-efficient tasks
When to use: Great for high-volume, cost-sensitive workloads like chatbots and summarization.
Key updates
- Replaced GPT-3.5 Turbo: 70% cheaper on input ($0.15 vs $0.50/M), with significantly better quality
- Same 128K context as GPT-4o at about 1/17th the input price
- Scored 82% on MMLU, approaching GPT-4 level on knowledge tasks
- Vision and function calling at $0.15/M input, previously flagship-only features
- Fine-tuning available, with the best cost/quality ratio for tuned models
Pros
- Extremely affordable
- Same 128K context as GPT-4o
- Supports vision + function calling
Cons
- Lower reasoning quality than flagship
- Not ideal for complex multi-step tasks
GPT-4.1
Flagship: Coding, instruction following
When to use: Top pick for coding assistants and long-document analysis.
Key updates
- Context: 128K → 1M tokens (roughly 8x GPT-4o's window)
- SWE-bench Verified: 54.6%, the best coding score at launch
- Max output: 32K tokens (2x GPT-4o's 16K)
- Instruction following: 40% better on IFEval than GPT-4o
- Cheaper than GPT-4o ($2 vs $2.50/M input) with more capability
Pros
- 1M token context window
- Best-in-class coding ability
- Cheaper than GPT-4o with better context
Cons
- Slower than mini/nano for simple tasks
- Newer model, less battle-tested
Agents using this model: 81
GPT-4.1 Mini
Mid-tier: Balanced performance and cost
When to use: Sweet spot for production apps needing long context without flagship costs.
Key updates
- 1M context at $0.40/M input, 5x cheaper than the GPT-4.1 flagship
- Same 1M context and 32K output as the flagship for production use
- Full multimodal support (vision, function calling, fine-tuning)
- SWE-bench: 42.8%, strong coding at mid-tier pricing
- Replaced GPT-4o Mini for long-context production workloads
Pros
- 1M context at mid-tier price
- Strong coding for the price
- Full multimodal support
Cons
- Quality gap vs flagship for complex reasoning
- Still more expensive than nano
GPT-4.1 Nano
Lite: Ultra-low latency, edge devices
When to use: Ideal for real-time classification, extraction, and high-throughput pipelines.
Key updates
- Cheapest model with 1M context: $0.10/M input (GPT-3.5 Turbo was $0.50)
- Full feature set at the lite tier: vision, function calling, and fine-tuning
- Context: 128K → 1M (roughly 8x GPT-4o Mini's 128K)
- Fastest response times in the OpenAI lineup, with sub-200ms typical latency
- Cached input at $0.025/M, a 75% saving over the $0.10/M list price for repeated prefixes
Pros
- Cheapest OpenAI model with 1M context
- Fastest response times
- Full feature set (vision, function calling, fine-tuning)
Cons
- Noticeable quality drop for complex tasks
- Not suitable for deep reasoning
o3
Reasoning: Complex reasoning, math, science
When to use: Use for problems requiring step-by-step reasoning, math proofs, and scientific analysis.
Key updates
- ARC-AGI: 87.5%, near-human on novel abstract reasoning tasks
- Max output: 100K tokens (4x o1's 25K) for deep chain-of-thought
- Vision and function calling, new capabilities compared with the text-only o1
- Math Olympiad: gold-medal level on AIME 2024 (93.3%)
- Price: $2/$8 per million input/output tokens, a fraction of o1 Pro's pricing
Pros
- State-of-the-art reasoning ability
- 100K max output for long chain-of-thought
- Excellent at math and science
Cons
- Slower due to thinking time
- No fine-tuning support
- Higher cost for reasoning tokens
Agents using this model: 1
o4-mini
Reasoning: Reasoning at lower cost
When to use: Best value for reasoning workloads; great for production where o3 is overkill.
Key updates
- 45% cheaper than o3 ($1.10 vs $2/M input) for reasoning tasks
- Same 100K max output as o3, with no compromise on reasoning depth
- Faster inference than o3 with an optimized thinking budget
- Vision and function calling at the reasoning tier, previously o3-only
- Cached input: $0.275/M, a 75% saving for repeated prompts
Pros
- 45% cheaper than o3 for reasoning ($1.10 vs $2/M input)
- Same 100K max output
- Faster than o3
Cons
- Less capable than o3 on hardest problems
- No fine-tuning
- Reasoning tokens add hidden cost
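The "hidden cost" of reasoning tokens is easiest to see with numbers. The sketch below assumes reasoning tokens are billed at the output rate even though they never appear in the response, and uses o4-mini's $1.10/$4.40 per-million prices from this page. The token counts in the example are invented for illustration, and the function name is hypothetical.

```python
def reasoning_request_cost(
    input_tokens: int,
    visible_output_tokens: int,
    reasoning_tokens: int,
    input_price: float = 1.10,   # USD per 1M input tokens (o4-mini, from this page)
    output_price: float = 4.40,  # USD per 1M output tokens (o4-mini, from this page)
) -> float:
    """Estimate the cost in USD of one request to a reasoning model.

    Assumption: reasoning tokens are charged like output tokens.
    """
    billed_output = visible_output_tokens + reasoning_tokens
    return (input_tokens * input_price + billed_output * output_price) / 1_000_000


# Illustrative request: 2,000 input tokens, 500 visible output tokens.
without_thinking = reasoning_request_cost(2_000, 500, 0)
with_thinking = reasoning_request_cost(2_000, 500, 4_000)
print(f"no reasoning tokens:    ${without_thinking:.4f}")
print(f"4,000 reasoning tokens: ${with_thinking:.4f}")
print(f"multiplier:             {with_thinking / without_thinking:.1f}x")
```

In this made-up example the reasoning tokens raise the bill fivefold, so budgeting from visible output length alone can badly underestimate spend.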
Agents using this model: 1
Side-by-side comparison
| Model | Tier | Input ($/M tokens) | Output ($/M tokens) | Context |
|---|---|---|---|---|
| GPT-5.5 | Flagship | $5.00 | $30.00 | 1M |
| GPT-5.5 Pro | Reasoning | $30.00 | $180.00 | 1M |
| GPT-5.4 | Flagship | $2.50 | $15.00 | 1M |
| GPT-5.4 Mini | Mid-tier | $0.75 | $4.50 | 400K |
| GPT-5.4 Nano | Lite | $0.20 | $1.25 | 272K |
| GPT-4o | Flagship | $2.50 | $10.00 | 128K |
| GPT-4o Mini | Lite | $0.15 | $0.60 | 128K |
| GPT-4.1 | Flagship | $2.00 | $8.00 | 1M |
| GPT-4.1 Mini | Mid-tier | $0.40 | $1.60 | 1M |
| GPT-4.1 Nano | Lite | $0.10 | $0.40 | 1M |
| o3 | Reasoning | $2.00 | $8.00 | 200K |
| o4-mini | Reasoning | $1.10 | $4.40 | 200K |
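The table can be turned into a rough per-request cost estimate and a "cheapest model that fits" lookup. The sketch below copies the prices from the table and makes several simplifying assumptions: context sizes are read with K = 1,000 and M = 1,000,000 tokens, and cached-input discounts, long-context surcharges, and reasoning tokens are ignored. The names `Model`, `request_cost`, and `cheapest_for` are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    tier: str
    input_price: float   # USD per 1M input tokens
    output_price: float  # USD per 1M output tokens
    context: int         # context window in tokens (K=1,000, M=1,000,000 assumed)


MODELS = [
    Model("GPT-5.5", "Flagship", 5.00, 30.00, 1_000_000),
    Model("GPT-5.5 Pro", "Reasoning", 30.00, 180.00, 1_000_000),
    Model("GPT-5.4", "Flagship", 2.50, 15.00, 1_000_000),
    Model("GPT-5.4 Mini", "Mid-tier", 0.75, 4.50, 400_000),
    Model("GPT-5.4 Nano", "Lite", 0.20, 1.25, 272_000),
    Model("GPT-4o", "Flagship", 2.50, 10.00, 128_000),
    Model("GPT-4o Mini", "Lite", 0.15, 0.60, 128_000),
    Model("GPT-4.1", "Flagship", 2.00, 8.00, 1_000_000),
    Model("GPT-4.1 Mini", "Mid-tier", 0.40, 1.60, 1_000_000),
    Model("GPT-4.1 Nano", "Lite", 0.10, 0.40, 1_000_000),
    Model("o3", "Reasoning", 2.00, 8.00, 200_000),
    Model("o4-mini", "Reasoning", 1.10, 4.40, 200_000),
]


def request_cost(model: Model, input_tokens: int, output_tokens: int) -> float:
    """List-price cost in USD of one request, ignoring discounts and surcharges."""
    return (input_tokens * model.input_price
            + output_tokens * model.output_price) / 1_000_000


def cheapest_for(context_needed: int, input_tokens: int, output_tokens: int) -> Model:
    """Return the cheapest model whose context window covers `context_needed`."""
    candidates = [m for m in MODELS if m.context >= context_needed]
    if not candidates:
        raise ValueError("no model in the table has a large enough context window")
    return min(candidates, key=lambda m: request_cost(m, input_tokens, output_tokens))


# Example: a 300K-token document with a 2K-token summary.
pick = cheapest_for(300_000, 300_000, 2_000)
print(pick.name, f"${request_cost(pick, 300_000, 2_000):.4f}")
```

Price is only one axis. The lookup says nothing about quality, so in practice it would be combined with the "When to use" guidance for each model.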