Meta Models

Explore both Meta models, with detailed pricing, pros and cons, and developer recommendations.

Models: 2

Lowest input price: $0.100 per 1M tokens

Max. context: 10M tokens

Quality tiers: Flagship, Mid-tier

Quick recommendations

Best value: Llama 4 Scout ($0.100 per 1M input tokens)

Best quality: Llama 4 Maverick

Llama 4 Maverick

Flagship

Open-source, multimodal

Official pricing

When to use: For teams that want open-source control or self-hosting with multimodal needs.

Upgrade highlights

◆ Open-source: self-host with no per-token fees and full control of the model weights
◆ 1M context window: first open-source model with this capacity
◆ Multimodal (text + vision) and fine-tunable
◆ 17B active params (400B total): MoE architecture for efficiency
◆ 4K max output is limiting: use for input-heavy, short-output tasks

Input price: $0.200 per 1M tokens

Output price: $0.600 per 1M tokens

Cached input: not listed

Batch input: not listed

Context window: 1M tokens

Max. output: 4,096 tokens

Knowledge cutoff: 2024-08
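
The per-request cost follows directly from the listed rates. A minimal sketch, assuming the $0.200 input and $0.600 output prices above and flat per-token billing with no caching or batch discount; `estimate_cost` is a hypothetical helper, not part of any provider SDK:

```python
# Listed Llama 4 Maverick rates in USD per 1M tokens (taken from this page).
MAVERICK_INPUT_PER_M = 0.200
MAVERICK_OUTPUT_PER_M = 0.600


def estimate_cost(input_tokens: int, output_tokens: int,
                  input_per_m: float = MAVERICK_INPUT_PER_M,
                  output_per_m: float = MAVERICK_OUTPUT_PER_M) -> float:
    """Estimated USD cost of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (input_tokens * input_per_m + output_tokens * output_per_m) / 1_000_000


# Example: a 200,000-token prompt with a full 4,096-token reply.
cost = estimate_cost(200_000, 4_096)
print(f"${cost:.4f}")  # $0.0425
```

Actual invoices depend on the hosting provider, so treat the result as a rough estimate.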

Feature checklist: Vision, Function calling, Fine-tuning, JSON mode, Free tier

Pros

Open-source — can self-host for free
1M context window
Multimodal + fine-tunable

Cons

Only 4K max output
No JSON mode
Hosted pricing via third-party (Together AI)
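
Because JSON mode is listed as unavailable, one common workaround is to request JSON in the prompt and parse the reply defensively. The sketch below uses only Python's standard `json` module; `extract_json` is a hypothetical helper and the sample reply is made up for illustration:

```python
import json


def extract_json(reply: str):
    """Return the first decodable JSON object found in a free-text reply."""
    decoder = json.JSONDecoder()
    start = reply.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores any trailing text after it.
            obj, _end = decoder.raw_decode(reply, start)
            return obj
        except json.JSONDecodeError:
            # Not valid JSON at this brace; try the next one.
            start = reply.find("{", start + 1)
    raise ValueError("no JSON object found in reply")


reply = 'Sure! Here is the result: {"label": "invoice", "pages": [1, 2]} Let me know.'
print(extract_json(reply))  # {'label': 'invoice', 'pages': [1, 2]}
```

This only recovers syntactically valid JSON. Checking the result against an expected schema is still up to the caller.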

Performance

Output speed: ~80 tok/s

Rate limit: not listed
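
The listed speed gives a rough upper bound on how long a maximum-length reply takes to stream. A back-of-the-envelope sketch, assuming a steady ~80 tok/s and ignoring prompt processing and network latency; `generation_seconds` is a hypothetical helper:

```python
def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Approximate streaming time for a reply, excluding time to first token."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


# A full 4,096-token reply at roughly 80 tok/s.
print(f"{generation_seconds(4_096, 80):.1f} s")  # 51.2 s
```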

Multimodal

Modality checklist: Image input, Image output, Audio input, Audio output

Benchmarks

MMLU

84.5%

HumanEval

83.0%

SWE-bench Verified

44.2%

Llama 4 Scout

Mid-tier

Open-source, long context

Official pricing

When to use: Processing very long documents, and RAG over very large context windows.

Upgrade highlights

◆ 10M token context: the largest listed here, 10x Maverick's 1M window
◆ Open-source + fine-tunable: self-host without per-token usage fees
◆ $0.10/M input: the lowest input price listed
◆ 17B active params (109B total): same 17B active size as Maverick, in an efficient MoE design
◆ 4K max output: designed for retrieval and analysis, not long generation

Input price: $0.100 per 1M tokens

Output price: $0.300 per 1M tokens

Cached input: not listed

Batch input: not listed

Context window: 10M tokens

Max. output: 4,096 tokens

Knowledge cutoff: 2024-08
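
For long-document work it helps to know how much of the window is left for the prompt and what filling it would cost. A minimal sketch, assuming the reply shares the 10M-token window with the prompt (providers differ on this) and the $0.100 per 1M input rate listed above; the helper names are hypothetical:

```python
# Figures taken from this page for Llama 4 Scout.
SCOUT_CONTEXT_TOKENS = 10_000_000
SCOUT_MAX_OUTPUT_TOKENS = 4_096
SCOUT_INPUT_PER_M = 0.100  # USD per 1M input tokens


def max_prompt_tokens(context_tokens: int = SCOUT_CONTEXT_TOKENS,
                      reserved_output: int = SCOUT_MAX_OUTPUT_TOKENS) -> int:
    """Prompt budget left after reserving room for the reply (assumed shared window)."""
    if reserved_output > context_tokens:
        raise ValueError("reserved output exceeds the context window")
    return context_tokens - reserved_output


def prompt_fits(prompt_tokens: int) -> bool:
    """True if the prompt leaves room for a maximum-length reply."""
    return 0 <= prompt_tokens <= max_prompt_tokens()


budget = max_prompt_tokens()
print(budget)                                            # 9995904
print(f"${budget * SCOUT_INPUT_PER_M / 1_000_000:.2f}")  # $1.00
```

At the listed rate, a prompt that fills the whole window costs about one dollar per request, so very large contexts are cheap per token but not free per call.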

Feature checklist: Vision, Function calling, Fine-tuning, JSON mode, Free tier

Pros

10M token context — largest available
Cheapest per-token model
Open-source + fine-tunable

Cons

Only 4K max output
No JSON mode
Quality below proprietary flagships

Performance

Output speed: ~90 tok/s

Rate limit: not listed

Multimodal

Modality checklist: Image input, Image output, Audio input, Audio output

Benchmarks

MMLU

81.0%

HumanEval

78.5%

Side-by-side comparison

Model	Tier	Input	Output	Cached	Context	Max. output
Llama 4 Maverick	Flagship	$0.200	$0.600	—	1M	4,096
Llama 4 Scout	Mid-tier	$0.100	$0.300	—	10M	4,096
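
The comparison can be turned into a simple selection rule: use the higher tier when the request fits its window, otherwise fall back to the larger-context model. A sketch under stated assumptions: the figures are copied from the table above, the reply is assumed to share the context window with the prompt, and `ModelSpec`, `request_cost`, and `pick_model` are hypothetical names:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelSpec:
    name: str
    tier_rank: int          # higher = higher quality tier (Flagship > Mid-tier)
    input_per_m: float      # USD per 1M input tokens
    output_per_m: float     # USD per 1M output tokens
    context_tokens: int
    max_output_tokens: int


MODELS = [
    ModelSpec("Llama 4 Maverick", 2, 0.200, 0.600, 1_000_000, 4_096),
    ModelSpec("Llama 4 Scout", 1, 0.100, 0.300, 10_000_000, 4_096),
]


def request_cost(model: ModelSpec, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request at the listed rates."""
    return (input_tokens * model.input_per_m
            + output_tokens * model.output_per_m) / 1_000_000


def pick_model(input_tokens: int, output_tokens: int,
               prefer_quality: bool = True) -> Optional[ModelSpec]:
    """Pick a model whose limits fit the request, or None if none does."""
    fitting = [m for m in MODELS
               if output_tokens <= m.max_output_tokens
               and input_tokens + output_tokens <= m.context_tokens]
    if not fitting:
        return None
    if prefer_quality:
        return max(fitting, key=lambda m: m.tier_rank)
    return min(fitting, key=lambda m: request_cost(m, input_tokens, output_tokens))


print(pick_model(500_000, 2_000).name)    # Llama 4 Maverick
print(pick_model(2_000_000, 2_000).name)  # Llama 4 Scout
```

Passing `prefer_quality=False` ranks by estimated cost instead, which with these prices always selects Scout when both models fit.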