Meta Models

Explore both Meta models with detailed pricing, pros and cons, and developer recommendations.

Models: 2

Lowest Input: $0.100 per 1M tokens

Max Context: 10M tokens

Quality Tiers: Flagship, Mid-tier

Quick Recommendations

Best Value: Llama 4 Scout ($0.100 per 1M input tokens)

Best Quality: Llama 4 Maverick

Llama 4 Maverick

Flagship

Open-source, multimodal

When to use: For teams wanting open-source control or self-hosting with multimodal needs.

Upgrade Highlights

◆ Open-source: self-host with no per-token fees and full control of the model weights
◆ 1M context window: unusually large for an openly available model
◆ Multimodal (text + vision) and fine-tunable
◆ 17B active params (400B total): MoE architecture for efficiency
◆ 4K max output is limiting: best for input-heavy, short-output tasks

Input Price: $0.200 per 1M tokens

Output Price: $0.600 per 1M tokens

Cached Input: not listed

Batch Input: not listed
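
The per-token prices above translate into a per-request cost with simple arithmetic. The following is a minimal sketch, assuming the listed input and output prices apply uniformly to every token; `request_cost` and the constant names are illustrative and not part of any provider SDK.

```python
# Prices in USD per 1M tokens, copied from the listing above.
MAVERICK_INPUT_PER_M = 0.200
MAVERICK_OUTPUT_PER_M = 0.600


def request_cost(input_tokens: int, output_tokens: int,
                 input_per_m: float = MAVERICK_INPUT_PER_M,
                 output_per_m: float = MAVERICK_OUTPUT_PER_M) -> float:
    """Return the estimated USD cost of a single request."""
    total = input_tokens * input_per_m + output_tokens * output_per_m
    return total / 1_000_000


# Example: a 200K-token prompt with a 4K-token reply.
cost = request_cost(input_tokens=200_000, output_tokens=4_000)
print(f"${cost:.4f}")
```

Because output tokens cost three times as much as input tokens here, but output is capped at 4,096 tokens, the input side usually dominates the bill for long-context requests.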

Context Window: 1M

Max Output: 4,096 tokens

Knowledge Cutoff: 2024-08

Feature checklist: Vision, Function Calling, Fine-tuning, JSON Mode, Free Tier

Pros

Open-source — can self-host for free
1M context window
Multimodal + fine-tunable

Cons

Only 4K max output
No JSON mode
Hosted pricing comes from a third-party provider (Together AI)
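
Since the listing reports no JSON mode, structured output has to be validated on the client side. The sketch below is one possible approach, assuming the model was prompted to reply with a JSON object that may be surrounded by extra prose; `extract_json_object` is a hypothetical helper built only on Python's standard `json` module.

```python
import json


def extract_json_object(text: str):
    """Return the first valid JSON object embedded in text, or None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at the given index
            # and ignores whatever follows it.
            obj, _end = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            # Not valid JSON at this brace; try the next one.
            start = text.find("{", start + 1)
    return None


reply = 'Sure, here it is: {"title": "Q3 report", "pages": 12} Let me know.'
print(extract_json_object(reply))
```

A caller would typically retry the request or fall back to a default when the function returns None.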

Performance

Output Speed: ~80 tok/s

Rate Limit: not listed
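
The output speed gives a rough upper bound on how long a maximum-length reply takes to stream. This is a back-of-the-envelope sketch, assuming a constant ~80 tok/s and ignoring time to first token, queueing, and network latency; `generation_seconds` is an illustrative name.

```python
def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Estimate streaming time for a reply at a constant decode speed."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


# A full 4,096-token reply at roughly 80 tok/s.
seconds = generation_seconds(4_096, 80)
print(f"{seconds:.1f} s")
```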

Multimodal

Image Input, Image Output, Audio Input, Audio Output

Benchmarks

MMLU: 84.5%

HumanEval: 83.0%

SWE-bench Verified: 44.2%

Llama 4 Scout

Mid-tier

Open-source, long context

When to use: For processing very long documents, or for RAG pipelines that need a very large context window.

Upgrade Highlights

◆ 10M token context: 10x Maverick's 1M window
◆ Open-source + fine-tunable: self-host with no per-token fees
◆ $0.10/M input: lowest input price listed here
◆ 17B active params (109B total): efficient MoE with the same active size as Maverick
◆ 4K max output: designed for retrieval/analysis, not long generation
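
Whether a document fits in the window can be checked before sending it. The sketch below rests on two labeled assumptions: that the reply's tokens count against the same context window (this varies by provider), and that text averages about 4 characters per token (a rough heuristic, not the model's real tokenizer). All names are illustrative.

```python
import math

SCOUT_CONTEXT_TOKENS = 10_000_000  # context window from the listing
MAX_OUTPUT_TOKENS = 4_096          # max output from the listing


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate; the 4 chars/token ratio is an assumption."""
    return math.ceil(len(text) / chars_per_token)


def fits_in_context(prompt_tokens: int,
                    context_tokens: int = SCOUT_CONTEXT_TOKENS,
                    reserved_output: int = MAX_OUTPUT_TOKENS) -> bool:
    """True if the prompt plus a full-length reply stays inside the window."""
    return prompt_tokens + reserved_output <= context_tokens


document = "lorem ipsum " * 1_000_000      # 12M characters
tokens = estimate_tokens(document)          # about 3M tokens
print(tokens, fits_in_context(tokens))
print(fits_in_context(tokens, context_tokens=1_000_000))
```

For a real deployment, the heuristic estimate should be replaced by a count from the model's own tokenizer.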

Input Price: $0.100 per 1M tokens

Output Price: $0.300 per 1M tokens

Cached Input: not listed

Batch Input: not listed

Context Window: 10M

Max Output: 4,096 tokens

Knowledge Cutoff: 2024-08

Feature checklist: Vision, Function Calling, Fine-tuning, JSON Mode, Free Tier

Pros

10M token context, the largest listed here
Lowest per-token price listed here
Open-source + fine-tunable

Cons

Only 4K max output
No JSON mode
Quality below proprietary flagships

Performance

Output Speed: ~90 tok/s

Rate Limit: not listed

Multimodal

Image Input, Image Output, Audio Input, Audio Output

Benchmarks

MMLU: 81.0%

HumanEval: 78.5%

Side-by-Side Comparison

Model	Tier	Input ($/1M)	Output ($/1M)	Cached	Context (tokens)	Max Output (tokens)
Llama 4 Maverick	Flagship	$0.200	$0.600	n/a	1M	4,096
Llama 4 Scout	Mid-tier	$0.100	$0.300	n/a	10M	4,096
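
The comparison can also be expressed as data, which makes it easy to pick the cheapest model that can handle a given request. This is a minimal sketch using only the figures in the table; it selects on price and capacity alone and ignores quality, and the `Model` class and `cheapest_fit` function are illustrative names. It assumes the reply counts against the context window.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Model:
    name: str
    input_per_m: float   # USD per 1M input tokens
    output_per_m: float  # USD per 1M output tokens
    context: int         # context window in tokens
    max_output: int      # maximum output tokens

    def cost(self, input_tokens: int, output_tokens: int) -> float:
        total = input_tokens * self.input_per_m + output_tokens * self.output_per_m
        return total / 1_000_000


# Values copied from the side-by-side table.
MODELS = [
    Model("Llama 4 Maverick", 0.200, 0.600, 1_000_000, 4_096),
    Model("Llama 4 Scout", 0.100, 0.300, 10_000_000, 4_096),
]


def cheapest_fit(input_tokens: int, output_tokens: int) -> Optional[Model]:
    """Return the cheapest model whose limits cover the request, if any."""
    candidates = [
        m for m in MODELS
        if output_tokens <= m.max_output
        and input_tokens + output_tokens <= m.context
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda m: m.cost(input_tokens, output_tokens))


choice = cheapest_fit(input_tokens=2_000_000, output_tokens=1_000)
print(choice.name if choice else "no model fits")
```

On price alone Scout wins every request that fits both models, so a quality threshold such as a minimum benchmark score would be needed for Maverick to be chosen.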