Flash: fastest and cheapest, lighter reasoning · Open Weights

Llama 3.3 Nemotron Super 49B V1.5 Pricing

NVIDIA · nvidia-llama-3-3-nemotron-super-49b-v1-5

Input / 1M tokens

$0.400

Output / 1M tokens

$0.400

Context window

131,072 tokens

≈ 175 pages of text

Max output

16,384 tokens

What it costs in practice

Typical request (1,200 in + 400 out tokens): $0.0006

1,000 requests / month: $0.64/mo
10,000 requests / month: $6.40/mo
100,000 requests / month: $64.00/mo
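The figures above follow from the listed per-million-token rates. A minimal sketch of that arithmetic, assuming flat per-token billing with no caching discounts, minimum charges, or volume tiers; the function names `request_cost` and `monthly_cost` are hypothetical and are not part of the site's calculator:

```python
from decimal import Decimal

# Rates listed on this page, in USD per 1M tokens.
INPUT_PER_MILLION = Decimal("0.400")
OUTPUT_PER_MILLION = Decimal("0.400")
TOKENS_PER_MILLION = Decimal(1_000_000)


def request_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Unrounded USD cost of one request (assumes flat per-token billing)."""
    total = input_tokens * INPUT_PER_MILLION + output_tokens * OUTPUT_PER_MILLION
    return total / TOKENS_PER_MILLION


def monthly_cost(requests: int, input_tokens: int = 1200, output_tokens: int = 400) -> Decimal:
    """USD cost of `requests` identical requests, rounded to cents."""
    return (requests * request_cost(input_tokens, output_tokens)).quantize(Decimal("0.01"))


if __name__ == "__main__":
    print("per request:", request_cost(1200, 400))
    for volume in (1_000, 10_000, 100_000):
        print(f"{volume:>7} requests / month:", monthly_cost(volume))
```

The unrounded per-request cost is $0.00064, which the page displays as $0.0006; the monthly totals are computed from the unrounded value, which is why 1,000 requests come to $0.64 rather than $0.60.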

Estimate your own workload: opens the calculator with Llama 3.3 Nemotron Super 49B V1.5 preloaded; adjust volume and token counts there.

Price history

Only one price point recorded so far.

The staircase chart appears once a price change is detected.

Pricing and specs for this model are auto-synced from OpenRouter, but it isn't one of our editorial picks yet — no “what it excels at” write-up or business tradeoff notes. Any tier badge above is estimated from price.

Cheaper from NVIDIA

Nemotron 3 Super · NVIDIA · $0.0003 / typical request

Same tier from other providers

GPT-5.4 Nano · OpenAI · $0.0007 / typical request
Claude Haiku 4.5 · Anthropic · $0.0032 / typical request
Gemini 3.5 Flash · Google · $0.0054 / typical request

Head-to-head

Llama 3.3 Nemotron Super 49B V1.5 vs GPT-5.4 Nano →
Llama 3.3 Nemotron Super 49B V1.5 vs Claude Haiku 4.5 →
Llama 3.3 Nemotron Super 49B V1.5 vs Gemini 3.5 Flash →

Vendor list rates, as of Jun 15, 2026 · source: openrouter · per-request examples assume 1,200 input + 400 output tokens.