Pricing Guide
Mistral offers competitive pricing across its entire model lineup — from the budget Ministral series to the frontier Mistral Large. This page covers every model, what it costs, and how it compares.
Pricing verified against official Mistral documentation. Updated daily.
| Model | Input (per 1M tokens) | Output (per 1M tokens) | Intelligence Index | Tier |
|---|---|---|---|---|
| Magistral Medium 1.2 | $2.00 | $5.00 | 27.1 | Mid-tier |
| Mistral Large 3 | $0.50 | $1.50 | 22.8 | Budget |
| Mistral Medium 3.1 | $0.40 | $2.00 | 21.3 | Budget |
| Mistral Medium 3 | $0.40 | $2.00 | 18.8 | Budget |
| Devstral Medium | $0.40 | $2.00 | 18.7 | Budget |
| Magistral Small 1.2 | $0.50 | $1.50 | 18.2 | Budget |
| Devstral Small | $0.10 | $0.30 | 18.0 | Budget |
| Ministral 3 14B | $0.20 | $0.20 | 16.0 | Budget |
| Ministral 3 8B | $0.15 | $0.15 | 15.3 | Budget |
| Devstral Small | $0.10 | $0.30 | 15.2 | Budget |
| Mistral Large 2 | $2.00 | $6.00 | 15.1 | Budget |
| Mistral Small 3.2 | $0.10 | $0.30 | 15.1 | Budget |
| Mistral Small 3.1 | $0.10 | $0.30 | 14.5 | Budget |
| Pixtral Large | $2.00 | $6.00 | 14.0 | Budget |
| Mistral Large 2 | $2.00 | $6.00 | 13.0 | Budget |
| Ministral 3 3B | $0.10 | $0.10 | 12.9 | Budget |
| Mistral Small 3 | $0.10 | $0.30 | 12.7 | Budget |
| Mistral Small | $0.10 | $0.30 | 10.2 | Budget |
| Mistral Large | $4.00 | $12.00 | 9.9 | Budget |
| Mistral Small | $1.00 | $3.00 | 9.0 | Budget |
| Mistral Medium | $2.75 | $8.10 | 9.0 | Budget |
| Mixtral 8x7B Instruct | $0.54 | $0.54 | 7.7 | Budget |
| Mistral 7B Instruct | $0.20 | $0.20 | 7.4 | Budget |
Prices in USD. 23 Mistral models tracked.
Mistral organizes its models into clear tiers. The Small models (Mistral Small 3, Small 3.1) target high-throughput, low-cost workloads — classification, extraction, simple Q&A. At $0.10 per million input tokens, they are among the cheapest production-grade models from any provider.
The Medium tier fills the gap between budget and frontier, handling more complex instructions and nuanced generation without frontier prices.
The Large models (Mistral Large, Mistral Large 3) are the frontier offering. Mistral Large 3 at $0.50/$1.50 per million tokens undercuts GPT-4.1 ($2.00/$8.00) by 4x on input and over 5x on output — one of the most cost-effective frontier-class models available.
The Ministral series sits at the bottom of the cost curve. Ministral 3B and 8B handle simple classification, intent detection, and entity extraction at a fraction of the Small tier's cost.
The key decision is not which single model to use, but which combination to deploy. Route simple tasks to Small, complex reasoning to Large or Magistral, and code generation to Devstral. This tiered approach can reduce API costs by 50-80% compared to sending everything to a frontier model — but it requires per-feature cost tracking to verify the routing is saving money.
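The tiered routing described above can be sketched as a simple lookup. The model identifiers and task categories below are illustrative assumptions, not an official routing scheme; the per-million prices in the comments come from the table above.

```python
# Sketch of tiered routing: send each task type to the cheapest
# adequate model. Task categories and model names are illustrative.
ROUTES = {
    "classification": "ministral-3b",      # $0.10 / $0.10 per 1M tokens
    "chat":           "mistral-small",     # $0.10 / $0.30
    "code":           "devstral-small",    # $0.10 / $0.30
    "reasoning":      "magistral-medium",  # $2.00 / $5.00
}

def pick_model(task_type: str) -> str:
    # Fall back to the frontier model for anything unclassified.
    return ROUTES.get(task_type, "mistral-large-3")
```

In practice the hard part is the classifier that produces `task_type` and the per-feature cost tracking that verifies the routing actually saves money.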
Mistral consistently undercuts OpenAI across comparable model tiers. The savings are most dramatic at the frontier tier. At the budget tier, Mistral Small and GPT-4o mini are closer in price, but Mistral still comes in cheaper.
| Model | Input (per 1M) | Output (per 1M) | Intelligence Index |
|---|---|---|---|
| Mistral Large 3 | $0.50 | $1.50 | 22.8 |
| GPT-4.1 | $2.00 | $8.00 | 26.3 |
| Mistral Small 3.2 | $0.10 | $0.30 | 15.1 |
| GPT-4o mini | $0.15 | $0.60 | 12.6 |
The savings compound at scale. A team processing 10 million tokens per day on GPT-4.1 would spend roughly $20 on input alone. The same volume on Mistral Large 3 would cost around $5. Over a month, that difference adds up to hundreds of dollars. Whether the quality difference matters depends on your use case.
Magistral is Mistral's reasoning model line, comparable to OpenAI's o-series. These models use chain-of-thought reasoning to work through complex problems before producing a final answer. Magistral Medium costs $2.00/$5.00 per million tokens; Magistral Small 1.2 costs $0.50/$1.50. Because reasoning generates more output tokens per request, actual per-request costs run higher than a simple token-count estimate suggests.
Devstral is Mistral's coding-focused model family, optimized for code generation, debugging, and technical documentation. Pricing sits between the Small and Large tiers — a cost-effective choice for heavy code-generation workloads.
Pixtral is Mistral's multimodal family, capable of processing both text and images. It enables document understanding, chart reading, and image-based classification within the Mistral ecosystem.
The specialized model strategy is where Mistral's lineup creates real value. Route coding tasks to Devstral, reasoning to Magistral, vision to Pixtral, and everything else to Small or Large. The challenge is knowing which routing decisions actually save money — which requires tracking cost per feature, not just aggregate spend.
Mistral is unique among major providers in offering many models as open weights under permissive licenses. Mistral Small, Ministral 3B, Ministral 8B, and Devstral are available for self-hosting, giving teams a choice OpenAI and Anthropic do not offer: pay per token through the API, or run the model on your own GPU instances.
The economics depend on volume. A GPU instance costs a fixed hourly rate regardless of request count. At low volume, per-token cost is higher than the API because you pay for idle time. At high volume, the math inverts — if you keep the GPU saturated, effective per-token cost drops well below API pricing. The breakeven typically falls in the range of tens of millions of tokens per day.
Self-hosting introduces operational overhead: model serving infrastructure (vLLM, TGI), GPU monitoring, scaling, and model updates. For teams without dedicated ML infrastructure, the API is almost always the better choice.
The larger models — Mistral Large, Magistral Medium — are API-only. This creates a natural production split: self-host smaller open-weight models for high-volume, cost-sensitive tasks, and use the API for heavier reasoning that requires Large or Magistral.
Mistral uses per-token billing. Every API call is metered by input tokens (your prompt, system message, and context) and output tokens (the model's response), charged at different rates per model. There are no per-request fees, no minimum monthly commitments on the standard tier, and no charges for failed requests. You pay only for tokens successfully processed.
To get started, create an API key at console.mistral.ai. API keys can be scoped to specific projects to separate billing across teams. Mistral bills monthly based on accumulated usage. For teams with predictable high volume, Mistral offers committed-use pricing through direct sales.
Mistral's API is OpenAI-compatible. The chat completions endpoint follows the same request and response format, so most OpenAI client libraries and frameworks (LangChain, LiteLLM, OpenAI Python SDK) work with Mistral by changing the base URL and API key. Switching providers or running both in parallel is a configuration change, not an engineering project.
For simple classification, entity extraction, and intent detection, start with Ministral 3B. It handles structured tasks with predictable inputs at the lowest cost in the lineup. If accuracy is insufficient, step up to Mistral Small — but test first, because the 3B model handles more than most teams expect.
For conversational AI, summarization, and content generation, Mistral Small is the default production choice. It balances quality and cost for most customer-facing features. Most teams running chat at scale find Small handles 80% or more of requests adequately.
For complex reasoning and multi-step analysis, move to Mistral Large or Magistral. Large is the general-purpose frontier model. Magistral adds chain-of-thought reasoning, improving accuracy on math, logic, and multi-step problems but generating substantially more output tokens per request.
For code generation and code review, Devstral is purpose-built. It outperforms general-purpose models on coding benchmarks while costing less than the Large tier. The cost-per-quality ratio for coding tasks is often significantly better with a specialized model.
Knowing the price per token is the first step. Knowing how much each customer costs you — and whether they are profitable — is the step most teams skip. MarginDash connects Mistral usage to Stripe revenue and shows you margin per customer.
Create an account, install the SDK, and see your first margin data in minutes.