Pricing Guide

Mistral API Pricing: Every Model, Every Tier

Mistral offers competitive pricing across its entire model lineup — from the budget Ministral series to the frontier Mistral Large. This page covers every model, what it costs, and how it compares.

Pricing verified against official Mistral documentation. Updated daily.

All Mistral Models and Pricing

| Model | Input (per 1M tokens) | Output (per 1M tokens) | Intelligence Index | Tier |
|---|---|---|---|---|
| Magistral Medium 1.2 | $2.00 | $5.00 | 27.1 | Mid-tier |
| Mistral Large 3 | $0.50 | $1.50 | 22.8 | Budget |
| Mistral Medium 3.1 | $0.40 | $2.00 | 21.3 | Budget |
| Mistral Medium 3 | $0.40 | $2.00 | 18.8 | Budget |
| Devstral Medium | $0.40 | $2.00 | 18.7 | Budget |
| Magistral Small 1.2 | $0.50 | $1.50 | 18.2 | Budget |
| Devstral Small | $0.10 | $0.30 | 18.0 | Budget |
| Ministral 3 14B | $0.20 | $0.20 | 16.0 | Budget |
| Ministral 3 8B | $0.15 | $0.15 | 15.3 | Budget |
| Devstral Small | $0.10 | $0.30 | 15.2 | Budget |
| Mistral Large 2 | $2.00 | $6.00 | 15.1 | Budget |
| Mistral Small 3.2 | $0.10 | $0.30 | 15.1 | Budget |
| Mistral Small 3.1 | $0.10 | $0.30 | 14.5 | Budget |
| Pixtral Large | $2.00 | $6.00 | 14.0 | Budget |
| Mistral Large 2 | $2.00 | $6.00 | 13.0 | Budget |
| Ministral 3 3B | $0.10 | $0.10 | 12.9 | Budget |
| Mistral Small 3 | $0.10 | $0.30 | 12.7 | Budget |
| Mistral Small | $0.10 | $0.30 | 10.2 | Budget |
| Mistral Large | $4.00 | $12.00 | 9.9 | Budget |
| Mistral Small | $1.00 | $3.00 | 9.0 | Budget |
| Mistral Medium | $2.75 | $8.10 | 9.0 | Budget |
| Mixtral 8x7B Instruct | $0.54 | $0.54 | 7.7 | Budget |
| Mistral 7B Instruct | $0.20 | $0.20 | 7.4 | Budget |

Prices in USD. Updated daily. 29 Mistral models tracked.

Mistral Cost Calculator

Estimate your cost for a single API call or a batch of requests.

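The estimate reduces to simple arithmetic over the per-token rates. A minimal sketch in Python, using rates from the table above (the dictionary keys are illustrative labels, not official Mistral API model ids):

```python
# Estimate Mistral API cost for a single call or a batch of requests.
# Rates are USD per 1M tokens, taken from the pricing table above.
PRICES = {
    "mistral-large-3": (0.50, 1.50),
    "mistral-small-3.2": (0.10, 0.30),
    "ministral-3b": (0.10, 0.10),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int,
                  requests: int = 1) -> float:
    """Return total USD cost for `requests` calls of the given token sizes."""
    in_rate, out_rate = PRICES[model]
    per_call = (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000
    return per_call * requests

# e.g. 1,000 calls with 2K input / 500 output tokens on Mistral Small 3.2:
print(f"${estimate_cost('mistral-small-3.2', 2_000, 500, 1_000):.2f}")  # → $0.35
```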

Mistral's Model Lineup Explained: Small vs Medium vs Large

Mistral organizes its models into clear tiers. The Small models (Mistral Small 3, Small 3.1) target high-throughput, low-cost workloads — classification, extraction, simple Q&A. At $0.10 per million input tokens, they are among the cheapest production-grade models from any provider.

The Medium tier fills the gap between budget and frontier, handling more complex instructions and nuanced generation without frontier prices.

The Large models (Mistral Large, Mistral Large 3) are the frontier offering. Mistral Large 3 at $0.50/$1.50 per million tokens undercuts GPT-4.1 ($2.00/$8.00) by 4x on input and over 5x on output — one of the most cost-effective frontier-class models available.

The Ministral series sits at the bottom of the cost curve. Ministral 3B and 8B handle simple classification, intent detection, and entity extraction at a fraction of the Small tier's cost.

The key decision is not which single model to use, but which combination to deploy. Route simple tasks to Small, complex reasoning to Large or Magistral, and code generation to Devstral. This tiered approach can reduce API costs by 50-80% compared to sending everything to a frontier model — but it requires per-feature cost tracking to verify the routing is saving money.
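The routing idea above can be sketched in a few lines. The task categories and model labels here are illustrative assumptions, not official Mistral API ids:

```python
# Naive tiered router: send each task type to the cheapest adequate model,
# falling back to the frontier model for anything unrecognized.
ROUTES = {
    "classification": "ministral-3b",
    "chat": "mistral-small-3.2",
    "code": "devstral-small",
    "reasoning": "magistral-medium",
}
FALLBACK = "mistral-large-3"

def pick_model(task: str) -> str:
    return ROUTES.get(task, FALLBACK)

assert pick_model("code") == "devstral-small"
assert pick_model("unknown-task") == "mistral-large-3"
```

In production the routing table would be driven by measured quality and per-feature cost data, not a hardcoded mapping.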

Mistral vs OpenAI Pricing: Where Mistral Wins on Cost

Mistral consistently undercuts OpenAI across comparable model tiers. The savings are most dramatic at the frontier tier. At the budget tier, Mistral Small and GPT-4o mini are closer in price, but Mistral still comes in cheaper.

| Model | Input (per 1M) | Output (per 1M) | Intelligence Index |
|---|---|---|---|
| Mistral Large 3 | $0.50 | $1.50 | 22.8 |
| GPT-4.1 | $2.00 | $8.00 | 26.3 |
| Mistral Small 3.2 | $0.10 | $0.30 | 15.1 |
| GPT-4o mini | $0.15 | $0.60 | 12.6 |

The savings compound at scale. A team processing 10 million tokens per day on GPT-4.1 would spend roughly $20 on input alone. The same volume on Mistral Large 3 would cost around $5. Over a month, that difference adds up to hundreds of dollars. Whether the quality difference matters depends on your use case.

Magistral and Devstral: Mistral's Specialized Models

Magistral is Mistral's reasoning model line, comparable to OpenAI's o-series. These models use chain-of-thought reasoning to work through complex problems before producing a final answer. Magistral Medium costs $2.00/$5.00 per million tokens; Magistral Small 1.2 costs $0.50/$1.50. Because reasoning generates more output tokens per request, actual per-request costs run higher than a simple token-count estimate suggests.
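A rough way to budget for this is to apply a multiplier to the visible output tokens. The 4x multiplier below is an assumption for illustration only; actual chain-of-thought overhead varies by task and is not a published Mistral figure:

```python
# Reasoning models bill their chain-of-thought as output tokens, so a request
# that produces 500 visible tokens may be billed for several times that.
def reasoning_request_cost(input_tokens: int, visible_output_tokens: int,
                           in_rate: float, out_rate: float,
                           thinking_multiplier: float = 4.0) -> float:
    """Estimated USD cost of one reasoning request (rates per 1M tokens)."""
    billed_output = visible_output_tokens * thinking_multiplier
    return (input_tokens * in_rate + billed_output * out_rate) / 1_000_000

# Magistral Medium ($2.00 / $5.00) with 1K input, 500 visible output tokens:
cost = reasoning_request_cost(1_000, 500, 2.00, 5.00)
print(f"${cost:.4f}")  # → $0.0120
```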

Devstral is Mistral's coding-focused model family, optimized for code generation, debugging, and technical documentation. Pricing sits between the Small and Large tiers — a cost-effective choice for heavy code-generation workloads.

Pixtral is Mistral's multimodal family, capable of processing both text and images. It enables document understanding, chart reading, and image-based classification within the Mistral ecosystem.

The specialized model strategy is where Mistral's lineup creates real value. Route coding tasks to Devstral, reasoning to Magistral, vision to Pixtral, and everything else to Small or Large. The challenge is knowing which routing decisions actually save money — which requires tracking cost per feature, not just aggregate spend.

Open-Weight Models vs API: When to Self-Host

Mistral is unique among major providers in offering many models as open weights under permissive licenses. Mistral Small, Ministral 3B, Ministral 8B, and Devstral are available for self-hosting, giving teams a choice OpenAI and Anthropic do not offer: pay per token through the API, or run the model on your own GPU instances.

The economics depend on volume. A GPU instance costs a fixed hourly rate regardless of request count. At low volume, per-token cost is higher than the API because you pay for idle time. At high volume, the math inverts — if you keep the GPU saturated, effective per-token cost drops well below API pricing. The breakeven typically falls in the range of tens of millions of tokens per day.
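The breakeven point follows directly from the fixed-vs-variable cost structure. A sketch under assumed numbers (the GPU hourly rate and blended API rate below are illustrative, not quotes):

```python
# Breakeven volume for self-hosting vs the API, assuming a saturated GPU.
GPU_HOURLY_USD = 0.60     # assumption: e.g. a spot-priced L4-class instance
BLENDED_API_RATE = 0.30   # assumption: USD per 1M tokens, blended input/output

def breakeven_tokens_per_day(gpu_hourly: float = GPU_HOURLY_USD,
                             api_rate_per_m: float = BLENDED_API_RATE) -> float:
    """Tokens/day at which daily GPU cost equals daily API spend."""
    daily_gpu_cost = gpu_hourly * 24
    return daily_gpu_cost / api_rate_per_m * 1_000_000

print(f"{breakeven_tokens_per_day():,.0f} tokens/day")  # → 48,000,000 tokens/day
```

Under these assumptions the breakeven lands around 48M tokens per day, in the "tens of millions" range; a pricier GPU or cheaper API rate pushes it higher, and the GPU must actually sustain that throughput for the math to hold.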

Self-hosting introduces operational overhead: model serving infrastructure (vLLM, TGI), GPU monitoring, scaling, and model updates. For teams without dedicated ML infrastructure, the API is almost always the better choice.

The larger models — Mistral Large, Magistral Medium — are API-only. This creates a natural production split: self-host smaller open-weight models for high-volume, cost-sensitive tasks, and use the API for heavier reasoning that requires Large or Magistral.

How Mistral API Billing Works

Mistral uses per-token billing. Every API call is metered by input tokens (your prompt, system message, and context) and output tokens (the model's response), charged at different rates per model. There are no per-request fees, no minimum monthly commitments on the standard tier, and no charges for failed requests. You pay only for tokens successfully processed.

To get started, create an API key at console.mistral.ai. API keys can be scoped to specific projects to separate billing across teams. Mistral bills monthly based on accumulated usage. For teams with predictable high volume, Mistral offers committed-use pricing through direct sales.

Mistral's API is OpenAI-compatible. The chat completions endpoint follows the same request and response format, so most OpenAI client libraries and frameworks (LangChain, LiteLLM, OpenAI Python SDK) work with Mistral by changing the base URL and API key. Switching providers or running both in parallel is a configuration change, not an engineering project.
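Because the request schema matches, the same payload works against either provider's chat completions endpoint. A minimal sketch (the model alias `mistral-small-latest` is assumed; check Mistral's model list for current ids):

```python
# Build an OpenAI-style chat completions payload; only the base URL and
# API key differ between providers.
import json

MISTRAL_BASE_URL = "https://api.mistral.ai/v1"

def chat_request(model: str, user_message: str) -> dict:
    """OpenAI-compatible request body for /chat/completions."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }

payload = chat_request("mistral-small-latest", "Summarize this ticket: ...")
body = json.dumps(payload)
# POST `body` to f"{MISTRAL_BASE_URL}/chat/completions" with the header
# "Authorization: Bearer $MISTRAL_API_KEY" -- or pass base_url and api_key
# to an OpenAI client library and call it as usual.
```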

Choosing the Right Mistral Model for Production

For simple classification, entity extraction, and intent detection, start with Ministral 3B. It handles structured tasks with predictable inputs at the lowest cost in the lineup. If accuracy is insufficient, step up to Mistral Small — but test first, because the 3B model handles more than most teams expect.

For conversational AI, summarization, and content generation, Mistral Small is the default production choice. It balances quality and cost for most customer-facing features. Most teams running chat at scale find Small handles 80% or more of requests adequately.

For complex reasoning and multi-step analysis, move to Mistral Large or Magistral. Large is the general-purpose frontier model. Magistral adds chain-of-thought reasoning, improving accuracy on math, logic, and multi-step problems but generating substantially more output tokens per request.

For code generation and code review, Devstral is purpose-built. It outperforms general-purpose models on coding benchmarks while costing less than the Large tier. The cost-per-quality ratio for coding tasks is often significantly better with a specialized model.

Frequently Asked Questions

How much does the Mistral API cost?
Mistral's pricing ranges from $0.10 per million input tokens (Ministral 3B, Mistral Small) to $4.00 per million input tokens (Mistral Large). Output pricing ranges from $0.10 (Ministral 3B) to $12.00 (Mistral Large). Mistral offers some of the most affordable models in the mid-tier category.
What is the cheapest Mistral model?
Ministral 3B is the cheapest overall at $0.10 per million tokens for both input and output. Among the general-purpose models, Mistral Small 3, 3.1, and 3.2 cost $0.10 per million input tokens and $0.30 per million output tokens.
How does Mistral pricing compare to OpenAI?
Mistral is generally cheaper than OpenAI for comparable model tiers. Mistral Large 3 costs $0.50/$1.50 per million tokens vs GPT-4.1 at $2.00/$8.00. Mistral Small costs $0.10/$0.30 vs GPT-4o mini at $0.15/$0.60. The savings increase significantly at the larger model tier.
What is Magistral and how is it priced?
Magistral is Mistral's reasoning model line, similar to OpenAI's o-series. Magistral Medium costs $2.00/$5.00 per million tokens and Magistral Small 1.2 costs $0.50/$1.50. These models use chain-of-thought reasoning for complex tasks.
Is the Mistral API compatible with OpenAI client libraries?
Yes. Mistral's chat completions endpoint follows the same request and response format as OpenAI's API. Most OpenAI client libraries and frameworks (LangChain, LiteLLM, OpenAI Python SDK) can connect to Mistral by changing the base URL and API key.

Track your Mistral API costs per customer

Knowing the price per token is the first step. Knowing how much each customer costs you — and whether they are profitable — is the step most teams skip. MarginDash connects Mistral usage to Stripe revenue and shows you margin per customer.

See My Margin Data

No credit card required

Stop guessing. Start measuring.

Create an account, install the SDK, and see your first margin data in minutes.
