Methodology

How we source, calculate, and update the data on MarginDash.

Pricing Data

Synced daily. Covers 400+ models across OpenAI, Anthropic, Google, AWS Bedrock, Azure, and Groq.

Benchmarks

Three standardized evaluations: MMLU-Pro (general knowledge and reasoning), GPQA (graduate-level science), and AIME (mathematical problem solving). Scores sourced from Artificial Analysis, which runs independent evaluations of each model.

Intelligence Index

Composite score combining MMLU-Pro, GPQA, and AIME. Higher means better overall capability across reasoning domains.

Cost Per Intelligence Point

Typical request cost (5,000 input tokens + 1,000 output tokens) divided by Intelligence Index score. Lower means more capability per dollar. Actual costs depend on your workload's input/output ratio.
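
As a rough illustration of the arithmetic above, here is a minimal sketch. It assumes prices are quoted in USD per million tokens. The prices and the Intelligence Index score are made-up placeholders, not values from MarginDash, and the function names are hypothetical.

```python
# Sketch of the cost-per-intelligence-point arithmetic.
# Assumption: prices are quoted in USD per 1,000,000 tokens.
# All numeric inputs below are placeholders, not real model data.

INPUT_TOKENS = 5_000   # typical request size stated in the methodology
OUTPUT_TOKENS = 1_000


def request_cost(input_price_per_m: float, output_price_per_m: float,
                 input_tokens: int = INPUT_TOKENS,
                 output_tokens: int = OUTPUT_TOKENS) -> float:
    """Return the USD cost of one request at the given per-million-token prices."""
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


def cost_per_intelligence_point(cost: float, intelligence_index: float) -> float:
    """Divide the request cost by the Intelligence Index score."""
    if intelligence_index <= 0:
        raise ValueError("intelligence_index must be positive")
    return cost / intelligence_index


if __name__ == "__main__":
    # Placeholder model: $3.00 per 1M input tokens, $15.00 per 1M output tokens, score 60.
    cost = request_cost(3.00, 15.00)
    print(f"request cost: ${cost:.4f}")
    print(f"cost per point: ${cost_per_intelligence_point(cost, 60.0):.6f}")
```

With these placeholder numbers the request costs about $0.03, or roughly $0.0005 per intelligence point.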

Cost Comparisons

All per-request costs assume 5,000 input and 1,000 output tokens (a 5:1 ratio). Actual ratios vary: chat typically runs 2:1, code review 3:1, and document summarization 10:1 to 50:1.
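
To see how much the input/output ratio can move a per-request cost, the sketch below recomputes the cost at several of the ratios mentioned above. It assumes output stays fixed at 1,000 tokens while input scales with the ratio, and that prices are quoted in USD per million tokens. The prices are placeholders and `cost_at_ratio` is a hypothetical helper, not part of MarginDash.

```python
# Sketch: per-request cost at different input:output token ratios.
# Assumptions: output is held at 1,000 tokens, input = ratio * output,
# and prices are USD per 1,000,000 tokens. Prices are placeholders.

def cost_at_ratio(ratio: float, input_price_per_m: float, output_price_per_m: float,
                  output_tokens: int = 1_000) -> float:
    """Return the USD cost of one request whose input is `ratio` times its output."""
    if ratio < 0:
        raise ValueError("ratio must be non-negative")
    input_tokens = ratio * output_tokens
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


if __name__ == "__main__":
    # Ratios named in the text: chat, code review, the 5:1 default, summarization.
    for label, ratio in [("chat", 2), ("code review", 3), ("default", 5),
                         ("summarization (low)", 10), ("summarization (high)", 50)]:
        cost = cost_at_ratio(ratio, input_price_per_m=3.00, output_price_per_m=15.00)
        print(f"{label:22s} {ratio:>2}:1  ${cost:.4f}")
```

Under these placeholder prices, a 50:1 summarization request costs several times as much as a 2:1 chat request, which is why a comparison computed at 5:1 should be read as a baseline and not as a forecast for a specific workload.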

Update Frequency

Pricing and benchmark data refreshed daily via automated sync.

What the SDK Collects

Model name, token counts, customer ID, and optional revenue amount. The SDK never sends prompts or responses.
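
As an illustration only, the fields listed above could be modeled as below. The class, the field names, and the split of token counts into input and output are assumptions made for this sketch. They are not the SDK's actual schema or API.

```python
# Hypothetical shape of a usage event carrying only the fields listed above.
# Field names and the input/output token split are assumptions for illustration.
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass(frozen=True)
class UsageEvent:
    model: str                              # model name
    input_tokens: int                       # token counts (assumed split)
    output_tokens: int
    customer_id: str
    revenue_amount: Optional[float] = None  # optional revenue amount


def to_payload(event: UsageEvent) -> dict:
    """Convert the event to a plain dict, omitting revenue when it is not set."""
    payload = asdict(event)
    if payload["revenue_amount"] is None:
        del payload["revenue_amount"]
    return payload


if __name__ == "__main__":
    event = UsageEvent(model="example-model", input_tokens=5_000,
                       output_tokens=1_000, customer_id="cust_123")
    print(to_payload(event))
```

The sketch has no field for prompt or response text, mirroring the statement that the SDK never sends either.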