📊 Per-Token (Consumption) — Usage Parameters
Per-Token (consumption) & Provisioned Throughput (PTU) — side by side
| Model | Input Price (per 1M) | Output Price (per 1M) | Monthly Cost | Annual Cost | Cost per Developer | Cost per 1K Interactions | Best For |
|---|---|---|---|---|---|---|---|
Real Azure PTU pricing — billed per PTU per hour (model-independent), not per token.
Source: Azure Retail Prices API · serviceName eq 'Foundry Models'.
| Type | Data routing | Price ($/PTU/hr) | Min PTU (× increment, OpenAI models) | Best for |
|---|---|---|---|---|
| Global | Routed across Azure regions globally | $1.00 | 15 (×5) | Highest availability; no region constraint |
| Data Zone | Stays within a geographic zone (US or EU) | $1.10 | 15 (×5) | Zone-level data residency + higher availability than regional |
| Regional | Stays in ONE specific Azure region | $2.00 | 50 (×50) | Strict single-region data residency / compliance |
Key differences: Global is cheapest ($1/PTU/hr) and routes traffic anywhere for max availability. Data Zone keeps data within a geography (US or EU) at a slight premium. Regional guarantees a single Azure region — strictest compliance, highest price ($2/PTU/hr), larger min deployment (50 PTU).
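To make the hourly rates easier to compare, here is a minimal Python sketch that converts them into a monthly figure. The rates and minimum sizes come from the table above. The 730-hour month and the names `PTU_TYPES` and `monthly_ptu_cost` are assumptions for illustration, not part of the calculator.

```python
# Hourly rates and deployment sizes copied from the table above.
PTU_TYPES = {
    "Global": {"rate": 1.00, "min_ptu": 15, "increment": 5},
    "Data Zone": {"rate": 1.10, "min_ptu": 15, "increment": 5},
    "Regional": {"rate": 2.00, "min_ptu": 50, "increment": 50},
}

HOURS_PER_MONTH = 730  # assumption: average month (8,760 hours / 12)


def monthly_ptu_cost(deployment_type: str, ptus: int) -> float:
    """Return the pay-as-you-go monthly cost in USD for a PTU deployment."""
    spec = PTU_TYPES[deployment_type]
    if ptus < spec["min_ptu"]:
        raise ValueError(
            f"{deployment_type} requires at least {spec['min_ptu']} PTUs"
        )
    return round(ptus * spec["rate"] * HOURS_PER_MONTH, 2)


# Cost of the smallest allowed deployment for each type.
# Global at its 15-PTU minimum: 15 x $1.00 x 730 = $10,950 per month.
for name, spec in PTU_TYPES.items():
    cost = monthly_ptu_cost(name, spec["min_ptu"])
    print(f"{name}: {spec['min_ptu']} PTU -> ${cost:,.2f}/month")
```

Because billing is per PTU per hour, this cost is the same whether the deployment is idle or saturated. Per-token pricing, by contrast, scales with usage.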
📚 References: What is PTU? · PTU sizing & per-model values · PTU billing & reservations · All deployment types
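Uses the official sizing formula:
PTUs = ((Input TPM × (1 − cache)) + (ratio × Output TPM)) ÷ Input TPM per PTU,
rounded up to the deployment increment.
Here TPM is tokens per minute, cache is the fraction of input tokens served from the prompt cache, ratio is the model's output-to-input token weighting, and Input TPM per PTU is the model's per-PTU input throughput.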
Values sourced from Microsoft sizing docs.
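The sizing formula above can be sketched in Python as follows. The function name `required_ptus` and the example workload are hypothetical. The ratio of 4 and the 2,500 input TPM per PTU are placeholders, not values from the Microsoft docs. The sketch also assumes valid deployment sizes are multiples of the increment, with the minimum as a floor, which matches the minimums and increments listed in the pricing table.

```python
import math


def required_ptus(
    input_tpm: float,
    output_tpm: float,
    cache_fraction: float,
    output_ratio: float,
    input_tpm_per_ptu: float,
    minimum: int,
    increment: int,
) -> int:
    """Estimate the PTUs to deploy for a given tokens-per-minute workload."""
    # Cached input tokens are discounted; output tokens are weighted by the ratio.
    weighted_tpm = input_tpm * (1 - cache_fraction) + output_ratio * output_tpm
    raw_ptus = weighted_tpm / input_tpm_per_ptu
    # Round up to the next multiple of the increment, never below the minimum.
    rounded = math.ceil(raw_ptus / increment) * increment
    return max(minimum, rounded)


# Hypothetical workload: 100k input TPM (25% cached), 20k output TPM.
# Weighted TPM = 75,000 + 4 x 20,000 = 155,000, so 155,000 / 2,500 = 62 raw PTUs.
global_ptus = required_ptus(100_000, 20_000, 0.25, 4, 2_500, minimum=15, increment=5)
regional_ptus = required_ptus(100_000, 20_000, 0.25, 4, 2_500, minimum=50, increment=50)
print(global_ptus)    # 65  (62 rounded up to a multiple of 5)
print(regional_ptus)  # 100 (62 rounded up to a multiple of 50)
```

The same workload needs 65 PTUs on a Global deployment but 100 on a Regional one, because the coarser increment forces a larger round-up.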
Total Monthly Tokens:
Input/Output Tokens:
Monthly Cost per Model:
Cost per Developer:
Example for 10 Developers:
Primary Source:
API Query Example:
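A minimal sketch of such a query, assuming the public Azure Retail Prices endpoint (`https://prices.azure.com/api/retail/prices`) and the `serviceName eq 'Foundry Models'` filter this page names as its source. The helper name `build_price_query` is hypothetical. The snippet only builds the request URL and does not call the API.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

RETAIL_PRICES_ENDPOINT = "https://prices.azure.com/api/retail/prices"


def build_price_query(service_name: str = "Foundry Models") -> str:
    """Build a Retail Prices API URL filtered to one service name."""
    odata_filter = f"serviceName eq '{service_name}'"
    # quote_via=quote percent-encodes spaces as %20 instead of '+'.
    query = urlencode({"$filter": odata_filter}, quote_via=quote)
    return f"{RETAIL_PRICES_ENDPOINT}?{query}"


url = build_price_query()
print(url)

# Round-trip check: decoding the query string recovers the original filter.
print(parse_qs(urlsplit(url).query)["$filter"][0])
```

The resulting URL can be fetched with any HTTP client. No authentication header is included here, on the assumption that the retail price list is publicly readable.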
Secondary Sources:
Data Accuracy:
✅ Real-time pricing from Azure API
✅ No estimates - actual retail prices
✅ Enterprise discounts accounted for
✅ Regional variations included