Model pricing
The following table shows pricing for all Claude models across different usage tiers:
Model | Base Input Tokens | 5m Cache Writes | 1h Cache Writes | Cache Hits & Refreshes | Output Tokens |
---|---|---|---|---|---|
Claude Opus 4.1 | $15 / MTok | $18.75 / MTok | $30 / MTok | $1.50 / MTok | $75 / MTok |
Claude Opus 4 | $15 / MTok | $18.75 / MTok | $30 / MTok | $1.50 / MTok | $75 / MTok |
Claude Sonnet 4 | $3 / MTok | $3.75 / MTok | $6 / MTok | $0.30 / MTok | $15 / MTok |
Claude Sonnet 3.7 | $3 / MTok | $3.75 / MTok | $6 / MTok | $0.30 / MTok | $15 / MTok |
Claude Sonnet 3.5 (deprecated) | $3 / MTok | $3.75 / MTok | $6 / MTok | $0.30 / MTok | $15 / MTok |
Claude Haiku 3.5 | $0.80 / MTok | $1 / MTok | $1.60 / MTok | $0.08 / MTok | $4 / MTok |
Claude Opus 3 (deprecated) | $15 / MTok | $18.75 / MTok | $30 / MTok | $1.50 / MTok | $75 / MTok |
Claude Haiku 3 | $0.25 / MTok | $0.30 / MTok | $0.50 / MTok | $0.03 / MTok | $1.25 / MTok |
MTok = Million tokens. The “Base Input Tokens” column shows standard input pricing, “Cache Writes” and “Cache Hits” are specific to prompt caching, and “Output Tokens” shows output pricing. Prompt caching offers both 5-minute (default) and 1-hour cache durations to optimize costs for different use cases.
The table above reflects the following pricing multipliers for prompt caching:
- 5-minute cache write tokens are 1.25 times the base input tokens price
- 1-hour cache write tokens are 2 times the base input tokens price
- Cache read tokens are 0.1 times the base input tokens price
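As a minimal sketch of how these multipliers produce the cache columns in the table, the snippet below derives the three cache prices from a base input price. The names `CACHE_MULTIPLIERS` and `cache_prices` are hypothetical helpers for illustration, not part of any SDK. Some published prices (for example, Claude Haiku 3) appear to be rounded, so the derived values may differ slightly from the table for those models.

```python
from decimal import Decimal

# Multipliers relative to the base input token price, as listed above.
CACHE_MULTIPLIERS = {
    "cache_write_5m": Decimal("1.25"),
    "cache_write_1h": Decimal("2"),
    "cache_read": Decimal("0.1"),
}


def cache_prices(base_input_price: str) -> dict:
    """Return per-MTok prompt caching prices (USD) for a base input price (USD per MTok)."""
    base = Decimal(base_input_price)
    return {name: base * mult for name, mult in CACHE_MULTIPLIERS.items()}


# Claude Sonnet 4 has a base input price of $3 / MTok.
for name, price in cache_prices("3").items():
    print(f"{name}: ${price} / MTok")
```

Decimal arithmetic is used here only to avoid floating-point noise when comparing against published prices.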
Feature-specific pricing
Batch processing
The Batch API allows asynchronous processing of large volumes of requests with a 50% discount on both input and output tokens.
Model | Batch input | Batch output |
---|---|---|
Claude Opus 4.1 | $7.50 / MTok | $37.50 / MTok |
Claude Opus 4 | $7.50 / MTok | $37.50 / MTok |
Claude Sonnet 4 | $1.50 / MTok | $7.50 / MTok |
Claude Sonnet 3.7 | $1.50 / MTok | $7.50 / MTok |
Claude Sonnet 3.5 (deprecated) | $1.50 / MTok | $7.50 / MTok |
Claude Haiku 3.5 | $0.40 / MTok | $2 / MTok |
Claude Opus 3 (deprecated) | $7.50 / MTok | $37.50 / MTok |
Claude Haiku 3 | $0.125 / MTok | $0.625 / MTok |
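The batch prices above are the standard prices halved. A rough sketch of estimating the cost of a batch job from standard per-MTok prices follows; the function name `batch_cost_usd` and the job size are hypothetical and chosen only for illustration.

```python
BATCH_DISCOUNT = 0.5  # 50% discount on both input and output tokens


def batch_cost_usd(input_tokens: int, output_tokens: int,
                   input_price_per_mtok: float,
                   output_price_per_mtok: float) -> float:
    """Estimate a Batch API job's cost from standard (non-batch) per-MTok prices."""
    standard_cost = (input_tokens * input_price_per_mtok
                     + output_tokens * output_price_per_mtok) / 1_000_000
    return standard_cost * BATCH_DISCOUNT


# Hypothetical job on Claude Sonnet 4 ($3 / MTok input, $15 / MTok output):
# 2M input tokens and 500K output tokens.
print(f"${batch_cost_usd(2_000_000, 500_000, 3.0, 15.0):.2f}")
```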
Long context pricing
The 1M token context window is currently in beta for organizations in usage tier 4 and organizations with custom rate limits. The 1M token context window is only available for Claude Sonnet 4.
When using Claude Sonnet 4 with the 1M token context window enabled, requests that exceed 200K input tokens are automatically charged at premium long context rates:
≤ 200K input tokens | > 200K input tokens |
---|---|
Input: $3 / MTok | Input: $6 / MTok |
Output: $15 / MTok | Output: $22.50 / MTok |
- The Batch API 50% discount applies to long context pricing
- Prompt caching multipliers apply on top of long context pricing
Even with the beta flag enabled, requests with 200K or fewer input tokens are charged at standard rates. If your request exceeds 200K input tokens, all tokens incur premium pricing.
The 200K threshold is based solely on input tokens (including cache reads/writes). Output token count does not affect pricing tier selection, though output tokens are charged at the higher rate when the input threshold is exceeded.
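To check which rate applied to a request, sum the following fields of the usage object in the API response. If the total exceeds 200K tokens, the request was charged at long context rates:
- input_tokens
- cache_creation_input_tokens (if using prompt caching)
- cache_read_input_tokens (if using prompt caching)
For more information about the usage object, see the API response documentation.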
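The check described above can be sketched as follows, assuming the usage object has been parsed into a plain Python dict. The field names come from the text; the helper names (`total_input_tokens`, `sonnet_4_rates`) and the sample values are hypothetical.

```python
LONG_CONTEXT_THRESHOLD = 200_000  # input tokens, including cache reads and writes


def total_input_tokens(usage: dict) -> int:
    """Sum the input-side fields of a usage object; cache fields may be absent or None."""
    fields = ("input_tokens", "cache_creation_input_tokens", "cache_read_input_tokens")
    return sum(usage.get(field) or 0 for field in fields)


def sonnet_4_rates(usage: dict) -> tuple:
    """Return (input, output) USD per MTok for Claude Sonnet 4 from the table above."""
    if total_input_tokens(usage) > LONG_CONTEXT_THRESHOLD:
        return (6.0, 22.5)   # long context rates apply to all tokens in the request
    return (3.0, 15.0)       # standard rates


# Hypothetical usage object: 150K uncached input tokens plus 60K cache reads.
usage = {"input_tokens": 150_000, "cache_read_input_tokens": 60_000, "output_tokens": 1_200}
print(total_input_tokens(usage), sonnet_4_rates(usage))
```

Output tokens are deliberately left out of the sum, since the threshold depends on input tokens only.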
Tool use pricing
Tool use requests are priced based on:
- The total number of input tokens sent to the model (including in the tools parameter)
- The number of output tokens generated
- For server-side tools, additional usage-based pricing (e.g., web search charges per search performed)
The additional tokens from tool use come from:
- The tools parameter in API requests (tool names, descriptions, and schemas)
- tool_use content blocks in API requests and responses
- tool_result content blocks in API requests
When you use tools, we also automatically include a special system prompt for the model which enables tool use. The number of tool use tokens required for each model is listed below (excluding the additional tokens listed above). Note that the table assumes at least 1 tool is provided. If no tools are provided, then a tool choice of none uses 0 additional system prompt tokens.
Model | Tool use system prompt tokens (tool choice auto, none) | Tool use system prompt tokens (tool choice any, tool) |
---|---|---|
Claude Opus 4.1 | 346 tokens | 313 tokens |
Claude Opus 4 | 346 tokens | 313 tokens |
Claude Sonnet 4 | 346 tokens | 313 tokens |
Claude Sonnet 3.7 | 346 tokens | 313 tokens |
Claude Sonnet 3.5 (Oct) (deprecated) | 346 tokens | 313 tokens |
Claude Sonnet 3.5 (June) (deprecated) | 294 tokens | 261 tokens |
Claude Haiku 3.5 | 264 tokens | 340 tokens |
Claude Opus 3 (deprecated) | 530 tokens | 281 tokens |
Claude Sonnet 3 | 159 tokens | 235 tokens |
Claude Haiku 3 | 264 tokens | 340 tokens |
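For budgeting, the table can be treated as a lookup keyed by model and tool choice. The sketch below copies three rows from the table; the dictionary and function names are hypothetical, and the zero-tool handling covers only the one case the text specifies.

```python
# Tool use system prompt token counts from the table above:
# model -> (tokens for tool choice auto/none, tokens for tool choice any/tool)
TOOL_SYSTEM_PROMPT_TOKENS = {
    "Claude Sonnet 4": (346, 313),
    "Claude Haiku 3.5": (264, 340),
    "Claude Opus 3 (deprecated)": (530, 281),
}


def tool_system_prompt_tokens(model: str, tool_choice: str, num_tools: int) -> int:
    """Look up the tool use system prompt overhead for a request."""
    if num_tools == 0:
        if tool_choice == "none":
            return 0  # no tools and tool choice none: no additional tokens
        # The table assumes at least 1 tool, so other zero-tool cases are not covered.
        raise ValueError("table assumes at least 1 tool is provided")
    auto_none, any_tool = TOOL_SYSTEM_PROMPT_TOKENS[model]
    if tool_choice in ("auto", "none"):
        return auto_none
    if tool_choice in ("any", "tool"):
        return any_tool
    raise ValueError(f"unknown tool choice: {tool_choice}")


print(tool_system_prompt_tokens("Claude Sonnet 4", "auto", num_tools=2))
```

These counts exclude tokens from the tools parameter and from tool_use and tool_result content blocks, which are billed as ordinary input and output tokens.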
Specific tool pricing
Bash tool
The bash tool adds 245 input tokens to your API calls. Additional tokens are consumed by:
- Command outputs (stdout/stderr)
- Error messages
- Large file contents
Code execution tool
The code execution tool usage is tracked separately from token usage. Execution time is a minimum of 5 minutes. If files are included in the request, execution time is billed even if the tool is not used due to files being preloaded onto the container. Pricing: $0.05 per session-hour.
Text editor tool
The text editor tool uses the same pricing structure as other tools used with Claude. It follows the standard input and output token pricing based on the Claude model you’re using. In addition to the base tokens, the following additional input tokens are needed for the text editor tool:
Tool | Additional input tokens |
---|---|
text_editor_20250429 (Claude 4) | 700 tokens |
text_editor_20250124 (Claude Sonnet 3.7) | 700 tokens |
text_editor_20241022 (Claude Sonnet 3.5 (deprecated)) | 700 tokens |
Web search tool
Web search usage is charged in addition to token usage.
Web fetch tool
Web fetch usage has no additional charges beyond standard token costs. Use the max_content_tokens parameter to set appropriate limits based on your use case and budget considerations.
Example token usage for typical content:
- Average web page (10KB): ~2,500 tokens
- Large documentation page (100KB): ~25,000 tokens
- Research paper PDF (500KB): ~125,000 tokens
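The three examples above all imply roughly 4 bytes per token (taking 1KB as 1,000 bytes). A rough estimator built on that ratio might look like the sketch below. The ratio is inferred from the examples rather than stated as a rule, real token counts vary with content, and the function names are hypothetical.

```python
BYTES_PER_TOKEN = 4  # inferred from the examples above, e.g. 10KB -> ~2,500 tokens


def estimate_fetch_tokens(content_bytes: int) -> int:
    """Very rough token estimate for fetched content of a given size in bytes."""
    return content_bytes // BYTES_PER_TOKEN


def estimate_fetch_cost_usd(content_bytes: int, input_price_per_mtok: float) -> float:
    """Approximate input token cost of including the fetched content in a request."""
    return estimate_fetch_tokens(content_bytes) * input_price_per_mtok / 1_000_000


# A 500KB research paper on Claude Sonnet 4 ($3 / MTok input).
print(estimate_fetch_tokens(500_000), f"${estimate_fetch_cost_usd(500_000, 3.0):.3f}")
```

An estimate like this can help choose a value for max_content_tokens before sending a request.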
Computer use tool
Computer use follows the standard tool use pricing. When using the computer use tool:
System prompt overhead: The computer use beta adds 466-499 tokens to the system prompt.
Computer use tool token usage:
Model | Input tokens per tool definition |
---|---|
Claude 4 / Sonnet 3.7 | 735 tokens |
Claude Sonnet 3.5 (deprecated) | 683 tokens |
Additional tokens are consumed by:
- Screenshot images (see Vision pricing)
- Tool execution results returned to Claude
If you’re also using bash or text editor tools alongside computer use, those tools have their own token costs, as documented in their respective sections.
Agent use case pricing examples
Understanding pricing for agent applications is crucial when building with Claude. These real-world examples can help you estimate costs for different agent patterns.
Customer support agent example
When building a customer support agent, here’s how costs might break down. Example calculation for processing 10,000 support tickets:
- Average ~3,700 tokens per conversation
- Using Claude Sonnet 4 at $3 / MTok input and $15 / MTok output
- Total cost: ~$222 per 10,000 tickets (~$0.0222 per conversation, a blended rate of about $6 / MTok)
General agent workflow pricing
For more complex agent architectures with multiple steps:
- Initial request processing
  - Typical input: 500-1,000 tokens
  - Processing cost: ~$0.003 per request
- Memory and context retrieval
  - Retrieved context: 2,000-5,000 tokens
  - Cost per retrieval: ~$0.015 per operation
- Action planning and execution
  - Planning tokens: 1,000-2,000
  - Execution feedback: 500-1,000
  - Combined cost: ~$0.045 per action
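The per-step figures above can be reproduced under one set of assumptions that the text does not state: Claude Sonnet 4 rates, the upper end of each token range, retrieval priced at the input rate, and planning plus feedback tokens priced at the output rate. The sketch below shows that reading; other splits between input and output tokens would give different totals.

```python
INPUT_PRICE = 3.0    # USD per MTok, Claude Sonnet 4 base input (assumed model)
OUTPUT_PRICE = 15.0  # USD per MTok, Claude Sonnet 4 output (assumed model)


def cost_usd(tokens: int, price_per_mtok: float) -> float:
    """Cost in USD of a number of tokens at a per-MTok price."""
    return tokens * price_per_mtok / 1_000_000


# Upper end of each token range from the list above (an assumption).
step_costs = {
    "initial_request": cost_usd(1_000, INPUT_PRICE),
    "context_retrieval": cost_usd(5_000, INPUT_PRICE),
    "action": cost_usd(2_000 + 1_000, OUTPUT_PRICE),  # planning + execution feedback
}

for step, cost in step_costs.items():
    print(f"{step}: ~${cost:.3f}")
print(f"one pass through all three steps: ~${sum(step_costs.values()):.3f}")
```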
Cost optimization strategies
When building agents with Claude:
- Use appropriate models: Choose Haiku for simple tasks, Sonnet for complex reasoning
- Implement prompt caching: Reduce costs for repeated context
- Batch operations: Use the Batch API for non-time-sensitive tasks
- Monitor usage patterns: Track token consumption to identify optimization opportunities
For high-volume agent applications, consider contacting our enterprise sales team for custom pricing arrangements.
Additional pricing considerations
Rate limits
Rate limits vary by usage tier and affect how many requests you can make:
- Tier 1: Entry-level usage with basic limits
- Tier 2: Increased limits for growing applications
- Tier 3: Higher limits for established applications
- Tier 4: Maximum standard limits
- Enterprise: Custom limits available
Volume discounts
Volume discounts may be available for high-volume users. These are negotiated on a case-by-case basis.
- Standard tiers use the pricing shown above
- Enterprise customers can contact sales for custom pricing
- Academic and research discounts may be available
Enterprise pricing
For enterprise customers with specific needs:
- Custom rate limits
- Volume discounts
- Dedicated support
- Custom terms
Billing and payment
- Billing is calculated monthly based on actual usage
- Payments are processed in USD
- Credit card and invoicing options available
- Usage tracking available in the Claude Console