API Documentation
LLM API provides high-availability Claude API service using the same API format as Anthropic. Change your Base URL and start building — no other code changes required.
Base URL
https://llmapi.pro
Get Started in 3 Steps
Create an account and get your API key
Register for free, then copy your API key from the Dashboard.
Set your environment variables
export ANTHROPIC_BASE_URL=https://llmapi.pro
export ANTHROPIC_API_KEY=your-api-key
Send your first request
curl https://llmapi.pro/v1/messages \
-H "Content-Type: application/json" \
-H "x-api-key: $ANTHROPIC_API_KEY" \
-H "anthropic-version: 2023-06-01" \
-d '{
"model": "claude-sonnet-4-6",
"max_tokens": 1024,
"messages": [
{"role": "user", "content": "Hello! What can you do?"}
]
}'
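If you are working in Python rather than curl, the same request can be sketched with only the standard library. This mirrors the curl call above; sending it for real requires network access and a valid key, so the actual send is left commented out:

```python
import json
import os
import urllib.request

# Build the same Messages API request shown in the curl example above.
# The API key is read from the environment, as configured in step 2.
payload = {
    "model": "claude-sonnet-4-6",
    "max_tokens": 1024,
    "messages": [
        {"role": "user", "content": "Hello! What can you do?"}
    ],
}

req = urllib.request.Request(
    "https://llmapi.pro/v1/messages",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "x-api-key": os.environ.get("ANTHROPIC_API_KEY", ""),
        "anthropic-version": "2023-06-01",
    },
    method="POST",
)

# To send the request for real:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["content"][0]["text"])
```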
Claude Code Integration
Configure the Claude Code CLI to route all requests through LLM API. Set two environment variables and launch Claude Code as usual — no plugins or patches needed.
macOS / Linux
# Temporary (current shell session only)
export ANTHROPIC_BASE_URL=https://llmapi.pro
export ANTHROPIC_API_KEY=your-api-key
# Permanent (add to ~/.bashrc or ~/.zshrc)
echo 'export ANTHROPIC_BASE_URL=https://llmapi.pro' >> ~/.zshrc
echo 'export ANTHROPIC_API_KEY=your-api-key' >> ~/.zshrc
source ~/.zshrc
# Then simply launch Claude Code
claude
Windows PowerShell
# Temporary (current session only)
$env:ANTHROPIC_BASE_URL="https://llmapi.pro"
$env:ANTHROPIC_API_KEY="your-api-key"
# Permanent (user-level environment variable)
[System.Environment]::SetEnvironmentVariable("ANTHROPIC_BASE_URL", "https://llmapi.pro", "User")
[System.Environment]::SetEnvironmentVariable("ANTHROPIC_API_KEY", "your-api-key", "User")
# Then launch Claude Code
claude
Environment Variables
| Variable | Required | Description |
|---|---|---|
| ANTHROPIC_BASE_URL | Yes | Set to https://llmapi.pro to proxy requests through LLM API. |
| ANTHROPIC_API_KEY | Yes | Your LLM API key (starts with sk-llmapi-). Get it from the Dashboard. |
Model Mapping
When Claude Code sends a request, LLM API transparently maps model identifiers to the best-available backend. You can use any standard Anthropic model name.
| Model Name | Optimized For | Notes |
|---|---|---|
| claude-sonnet-4-6 | Code generation & tool use | Default Claude Code model. Routed to our best coding model. |
| claude-opus-4-6 | Complex reasoning | Highest-capability model for architecture and design tasks. |
| claude-haiku-4-5 | Fast & affordable | Quick responses for simple tasks. Lowest cost per token. |
Tip: Model routing is handled automatically. You do not need to change the model parameter in Claude Code — LLM API takes care of it.
Authentication
Every API request must include a valid API key. You can pass your key using either of the two headers below. Create and manage keys in your Dashboard.
Supported Auth Headers
| Header | Format | Description |
|---|---|---|
| x-api-key | sk-llmapi-xxxx | Primary method. Compatible with the Anthropic SDK. |
| Authorization | Bearer sk-llmapi-xxxx | Alternative Bearer-token method. |
Example: cURL with x-api-key
curl https://llmapi.pro/v1/messages \
-H "Content-Type: application/json" \
-H "x-api-key: sk-llmapi-xxxxxxxxxxxxxxxxxxxxxxxx" \
-H "anthropic-version: 2023-06-01" \
-d '{
"model": "claude-sonnet-4-6",
"max_tokens": 1024,
"messages": [
{"role": "user", "content": "Explain quantum computing in simple terms."}
]
}'
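For completeness, here is a minimal Python sketch of the two equivalent header sets; the key value is a placeholder, as in the curl example:

```python
# Two equivalent ways to authenticate a request to LLM API.
api_key = "sk-llmapi-xxxxxxxxxxxxxxxxxxxxxxxx"  # placeholder key

# Primary method, matching the Anthropic SDK:
headers_primary = {
    "Content-Type": "application/json",
    "x-api-key": api_key,
    "anthropic-version": "2023-06-01",
}

# Alternative Bearer-token method:
headers_bearer = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {api_key}",
    "anthropic-version": "2023-06-01",
}
```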
Security note: Never expose your API key in client-side code or public repositories. If a key is compromised, delete it immediately from the Dashboard and create a new one.
API Reference
/v1/messages
Create a message. Fully compatible with the Anthropic Messages API.
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model identifier, e.g. claude-sonnet-4-6 |
| messages | array | Yes | Array of message objects with role (user or assistant) and content. |
| max_tokens | integer | Yes | Maximum number of tokens to generate in the response. |
| system | string | No | System prompt that sets behavior and context for the model. |
| temperature | number | No | Sampling temperature between 0 and 1. Lower values are more deterministic. |
| tools | array | No | List of tool definitions the model may use (function calling). |
| tool_choice | object | No | Controls tool use: auto, any, or tool with a specific name. |
| stream | boolean | No | Enable Server-Sent Events streaming. Default: false. |
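A small client-side helper can enforce the required/optional split above before a request is sent. This is an illustrative sketch, not part of any SDK, and the function name is ours:

```python
def build_messages_request(model, messages, max_tokens, **optional):
    """Build a /v1/messages request body, checking the required
    parameters (model, messages, max_tokens) and rejecting keys
    that are not in the documented optional set."""
    if not model or not messages or not max_tokens:
        raise ValueError("model, messages and max_tokens are required")
    allowed = {"system", "temperature", "tools", "tool_choice", "stream"}
    unknown = set(optional) - allowed
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    body = {"model": model, "messages": messages, "max_tokens": max_tokens}
    body.update(optional)  # pass optional parameters through unchanged
    return body
```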
Non-Streaming Response
When stream is false (default), the API returns a single JSON object:
{
"id": "msg_01XFDUDYJgAACzvnptvVoYEL",
"type": "message",
"role": "assistant",
"content": [
{
"type": "text",
"text": "Hello! I can help you with a wide range of tasks..."
}
],
"model": "claude-sonnet-4-6",
"stop_reason": "end_turn",
"stop_sequence": null,
"usage": {
"input_tokens": 12,
"output_tokens": 58
}
}
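Client code typically pulls the reply text out of the content array and records token usage. A minimal sketch (the helper name is ours):

```python
import json

def extract_text(response_body):
    """Concatenate all text blocks from a non-streaming /v1/messages
    response and return them together with the output token count."""
    msg = json.loads(response_body)
    text = "".join(
        block["text"] for block in msg["content"] if block["type"] == "text"
    )
    return text, msg["usage"]["output_tokens"]
```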
Streaming Response (SSE)
When stream is true, the response is delivered as Server-Sent Events. Each event has an event field and a JSON data field:
event: message_start
data: {"type":"message_start","message":{"id":"msg_01...","type":"message","role":"assistant","content":[],"model":"claude-sonnet-4-6","usage":{"input_tokens":12,"output_tokens":0}}}
event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}
event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Hello"}}
event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"! I can"}}
event: content_block_stop
data: {"type":"content_block_stop","index":0}
event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"end_turn","stop_sequence":null},"usage":{"output_tokens":58}}
event: message_stop
data: {"type":"message_stop"}
SSE Event Types
| Event | Description |
|---|---|
| message_start | Sent once at the beginning. Contains the message object with metadata. |
| content_block_start | Marks the start of a new content block (text or tool_use). |
| content_block_delta | Incremental content update. Concatenate delta.text to build the full response. |
| content_block_stop | The content block is complete. |
| message_delta | Final message metadata including stop_reason and total usage. |
| message_stop | End of stream. Close the connection. |
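Putting the event table into practice, the sketch below reassembles the assistant's text from a raw SSE payload by concatenating every text_delta, as the table describes. It is a simplified illustration; a production client would process events incrementally as they arrive rather than from a single string:

```python
import json

def assemble_sse_text(raw_stream):
    """Rebuild the assistant's text from a raw SSE response by
    collecting the text_delta fragments of content_block_delta events."""
    text_parts = []
    for line in raw_stream.splitlines():
        if not line.startswith("data: "):
            continue  # skip "event:" lines and blank separators
        event = json.loads(line[len("data: "):])
        if event.get("type") == "content_block_delta":
            delta = event.get("delta", {})
            if delta.get("type") == "text_delta":
                text_parts.append(delta["text"])
    return "".join(text_parts)
```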
Models
LLM API accepts all standard Anthropic model identifiers. Requests are routed to the optimal backend provider for reliability and performance.
| Model ID | Display Name | Best For | Context Window |
|---|---|---|---|
| claude-opus-4-6 | Claude Opus 4.6 | Complex reasoning, advanced coding, deep analysis | 200K tokens |
| claude-sonnet-4-6 | Claude Sonnet 4.6 | Balanced performance, everyday coding, general tasks | 200K tokens |
| claude-haiku-4-5 | Claude Haiku 4.5 | Fast responses, simple tasks, high-throughput workloads | 200K tokens |
All official Anthropic model identifiers are supported, including versioned aliases.
Error Handling
LLM API uses standard HTTP status codes. Error responses always return a JSON body with a machine-readable type and a human-readable message.
Error Response Format
{
"type": "error",
"error": {
"type": "authentication_error",
"message": "Invalid API key provided."
}
}
HTTP Status Codes
| Status | Error Type | Description | Recommended Action |
|---|---|---|---|
| 400 | invalid_request_error | Malformed request body or missing required parameters. | Verify your JSON payload and required fields. |
| 401 | authentication_error | Invalid or missing API key. | Check that x-api-key is set correctly. |
| 403 | permission_error | Insufficient permissions or account suspended. | Verify your account status and plan entitlements. |
| 429 | rate_limit_error | Too many requests. Rate limit exceeded. | Back off and retry. See Rate Limits. |
| 500 | api_error | Internal server error. | Retry after a brief delay. Contact support if it persists. |
| 503 | overloaded_error | Upstream provider is temporarily overloaded. | Wait a moment and retry with exponential backoff. |
Rate Limits
Rate limits vary by plan and are enforced per API key. When you exceed a limit, the API returns a 429 status code. Upgrade your plan for higher throughput.
Per-Plan Limits
| Plan | Requests / min | Requests (5-hour / weekly window) | Monthly Token Quota |
|---|---|---|---|
| Free | 10 | 40 / 5h, 200 / week | Unlimited |
| Pro | 20 | 400 / 5h, 2,000 / week | Unlimited |
| Max 5x | 60 | 1,200 / 5h, 6,000 / week | Unlimited |
| Max 20x | 120 | 3,000 / 5h, 15,000 / week | Unlimited |
Rate Limit Headers
Every API response includes headers to help you track your usage in real time:
anthropic-ratelimit-requests-limit: 60
anthropic-ratelimit-requests-remaining: 58
anthropic-ratelimit-requests-reset: 2026-04-08T12:01:00.000Z
anthropic-ratelimit-tokens-limit: 800000
anthropic-ratelimit-tokens-remaining: 800000
| Header | Description |
|---|---|
| anthropic-ratelimit-requests-limit | Maximum number of requests allowed in the current window for your plan. |
| anthropic-ratelimit-requests-remaining | Number of requests remaining in the current rate-limit window. |
| anthropic-ratelimit-requests-reset | ISO 8601 timestamp at which the rate-limit window resets. |
| anthropic-ratelimit-tokens-limit | Maximum number of tokens allowed in the current window. |
| anthropic-ratelimit-tokens-remaining | Tokens remaining in the current window. |
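A client can use the reset header to decide how long to pause after a 429. A minimal sketch parsing the anthropic-ratelimit-requests-reset timestamp shown above:

```python
from datetime import datetime, timezone

def seconds_until_reset(headers):
    """Parse the anthropic-ratelimit-requests-reset header (ISO 8601)
    and return how long to wait until the window resets, never negative."""
    reset = headers["anthropic-ratelimit-requests-reset"]
    # datetime.fromisoformat only accepts a trailing "Z" from Python
    # 3.11 on; normalize it for older versions.
    reset_at = datetime.fromisoformat(reset.replace("Z", "+00:00"))
    return max(0.0, (reset_at - datetime.now(timezone.utc)).total_seconds())
```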
Ready to get started?
Create a free account and send your first API request in under a minute.