API Documentation

LLM API provides a high-availability Claude API service that uses the same API format as Anthropic. Change your Base URL and start building; no other code changes are required.

Base URL

https://llmapi.pro

Get Started in 3 Steps

1. Create an account and get your API key

Register for free, then copy your API key from the Dashboard.

2. Set your environment variables

export ANTHROPIC_BASE_URL=https://llmapi.pro
export ANTHROPIC_API_KEY=your-api-key

3. Send your first request

curl https://llmapi.pro/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Hello! What can you do?"}
    ]
  }'
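The same request can be assembled programmatically. This is a minimal sketch, not part of the documented API: it only builds the JSON body and headers shown in the curl example above, ready to POST to https://llmapi.pro/v1/messages with any HTTP client.

```python
import json
import os

def build_message_request(prompt: str, model: str = "claude-sonnet-4-6",
                          max_tokens: int = 1024) -> dict:
    """Build the JSON body for POST /v1/messages, mirroring the curl example."""
    return {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }

def build_headers(api_key: str) -> dict:
    """The three headers used in the curl example."""
    return {
        "Content-Type": "application/json",
        "x-api-key": api_key,
        "anthropic-version": "2023-06-01",
    }

body = build_message_request("Hello! What can you do?")
headers = build_headers(os.environ.get("ANTHROPIC_API_KEY", "sk-llmapi-example"))
payload = json.dumps(body)  # ready to send as the request body
```

Keeping payload construction separate from transport makes it easy to swap in whichever HTTP library you already use.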

Claude Code Integration

Configure the Claude Code CLI to route all requests through LLM API. Set two environment variables and launch Claude Code as usual — no plugins or patches needed.

macOS / Linux

# Temporary (current shell session only)
export ANTHROPIC_BASE_URL=https://llmapi.pro
export ANTHROPIC_API_KEY=your-api-key

# Permanent (add to ~/.bashrc or ~/.zshrc)
echo 'export ANTHROPIC_BASE_URL=https://llmapi.pro' >> ~/.zshrc
echo 'export ANTHROPIC_API_KEY=your-api-key' >> ~/.zshrc
source ~/.zshrc

# Then simply launch Claude Code
claude

Windows PowerShell

# Temporary (current session only)
$env:ANTHROPIC_BASE_URL="https://llmapi.pro"
$env:ANTHROPIC_API_KEY="your-api-key"

# Permanent (user-level environment variable)
[System.Environment]::SetEnvironmentVariable("ANTHROPIC_BASE_URL", "https://llmapi.pro", "User")
[System.Environment]::SetEnvironmentVariable("ANTHROPIC_API_KEY", "your-api-key", "User")

# Then launch Claude Code
claude

Environment Variables

Variable Required Description
ANTHROPIC_BASE_URL Yes Set to https://llmapi.pro to proxy requests through LLM API.
ANTHROPIC_API_KEY Yes Your LLM API key (starts with sk-llmapi-). Get it from the Dashboard.
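A small startup check can catch misconfigured environments early. This helper is illustrative only (not part of the API); it assumes the sk-llmapi- key prefix mentioned in the table above.

```python
import os

def load_config() -> dict:
    """Read the two required environment variables and sanity-check them."""
    base_url = os.environ.get("ANTHROPIC_BASE_URL", "")
    api_key = os.environ.get("ANTHROPIC_API_KEY", "")
    if not base_url:
        raise RuntimeError("ANTHROPIC_BASE_URL is not set")
    if not api_key.startswith("sk-llmapi-"):
        raise RuntimeError("ANTHROPIC_API_KEY does not look like an LLM API key")
    # Normalize a trailing slash so path joins stay predictable.
    return {"base_url": base_url.rstrip("/"), "api_key": api_key}

os.environ["ANTHROPIC_BASE_URL"] = "https://llmapi.pro"
os.environ["ANTHROPIC_API_KEY"] = "sk-llmapi-example-key"
config = load_config()
```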

Model Mapping

When Claude Code sends a request, LLM API transparently maps model identifiers to the best-available backend. You can use any standard Anthropic model name.

Model Name Optimized For Notes
claude-sonnet-4-6 Code generation & tool use Default Claude Code model. Routed to our best coding model.
claude-opus-4-6 Complex reasoning Highest-capability model for architecture and design tasks.
claude-haiku-4-5 Fast & affordable Quick responses for simple tasks. Lowest cost per token.

Tip: Model routing is handled automatically. You do not need to change the model parameter in Claude Code — LLM API takes care of it.

Authentication

Every API request must include a valid API key. You can pass your key using either of the two headers below. Create and manage keys in your Dashboard.

Supported Auth Headers

Header Format Description
x-api-key sk-llmapi-xxxx Primary method. Compatible with the Anthropic SDK.
Authorization Bearer sk-llmapi-xxxx Alternative Bearer-token method.
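Both header styles carry the same key. A hypothetical helper (not part of any SDK) that produces either form:

```python
def auth_headers(api_key: str, scheme: str = "x-api-key") -> dict:
    """Return auth headers in either of the two supported styles."""
    if scheme == "x-api-key":
        return {"x-api-key": api_key}          # primary, SDK-compatible
    if scheme == "bearer":
        return {"Authorization": f"Bearer {api_key}"}  # alternative
    raise ValueError(f"unknown scheme: {scheme}")
```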

Example: cURL with x-api-key

curl https://llmapi.pro/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: sk-llmapi-xxxxxxxxxxxxxxxxxxxxxxxx" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Explain quantum computing in simple terms."}
    ]
  }'

Security note: Never expose your API key in client-side code or public repositories. If a key is compromised, delete it immediately from the Dashboard and create a new one.

API Reference

POST /v1/messages

Create a message. Fully compatible with the Anthropic Messages API.

Request Parameters

Parameter Type Required Description
model string Yes Model identifier, e.g. claude-sonnet-4-6
messages array Yes Array of message objects with role (user | assistant) and content.
max_tokens integer Yes Maximum number of tokens to generate in the response.
system string No System prompt that sets behavior and context for the model.
temperature number No Sampling temperature between 0 and 1. Lower values are more deterministic.
tools array No List of tool definitions the model may use (function calling).
tool_choice object No Controls tool use: auto, any, or tool with a specific name.
stream boolean No Enable Server-Sent Events streaming. Default: false.
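The optional parameters above can be layered onto the required three. A sketch of request assembly (the parameter names come from the table; the helper itself is hypothetical) that drops unset optional fields so the server applies its own defaults:

```python
def build_request(model: str, messages: list, max_tokens: int, **optional) -> dict:
    """Assemble a /v1/messages body; omit optional params left as None."""
    allowed = {"system", "temperature", "tools", "tool_choice", "stream"}
    unknown = set(optional) - allowed
    if unknown:
        raise ValueError(f"unsupported parameters: {sorted(unknown)}")
    body = {"model": model, "messages": messages, "max_tokens": max_tokens}
    body.update({k: v for k, v in optional.items() if v is not None})
    return body

req = build_request(
    "claude-sonnet-4-6",
    [{"role": "user", "content": "Summarize this file."}],
    1024,
    system="You are a concise assistant.",
    temperature=0.2,
    stream=None,  # omitted from the body -> server default (false)
)
```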

Non-Streaming Response

When stream is false (default), the API returns a single JSON object:

{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Hello! I can help you with a wide range of tasks..."
    }
  ],
  "model": "claude-sonnet-4-6",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 12,
    "output_tokens": 58
  }
}
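Since content is an array of blocks, clients typically concatenate the text blocks rather than index into content[0] directly. A small sketch over the sample response above:

```python
def extract_text(response: dict) -> str:
    """Concatenate the text of all text-type content blocks."""
    return "".join(
        block["text"] for block in response["content"] if block["type"] == "text"
    )

# Abbreviated copy of the sample response shown above.
sample = {
    "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
    "role": "assistant",
    "content": [{"type": "text", "text": "Hello! I can help you..."}],
    "model": "claude-sonnet-4-6",
    "stop_reason": "end_turn",
    "usage": {"input_tokens": 12, "output_tokens": 58},
}

text = extract_text(sample)
total_tokens = sample["usage"]["input_tokens"] + sample["usage"]["output_tokens"]
```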

Streaming Response (SSE)

When stream is true, the response is delivered as Server-Sent Events. Each event has an event field and a JSON data field:

event: message_start
data: {"type":"message_start","message":{"id":"msg_01...","type":"message","role":"assistant","content":[],"model":"claude-sonnet-4-6","usage":{"input_tokens":12,"output_tokens":0}}}

event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Hello"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"! I can"}}

event: content_block_stop
data: {"type":"content_block_stop","index":0}

event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"end_turn","stop_sequence":null},"usage":{"output_tokens":58}}

event: message_stop
data: {"type":"message_stop"}

SSE Event Types

Event Description
message_start Sent once at the beginning. Contains the message object with metadata.
content_block_start Marks the start of a new content block (text or tool_use).
content_block_delta Incremental content update. Concatenate delta.text to build the full response.
content_block_stop The content block is complete.
message_delta Final message metadata including stop_reason and total usage.
message_stop End of stream. Close the connection.
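The event flow above can be sketched as a parser: split the stream on blank lines, decode each event's data field, and concatenate the text_delta fragments as the table describes. This is a simplified sketch over a string (a real client would read the HTTP response incrementally):

```python
import json

def parse_sse(raw: str) -> list:
    """Parse an SSE payload into (event, data) pairs."""
    events = []
    for chunk in raw.strip().split("\n\n"):  # events are blank-line separated
        event = data = None
        for line in chunk.splitlines():
            if line.startswith("event: "):
                event = line[len("event: "):]
            elif line.startswith("data: "):
                data = json.loads(line[len("data: "):])
        if event is not None:
            events.append((event, data))
    return events

def accumulate_text(events: list) -> str:
    """Concatenate text_delta fragments from content_block_delta events."""
    return "".join(
        data["delta"]["text"]
        for event, data in events
        if event == "content_block_delta" and data["delta"]["type"] == "text_delta"
    )

# Abbreviated copy of the stream shown above.
stream = """event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Hello"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"! I can"}}

event: message_stop
data: {"type":"message_stop"}"""

text = accumulate_text(parse_sse(stream))  # "Hello! I can"
```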

Models

LLM API accepts all standard Anthropic model identifiers. Requests are routed to the optimal backend provider for reliability and performance.

Model ID Display Name Best For Context Window
claude-opus-4-6 Claude Opus 4.6 Complex reasoning, advanced coding, deep analysis 200K tokens
claude-sonnet-4-6 Claude Sonnet 4.6 Balanced performance, everyday coding, general tasks 200K tokens
claude-haiku-4-5 Claude Haiku 4.5 Fast responses, simple tasks, high-throughput workloads 200K tokens

All official Anthropic model identifiers are supported, including versioned aliases.
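Because max_tokens counts against the same window as the input, a quick client-side check can reject oversized requests before sending them. The context sizes come from the table above; the helper is illustrative only:

```python
# Context windows from the table above (all 200K tokens).
MODELS = {
    "claude-opus-4-6": 200_000,
    "claude-sonnet-4-6": 200_000,
    "claude-haiku-4-5": 200_000,
}

def fits_in_context(model: str, input_tokens: int, max_tokens: int) -> bool:
    """Rough check that input plus requested output stays inside the window."""
    return input_tokens + max_tokens <= MODELS[model]
```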

Error Handling

LLM API uses standard HTTP status codes. Error responses always return a JSON body with a machine-readable type and a human-readable message.

Error Response Format

{
  "type": "error",
  "error": {
    "type": "authentication_error",
    "message": "Invalid API key provided."
  }
}

HTTP Status Codes

Status Error Type Description Recommended Action
400 invalid_request_error Malformed request body or missing required parameters. Verify your JSON payload and required fields.
401 authentication_error Invalid or missing API key. Check that x-api-key is set correctly.
403 permission_error Insufficient permissions or account suspended. Verify your account status and plan entitlements.
429 rate_limit_error Too many requests. Rate limit exceeded. Back off and retry. See Rate Limits.
500 api_error Internal server error. Retry after a brief delay. Contact support if it persists.
503 overloaded_error Upstream provider is temporarily overloaded. Wait a moment and retry with exponential backoff.
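The retry guidance in the table (back off on 429, 500, and 503; fail fast otherwise) can be sketched as a wrapper around any transport function. The send callable below is a stand-in for your HTTP call, not part of the API:

```python
import time

RETRYABLE = {429, 500, 503}

def request_with_retries(send, max_attempts: int = 5, base_delay: float = 1.0,
                         sleep=time.sleep):
    """Retry retryable statuses with exponential backoff; raise on others."""
    for attempt in range(max_attempts):
        status, body = send()
        if status == 200:
            return body
        if status not in RETRYABLE or attempt == max_attempts - 1:
            raise RuntimeError(f"request failed with status {status}: {body}")
        sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...

# Simulated transport: two 429s, then success.
responses = iter([(429, "rate limited"), (429, "rate limited"), (200, "ok")])
delays = []
result = request_with_retries(lambda: next(responses), sleep=delays.append)
# result == "ok"; the wrapper slept for 1.0s then 2.0s
```

Injecting the sleep function keeps the backoff logic testable without real delays.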

Rate Limits

Rate limits vary by plan and are enforced per API key. When you exceed a limit, the API returns a 429 status code. Upgrade your plan for higher throughput.

Per-Plan Limits

Plan Requests / min Request Quota (per 5 hours / per week) Monthly Token Quota
Free 10 40 / 5h, 200 / week Unlimited
Pro 20 400 / 5h, 2,000 / week Unlimited
Max 5x 60 1,200 / 5h, 6,000 / week Unlimited
Max 20x 120 3,000 / 5h, 15,000 / week Unlimited
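A client-side guard can avoid hitting the per-minute cap in the first place. This sliding-window sketch is not an LLM API feature, just one way to pace requests locally:

```python
from collections import deque

class MinuteLimiter:
    """Client-side guard that mirrors a requests-per-minute cap."""

    def __init__(self, limit: int):
        self.limit = limit
        self.sent = deque()  # timestamps (seconds) of recent requests

    def allow(self, now: float) -> bool:
        # Drop timestamps older than the 60-second window.
        while self.sent and now - self.sent[0] >= 60:
            self.sent.popleft()
        if len(self.sent) < self.limit:
            self.sent.append(now)
            return True
        return False

limiter = MinuteLimiter(limit=10)  # Free plan: 10 requests/min
allowed = [limiter.allow(float(i)) for i in range(12)]
# first 10 calls allowed; the next 2 fall inside the same minute and are held back
```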

Rate Limit Headers

Every API response includes headers to help you track your usage in real time:

anthropic-ratelimit-requests-limit: 60
anthropic-ratelimit-requests-remaining: 58
anthropic-ratelimit-requests-reset: 2026-04-08T12:01:00.000Z
anthropic-ratelimit-tokens-limit: 800000
anthropic-ratelimit-tokens-remaining: 800000

Header Description
anthropic-ratelimit-requests-limit Maximum number of requests allowed in the current window for your plan.
anthropic-ratelimit-requests-remaining Number of requests remaining in the current rate-limit window.
anthropic-ratelimit-requests-reset ISO 8601 timestamp at which the rate-limit window resets.
anthropic-ratelimit-tokens-limit Maximum number of tokens allowed in the current window.
anthropic-ratelimit-tokens-remaining Number of tokens remaining in the current window.
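When the remaining-request count reaches zero, the reset timestamp tells you exactly how long to pause. A sketch that computes the wait from the sample headers shown above (the clock is injected so the example is deterministic):

```python
from datetime import datetime, timezone

def seconds_until_reset(headers: dict, now: datetime) -> float:
    """Seconds to wait before the request window resets (never negative)."""
    # fromisoformat does not accept a trailing Z, so map it to an explicit offset.
    reset = datetime.fromisoformat(
        headers["anthropic-ratelimit-requests-reset"].replace("Z", "+00:00")
    )
    return max(0.0, (reset - now).total_seconds())

headers = {
    "anthropic-ratelimit-requests-limit": "60",
    "anthropic-ratelimit-requests-remaining": "0",
    "anthropic-ratelimit-requests-reset": "2026-04-08T12:01:00.000Z",
}
now = datetime(2026, 4, 8, 12, 0, 30, tzinfo=timezone.utc)
wait = seconds_until_reset(headers, now)  # 30.0 seconds until the window resets
```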

Ready to get started?

Create a free account and send your first API request in under a minute.