These endpoints implement the Anthropic Messages API so tools like Claude Code and OpenCode can talk to GitHub Copilot natively, without going through an OpenAI compatibility shim. Using the Messages API path preserves Claude-native tool use semantics, supports Anthropic beta features, and reduces unnecessary premium request consumption.

Base URL

http://localhost:4141

POST /v1/messages

Creates a model response using the Anthropic Messages format. For Claude-family models, the proxy prefers Copilot’s native Messages endpoint over Chat Completions or Responses when available. The proxy filters and forwards supported anthropic-beta header values on the native Messages path. The following beta features are supported:
  • interleaved-thinking-2025-05-14
  • advanced-tool-use-2025-11-20
  • context-management-2025-06-27
When you pass a thinking budget without specifying an anthropic-beta header, the proxy automatically adds interleaved-thinking-2025-05-14 for non-adaptive extended thinking.
curl http://localhost:4141/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: your_key" \
  -d '{
    "model": "claude-opus-4-6",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
Use claude-opus-4-6 (hyphenated, without the [1m] suffix) as the model ID rather than claude-opus-4.6[1m]. The [1m] suffix requests a context window that exceeds GitHub Copilot’s limit and may cause errors or account restrictions.
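The anthropic-beta handling described for this endpoint can be sketched roughly as follows. This is an illustrative sketch, not the proxy's actual code: the helper name resolve_beta_header, its signature, and the check used to detect non-adaptive extended thinking (type "enabled" with a budget_tokens value) are assumptions. Only the three beta values come from the list above.

```python
from typing import Optional

# Beta values this page lists as supported on the native Messages path.
SUPPORTED_BETAS = (
    "interleaved-thinking-2025-05-14",
    "advanced-tool-use-2025-11-20",
    "context-management-2025-06-27",
)


def resolve_beta_header(header: Optional[str], thinking: Optional[dict]) -> Optional[str]:
    """Return the anthropic-beta value to forward, or None to omit the header.

    Hypothetical helper: the name, signature, and exact checks are assumptions.
    """
    if header:
        # Keep only supported values, preserving the caller's order.
        requested = [value.strip() for value in header.split(",")]
        kept = [value for value in requested if value in SUPPORTED_BETAS]
        return ",".join(kept) or None

    # No header supplied: add interleaved thinking when a budget is set.
    # Assumption: "non-adaptive" means type == "enabled" with budget_tokens.
    if thinking and thinking.get("type") == "enabled" and thinking.get("budget_tokens"):
        return "interleaved-thinking-2025-05-14"
    return None
```

Under these assumptions, an unsupported value is dropped from the header, and a request with a thinking budget but no header gains the interleaved thinking beta.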

POST /v1/messages/count_tokens

Calculates the number of tokens for a given set of messages without generating a response. When anthropicApiKey is set in your config.json (or via the ANTHROPIC_API_KEY environment variable), the proxy forwards Claude model requests to Anthropic’s real token counting endpoint, which returns exact counts. Without a key, it falls back to a GPT o200k_base tokenizer estimate with a 1.15× multiplier.
curl http://localhost:4141/v1/messages/count_tokens \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-opus-4-6",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
Anthropic’s /v1/messages/count_tokens endpoint is free to call: there is no per-token cost. It is rate-limited to 100 requests per minute at Tier 1. You need a minimum $5 credit balance on your Anthropic account to activate an API key.
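The choice between the two counting paths, and the fallback arithmetic, can be sketched as below. This is a minimal illustration under stated assumptions: the function names are hypothetical, detecting a Claude model by its "claude" ID prefix is an assumption, and rounding the 1.15× estimate up is an assumption (the page does not say how the proxy rounds). The o200k_base token count is taken as an input rather than computed here.

```python
from typing import Optional


def counting_strategy(model: str, anthropic_api_key: Optional[str]) -> str:
    """Pick the counting path for a /v1/messages/count_tokens request."""
    # Assumption: a Claude model is recognized by its ID prefix.
    if anthropic_api_key and model.startswith("claude"):
        return "anthropic"       # forward to Anthropic for exact counts
    return "local-estimate"      # o200k_base count scaled by 1.15


def fallback_estimate(o200k_tokens: int) -> int:
    """Apply the 1.15x multiplier using integer arithmetic.

    Rounding up is an assumption; integer math avoids float artifacts.
    """
    return (o200k_tokens * 115 + 99) // 100
```

For example, a prompt that the o200k_base tokenizer counts as 100 tokens would be estimated at 115 tokens in this sketch.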

POST /:provider/v1/messages

Proxies Anthropic Messages API calls to a custom upstream provider configured in your config.json under the providers key. Each provider key you define becomes a URL prefix. For example, with this provider config:
{
  "providers": {
    "custom": {
      "type": "anthropic",
      "baseUrl": "https://your-provider.example",
      "apiKey": "sk-your-key"
    }
  }
}
You can send requests to:
curl http://localhost:4141/custom/v1/messages \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kimi-k2.5",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
The proxy authenticates the request with the provider’s configured credentials and forwards it to https://your-provider.example/v1/messages.
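The prefix-to-upstream mapping can be pictured with a small sketch. The upstream_url helper is hypothetical and only illustrates how a provider key used as a URL prefix could resolve to that provider's baseUrl. It is not the proxy's routing code, and it leaves out how credentials are attached.

```python
from typing import Optional

# Same shape as the provider config shown above.
PROVIDERS = {
    "custom": {
        "type": "anthropic",
        "baseUrl": "https://your-provider.example",
        "apiKey": "sk-your-key",
    }
}


def upstream_url(path: str, providers: dict) -> Optional[str]:
    """Map '/<provider>/v1/...' to '<baseUrl>/v1/...'.

    Returns None when the first path segment is not a configured provider.
    Hypothetical helper for illustration only.
    """
    prefix, sep, rest = path.lstrip("/").partition("/")
    if not sep or prefix not in providers:
        return None
    return providers[prefix]["baseUrl"].rstrip("/") + "/" + rest


print(upstream_url("/custom/v1/messages", PROVIDERS))
```

A path whose first segment is not a key under providers, such as /v1/messages itself, yields None here, which matches the idea that only defined provider keys become prefixes.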

GET /:provider/v1/models

Lists models from a configured upstream provider. Using the same provider key from your config, send a GET request to /:provider/v1/models.
curl http://localhost:4141/custom/v1/models

POST /:provider/v1/messages/count_tokens

Calculates token counts locally for provider route requests. Unlike /v1/messages/count_tokens, this route does not forward to Anthropic — it always uses the local tokenizer.
curl http://localhost:4141/custom/v1/messages/count_tokens \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kimi-k2.5",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'