If something is not working as expected, start with the built-in diagnostic commands and then work through the relevant section below.
# Show version, runtime info, file paths, and authentication status
npx @nick3/copilot-api@latest debug

# Show debug output as JSON
npx @nick3/copilot-api@latest debug --json

# Show current Copilot usage and quota (no server required)
npx @nick3/copilot-api@latest check-usage
Run debug to check your current authentication status:
npx @nick3/copilot-api@latest debug
If the output shows missing or expired tokens, re-run the auth flow:
npx @nick3/copilot-api@latest auth
This opens the GitHub OAuth device flow and refreshes your stored credentials. After authenticating, start the proxy again and retry.
A 403 Forbidden response means the request came from a non-loopback address and ADMIN_TOKEN is not set on the server.
The Admin UI is restricted to localhost, 127.0.0.1, and ::1 by default. For remote access, set the ADMIN_TOKEN environment variable when starting the proxy:
ADMIN_TOKEN=your_admin_token_here npx @nick3/copilot-api@latest start
Then include the token in every Admin API request:
curl -H "x-admin-token: your_admin_token_here" \
  "http://your-host:4141/api/admin/meta"
See Monitor usage with the Admin UI for full access control details.
A 401 Unauthorized response means ADMIN_TOKEN is configured on the server but your request did not include the token.
Pass the token using the x-admin-token header or Authorization: Bearer:
curl -H "x-admin-token: your_admin_token_here" \
  "http://localhost:4141/api/admin/accounts"
The UI stores the token in sessionStorage and sends it automatically once you enter it in the Admin token dialog in the top-right corner.
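The 403 and 401 cases described above can be told apart from the command line by looking only at the HTTP status code. The following is a minimal sketch, not part of the proxy: the function name explain_admin_status is hypothetical, and the host, port, and token in the commented usage are the same placeholders used in the examples above.

```shell
#!/bin/sh
# Hypothetical helper: map an Admin API HTTP status code to the likely
# cause described in this guide. Codes other than 200/401/403 are simply
# reported back unchanged.
explain_admin_status() {
  case "$1" in
    200) echo "ok" ;;
    401) echo "ADMIN_TOKEN is set on the server; send it via x-admin-token or Authorization: Bearer" ;;
    403) echo "non-loopback request and ADMIN_TOKEN is not set on the server" ;;
    *)   echo "unexpected status: $1" ;;
  esac
}

# Usage against a running proxy (commented out so the block runs standalone):
# status=$(curl -s -o /dev/null -w '%{http_code}' \
#   -H "x-admin-token: your_admin_token_here" \
#   "http://localhost:4141/api/admin/meta")
# explain_admin_status "$status"
```

The curl options -s, -o /dev/null, and -w '%{http_code}' print only the status code, which keeps the check independent of the response body.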
Claude Code compacts conversation history based on token counts reported by the proxy. By default, the proxy estimates Claude token counts using the GPT o200k_base tokenizer with a multiplier. This consistently underestimates actual Claude usage, causing Claude Code to compact too late and hit the context limit.
Fix this by configuring an Anthropic API key. The proxy then forwards token counting requests for Claude models to Anthropic’s real /v1/messages/count_tokens endpoint, which returns exact counts. The token counting endpoint is free; you only need a minimum $5 credit balance on your Anthropic account to activate API access.
Add the key to config.json:
config.json
{
  "anthropicApiKey": "sk-ant-..."
}
Or set it as an environment variable:
ANTHROPIC_API_KEY=sk-ant-... npx @nick3/copilot-api@latest start
See the configuration reference for setup steps.
If you are receiving rate limit errors from the proxy or from GitHub Copilot, use the --rate-limit and --wait flags to pace your requests:
# Enforce a 30-second gap between requests
npx @nick3/copilot-api@latest start --rate-limit 30

# Wait instead of returning an error when the cooldown is active
npx @nick3/copilot-api@latest start --rate-limit 30 --wait
See Control request rate limits in Copilot API for more options.
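To make the difference between the two invocations above concrete, the cooldown can be modelled as a small decision function. This is a conceptual sketch based only on the flag descriptions in this guide, not the proxy's actual implementation; the function name cooldown_decision and its arguments (timestamps in seconds, the gap, and a "wait"/"nowait" mode) are hypothetical.

```shell
#!/bin/sh
# Conceptual model (assumption): decide what happens to a request that
# arrives at time $2 when the previous request was at time $1, the
# configured gap is $3 seconds, and $4 is "wait" or "nowait".
cooldown_decision() {
  last="$1"; now="$2"; gap="$3"; wait_mode="$4"
  elapsed=$((now - last))
  if [ "$elapsed" -ge "$gap" ]; then
    # Cooldown has passed: the request goes through immediately.
    echo "proceed"
  elif [ "$wait_mode" = "wait" ]; then
    # With --wait: hold the request for the remaining seconds.
    echo "wait $((gap - elapsed))"
  else
    # Without --wait: the request is rejected with an error.
    echo "error"
  fi
}

cooldown_decision 100 110 30 nowait
cooldown_decision 100 110 30 wait
cooldown_decision 100 130 30 nowait
```

Under this model, a request arriving 10 seconds after the previous one with a 30-second gap either fails (no --wait) or is delayed by the remaining 20 seconds (--wait).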
If you receive a security warning from GitHub or your Copilot access is temporarily suspended, you are likely sending too many automated requests too quickly.
Steps to take:
  1. Stop the proxy and wait for any suspension to lift.
  2. Reduce request frequency using --rate-limit (for example, --rate-limit 60).
  3. Review GitHub’s Acceptable Use Policies and GitHub Copilot Terms.
  4. Avoid running bulk or parallel automated workloads through the proxy.
The default GPT tokenizer used for /v1/messages/count_tokens underestimates actual Claude token usage. This causes tools like Claude Code to compact too late and can produce context-limit errors.
For exact counts, provide an Anthropic API key. Set it in config.json:
config.json
{
  "anthropicApiKey": "sk-ant-..."
}
Or via environment variable:
ANTHROPIC_API_KEY=sk-ant-...
When the key is present, the proxy forwards Claude token counting requests to Anthropic’s /v1/messages/count_tokens endpoint (which is free to call). Non-Claude models and failures fall back to GPT tokenizer estimation automatically.
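Before restarting the proxy, it can help to confirm that a key is actually visible in one of the two places described above. The sketch below is an illustration only: the function name anthropic_key_source is hypothetical, the path to config.json must be supplied by you (the debug command reports file paths), and the order in which the two sources are checked here says nothing about which one the proxy prefers.

```shell
#!/bin/sh
# Hypothetical helper: report where an Anthropic API key can be found.
# $1 is the path to config.json (an assumption; pass your real path).
# Prints "env", "config", or "none".
anthropic_key_source() {
  config_path="$1"
  if [ -n "${ANTHROPIC_API_KEY:-}" ]; then
    echo "env"
  elif [ -f "$config_path" ] &&
       grep -q '"anthropicApiKey"[[:space:]]*:[[:space:]]*"[^"]' "$config_path"; then
    # A non-empty string value is present for anthropicApiKey.
    echo "config"
  else
    echo "none"
  fi
}

# Usage: anthropic_key_source /path/to/config.json
```

The grep pattern is a rough text match rather than a JSON parser, so treat a "none" result on an unusually formatted file as a prompt to inspect the file by hand.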
If requests you expect to be free are consuming premium quota, check two settings in config.json:
  1. Confirm compactUseSmallModel is true (the default). When enabled, compact and background requests from Claude Code or OpenCode are routed to smallModel instead of your premium model.
  2. Confirm smallModel points to a free-tier model such as gpt-5-mini.
config.json
{
  "smallModel": "gpt-5-mini",
  "compactUseSmallModel": true
}
You can verify which model handled each request using the Admin UI Requests view and inspecting the upstream_model field.
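The two config.json checks above can be sketched as a quick text scan. This is a hedged illustration: the function name check_small_model is hypothetical, the config path is supplied by you, and the scan uses grep and sed rather than a real JSON parser. Because compactUseSmallModel defaults to true, only an explicit false is flagged.

```shell
#!/bin/sh
# Hypothetical helper: inspect the two settings discussed above.
# $1 is the path to config.json (an assumption; pass your real path).
check_small_model() {
  config_path="$1"
  # compactUseSmallModel defaults to true, so a missing key is fine;
  # only an explicit false routes compact requests to the premium model.
  if grep -q '"compactUseSmallModel"[[:space:]]*:[[:space:]]*false' "$config_path"; then
    echo "compactUseSmallModel is false"
  fi
  # Extract the string value of smallModel, if any.
  small_model=$(sed -n 's/.*"smallModel"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$config_path" | head -n 1)
  echo "smallModel=${small_model:-<not set>}"
}

# Usage: check_small_model /path/to/config.json
```

Whether the reported smallModel is a free-tier model still has to be checked against your Copilot plan; the script only shows what is configured.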
For payload-level or stream-level diagnostics, set logLevel to debug in config.json:
config.json
{
  "logLevel": "debug"
}
The proxy writes detailed logs to logs/*.log under the app data directory (~/.local/share/copilot-api/ on Linux/macOS). After adding this setting, restart the proxy for it to take effect.
--verbose on the command line does not enable debug-level file logging. You must set logLevel in config.json explicitly.
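Once debug logging is enabled, the most recently written log file is usually the one worth reading. A minimal sketch for locating it follows; the function name latest_log is hypothetical, and the default directory is the Linux/macOS app data path mentioned above (pass a different directory as the first argument if yours differs).

```shell
#!/bin/sh
# Hypothetical helper: print the most recently modified *.log file under
# <data_dir>/logs. Defaults to the Linux/macOS app data directory named
# in this guide. Prints nothing if no log files exist.
latest_log() {
  data_dir="${1:-$HOME/.local/share/copilot-api}"
  ls -t "$data_dir"/logs/*.log 2>/dev/null | head -n 1
}

# Usage (commented out so the block runs standalone):
# tail -f "$(latest_log)"
```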