Authentication errors
Run the debug command to check your current authentication status. If the output shows missing or expired tokens, re-run the auth flow. This opens the GitHub OAuth device flow and refreshes your stored credentials. After authenticating, start the proxy again and retry.
403 from Admin UI or Admin API
A 403 Forbidden response means the request came from a non-loopback address and ADMIN_TOKEN is not set on the server.
The Admin UI is restricted to localhost, 127.0.0.1, and ::1 by default. For remote access, set the ADMIN_TOKEN environment variable when starting the proxy. Then include the token in every Admin API request.
See Monitor usage with the Admin UI for full access control details.
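As a minimal sketch of the remote-access setup, the snippet below generates a random value and exports it as ADMIN_TOKEN. The token format is an assumption made for illustration (any hard-to-guess string should serve), and the proxy start command is deliberately left out because it is not shown here.

```shell
# Generate a random 64-character hex token from 32 random bytes.
# Assumption: the proxy accepts any string as ADMIN_TOKEN; hex is just a convenient choice.
ADMIN_TOKEN="$(head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n')"
export ADMIN_TOKEN

# Confirm the variable is set without printing the secret itself.
echo "ADMIN_TOKEN is set (${#ADMIN_TOKEN} characters)"

# Start the proxy from this same shell so it inherits ADMIN_TOKEN.
```

Keep the value somewhere safe: remote Admin API requests need to send the same token.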
401 from Admin UI or Admin API
A 401 Unauthorized response means ADMIN_TOKEN is configured on the server but your request did not include the token.
Pass the token using the x-admin-token header or Authorization: Bearer. The UI stores the token in sessionStorage and sends it automatically once you enter it in the Admin token dialog in the top-right corner.
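The two header forms can be sketched with a small shell helper. The function name admin_auth_header and the ADMIN_URL placeholder are hypothetical, introduced only for this example; only the header names come from the text above.

```shell
# Hypothetical helper: print the auth header for an Admin API request.
# Second argument selects the style: "bearer" or the default x-admin-token.
admin_auth_header() {
  token="$1"
  style="${2:-x-admin-token}"
  case "$style" in
    bearer) printf 'Authorization: Bearer %s\n' "$token" ;;
    *)      printf 'x-admin-token: %s\n' "$token" ;;
  esac
}

# Usage sketch. ADMIN_URL is a placeholder for your proxy's Admin API URL:
#   curl -H "$(admin_auth_header "$ADMIN_TOKEN")" "$ADMIN_URL"
#   curl -H "$(admin_auth_header "$ADMIN_TOKEN" bearer)" "$ADMIN_URL"
```

Either header should be sufficient on its own, so pick one style and use it consistently.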
"Prompt token count exceeds limit" errors with Claude Code
Claude Code compacts conversation history based on token counts reported by the proxy. By default, the proxy estimates Claude token counts using the GPT o200k_base tokenizer with a multiplier. This consistently underestimates actual Claude usage, causing Claude Code to compact too late and hit the context limit.
Fix this by configuring an Anthropic API key. The proxy then forwards token counting requests for Claude models to Anthropic’s real /v1/messages/count_tokens endpoint, which returns exact counts. The token counting endpoint is free; you only need a minimum $5 credit balance on your Anthropic account to activate API access.
Add the key to config.json, or set it as an environment variable. See the configuration reference for setup steps.
Rate limit errors
If you are receiving rate limit errors from the proxy or from GitHub Copilot, use the --rate-limit and --wait flags to pace your requests.
See Control request rate limits in Copilot API for more options.
GitHub security warning or account suspension
If you receive a security warning from GitHub or your Copilot access is temporarily suspended, you are likely sending too many automated requests too quickly.
Steps to take:
- Stop the proxy and wait for any suspension to lift.
- Reduce request frequency using --rate-limit (for example, --rate-limit 60).
- Review GitHub’s Acceptable Use Policies and GitHub Copilot Terms.
- Avoid running bulk or parallel automated workloads through the proxy.
Token counting is inaccurate for Claude models
The default GPT tokenizer used for /v1/messages/count_tokens underestimates actual Claude token usage. This causes tools like Claude Code to compact too late and can produce context-limit errors.
For exact counts, provide an Anthropic API key. Set it in config.json or via environment variable.
When the key is present, the proxy forwards Claude token counting requests to Anthropic’s /v1/messages/count_tokens endpoint (which is free to call). Non-Claude models and failures fall back to GPT tokenizer estimation automatically.
Requests are using premium quota unexpectedly
Enable detailed diagnostic logs
For payload-level or stream-level diagnostics, set logLevel to debug in config.json. The proxy writes detailed logs to logs/*.log under the app data directory (~/.local/share/copilot-api/ on Linux/macOS). After adding this setting, restart the proxy for it to take effect.
Note: --verbose on the command line does not enable debug-level file logging. You must set logLevel in config.json explicitly.
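Once debug logging is on, a quick way to inspect the output is to tail the newest log file. The sketch below assumes the Linux/macOS log location given above; the helper name latest_log is hypothetical, and log file naming is not assumed beyond the .log extension.

```shell
# Hypothetical helper: print the most recently modified .log file in a directory.
# Defaults to the Linux/macOS log location described above.
latest_log() {
  dir="${1:-$HOME/.local/share/copilot-api/logs}"
  # ls -t sorts newest first; errors are silenced when no logs exist yet.
  ls -t "$dir"/*.log 2>/dev/null | head -n 1 || true
}

log_file="$(latest_log)"
if [ -n "$log_file" ]; then
  # Show the last 50 lines of the newest log.
  tail -n 50 "$log_file"
else
  echo "No log files found yet" >&2
fi
```

If no files appear, check that logLevel is set in config.json and that the proxy was restarted afterwards.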