These endpoints mirror the OpenAI API so any tool, SDK, or library built for OpenAI works with Copilot API without modification. Point your existing OpenAI client at http://localhost:4141 and your requests go to GitHub Copilot instead.

Base URL

http://localhost:4141
If you started the server with a custom --port, replace 4141 in the base URL with that port.

Authentication

When you have API keys configured in auth.apiKeys, pass one of the following headers with every request:
Header         Format
x-api-key      x-api-key: <key>
Authorization  Authorization: Bearer <key>
If auth.apiKeys is empty, the server accepts requests without authentication.
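
As a minimal sketch of the two header options above, the Python helper below builds the headers for a request. The function name `auth_headers` and its `scheme` argument are hypothetical and exist only for this illustration; the header names and formats come from the table above.

```python
def auth_headers(api_key=None, scheme="x-api-key"):
    """Build request headers for the local server.

    api_key: one of the keys listed in auth.apiKeys, or None when that
             list is empty and the server accepts unauthenticated requests.
    scheme:  "x-api-key" or "bearer" (hypothetical names for this sketch).
    """
    headers = {"Content-Type": "application/json"}
    if not api_key:
        # No key configured: send no authentication header at all.
        return headers
    if scheme == "x-api-key":
        headers["x-api-key"] = api_key
    elif scheme == "bearer":
        headers["Authorization"] = f"Bearer {api_key}"
    else:
        raise ValueError(f"unknown scheme: {scheme!r}")
    return headers
```

Either header form is enough on its own, so a client only needs to pick one and send it with every request.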

POST /v1/chat/completions

Creates a model response for a chat conversation in OpenAI Chat Completions format. This endpoint is compatible with any tool that uses the OpenAI SDK or calls the OpenAI API directly.
curl http://localhost:4141/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": true
  }'
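
With "stream": true, a client has to reassemble the reply from incremental chunks. The sketch below assumes the server relays the standard OpenAI streaming format (server-sent `data:` lines carrying JSON chunks, ending with `data: [DONE]`), which this page implies but does not show. The function name `iter_stream_text` is hypothetical.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from an iterable of streamed response lines.

    Assumes OpenAI-style server-sent events: each payload line starts with
    "data:" and the stream ends with "data: [DONE]".
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            # The first chunk usually carries only the role, so content may be absent.
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text
```

Feeding it the decoded lines of the HTTP response body and joining the fragments gives the full assistant message.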

POST /v1/responses

Creates a model response in OpenAI Responses API format, OpenAI’s most advanced interface for generating model responses. Claude Code and OpenCode use this endpoint for complex agent workflows, including multi-turn tool use and session continuations. When useResponsesApiWebSearch is enabled in your config.json (the default), requests that include a web_search tool are forwarded upstream with the tool intact. Set useResponsesApiWebSearch: false to strip web search tools from requests before they are sent upstream.
curl http://localhost:4141/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4",
    "input": [{"role": "user", "content": "Explain async/await in JavaScript."}]
  }'
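
To make the effect of useResponsesApiWebSearch: false concrete, the Python sketch below removes web search tools from a Responses payload on the client side. It illustrates the described outcome only and is not the server's implementation. The function name `strip_web_search_tools`, the exact match on the tool's `type` field, and the choice to drop an empty `tools` list are all assumptions.

```python
import copy


def strip_web_search_tools(payload):
    """Return a copy of a Responses API payload without web_search tools.

    Illustration only: matching on type == "web_search" and removing an
    empty "tools" list are assumptions, not confirmed server behavior.
    """
    cleaned = copy.deepcopy(payload)  # leave the caller's payload untouched
    remaining = [
        tool for tool in cleaned.get("tools", [])
        if tool.get("type") != "web_search"
    ]
    if remaining:
        cleaned["tools"] = remaining
    else:
        cleaned.pop("tools", None)
    return cleaned
```

Other tools in the list, such as function tools, pass through unchanged.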

GET /v1/models

Lists all models available from your Copilot account in OpenAI format.
curl http://localhost:4141/v1/models
Example response:
{
  "object": "list",
  "data": [
    { "id": "gpt-5.4", "object": "model" },
    { "id": "gpt-5-mini", "object": "model" },
    { "id": "claude-opus-4-6", "object": "model" }
  ]
}
Actual model availability depends on your GitHub Copilot subscription tier. The list returned reflects what your account can access at the time of the request.
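
Because the list varies by subscription, a client may want to check it before choosing a model. The sketch below parses a response shaped like the example above and picks the first available model from a preference list. The helper names `model_ids` and `pick_model` are hypothetical.

```python
import json


def model_ids(response_body):
    """Extract model ids from a GET /v1/models response body (JSON text)."""
    payload = json.loads(response_body)
    return [
        entry["id"]
        for entry in payload.get("data", [])
        if entry.get("object") == "model"
    ]


def pick_model(available, preferred):
    """Return the first preferred id that is available, or None."""
    for model in preferred:
        if model in available:
            return model
    return None
```

Returning None rather than raising lets the caller decide whether to fall back to another model or report the problem.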

POST /v1/embeddings

Creates an embedding vector representing the input text.
curl http://localhost:4141/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-3-small",
    "input": "The quick brown fox jumps over the lazy dog"
  }'
Example response:
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.0023, -0.0098, ...]
    }
  ],
  "model": "text-embedding-3-small",
  "usage": {
    "prompt_tokens": 9,
    "total_tokens": 9
  }
}
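
A common next step is to compare embeddings. The sketch below pulls the vectors out of a response shaped like the example above, ordered by their `index` field, and computes cosine similarity between two of them. The helper names are hypothetical, and cosine similarity is just one common choice of metric rather than something this API prescribes.

```python
import json
import math


def embedding_vectors(response_body):
    """Return the embedding vectors from a response body, ordered by index."""
    payload = json.loads(response_body)
    items = sorted(payload["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # similarity is undefined for a zero vector; report 0
    return dot / (norm_a * norm_b)
```

Sorting by `index` keeps each vector aligned with the position of its input.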