API Documentation

Overview

kimrel.com provides an independent, OpenAI- and Anthropic-compatible API interface for accessing Kimi K2 and related models. Kimi K2 is developed by Moonshot AI, and kimrel.com is not affiliated with, endorsed by, or sponsored by Moonshot AI.

Base URL

https://kimrel.com/api

Supported Protocols

  • REST API over HTTPS
  • JSON request and response bodies
  • UTF-8 character encoding
  • CORS support for browser-based applications

Quick Start

Get started with the kimrel.com API in three steps:

  1. Create an account and receive 10 free credits
  2. Generate an API key from your dashboard
  3. Make your first request (most models cost 1 credit per request; kimi-k2.5 costs 2 credits and kimi-k2.6 costs 3 credits)

Example Request

curl https://kimrel.com/api/v1/chat/completions \
  -H "Authorization: Bearer $KIMI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kimi-k2-0905",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

AI Coding Tools

kimrel.com exposes both OpenAI-compatible (chat completions) and Claude-compatible (messages) endpoints, so you can plug supported models into popular AI coding assistants with minimal changes.

Model Support: kimi-k2, kimi-k2-0905, kimi-k2-thinking, kimi-k2.5, and kimi-k2.6 are exposed through kimrel.com. Kimi K2 is developed by Moonshot AI. kimrel.com is an independent interface and is not affiliated with, endorsed by, or sponsored by Moonshot AI. kimi-k2.6 is the recommended route for text + image understanding, thinking mode, and tool-enabled workflows; kimi-k2.5 remains a strong multimodal option, and kimi-k2 is still the faster route for simpler Python/JavaScript workflows.

Claude Code

Claude Code reads provider settings from ~/.claude/settings.json. Add or update the env section:

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://kimrel.com/api",
    "ANTHROPIC_AUTH_TOKEN": "<KIMREL_API_KEY>",
    "API_TIMEOUT_MS": "3000000",
    "CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": "1",
    "ANTHROPIC_MODEL": "kimi-k2.5",
    "ANTHROPIC_SMALL_FAST_MODEL": "kimi-k2",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "kimi-k2.5",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "kimi-k2.5",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "kimi-k2"
  }
}

Restart Claude Code after saving. Requests will be routed to https://kimrel.com/api/v1/messages.

Codex

Install Codex CLI:

npm i -g @openai/codex

Add a kimrel.com provider to ~/.codex/config.toml:

[model_providers.kimi-k2]
name = "kimrel.com Chat Completions API"
base_url = "https://kimrel.com/api/v1"
env_key = "KIMREL_API_KEY"
wire_api = "chat"
requires_openai_auth = false
request_max_retries = 4
stream_max_retries = 10
stream_idle_timeout_ms = 300000

[profiles.k2]
model = "kimi-k2"
model_provider = "kimi-k2"

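# The profile name contains a dot, so it must be quoted; unquoted,
# TOML would parse it as a table named "5" nested inside profiles.k2.
[profiles."k2.5"]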
model = "kimi-k2.5"
model_provider = "kimi-k2"

Set your API key and run Codex:

export KIMREL_API_KEY="<YOUR_API_KEY>"

codex --profile k2.5

Authentication

API Keys

Authentication is performed using API keys. Include your API key in the request header:

Authorization: Bearer YOUR_API_KEY

Or for Anthropic-compatible endpoints:

X-API-Key: YOUR_API_KEY

Authentication Methods

Method        Header         Format               Endpoints
Bearer Token  Authorization  Bearer YOUR_API_KEY  /v1/chat/completions
API Key       X-API-Key      YOUR_API_KEY         /v1/messages
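
A minimal sketch of the header choice in the table above. The helper name auth_headers is hypothetical, and treating every path other than /v1/messages as a Bearer-token endpoint is an assumption; the table only lists /v1/chat/completions.

```python
def auth_headers(endpoint, api_key):
    """Return request headers for a kimrel.com endpoint path."""
    headers = {"Content-Type": "application/json"}
    if endpoint.rstrip("/").endswith("/v1/messages"):
        # Anthropic-compatible endpoint: the key goes in X-API-Key
        headers["X-API-Key"] = api_key
    else:
        # Assumption: all other endpoints accept a Bearer token
        headers["Authorization"] = f"Bearer {api_key}"
    return headers


print(auth_headers("/v1/messages", "YOUR_API_KEY"))
print(auth_headers("/v1/chat/completions", "YOUR_API_KEY"))
```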

API Reference

List Models

List all available models that can be used with the API.

List Available Models

GET /v1/models

Returns a list of models available for use with the API.

Response Format

{
  "object": "list",
  "data": [
    {
      "id": "kimi-k2",
      "object": "model",
      "created": 1735785600,
      "owned_by": "moonshot-ai",
      "permission": [...],
      "root": "kimi-k2",
      "parent": null
    },
    {
      "id": "kimi-k2-0905",
      "object": "model",
      "created": 1735785600,
      "owned_by": "moonshot-ai",
      "permission": [...],
      "root": "kimi-k2-0905",
      "parent": null
    },
    {
      "id": "kimi-k2-thinking",
      "object": "model",
      "created": 1735785600,
      "owned_by": "moonshot-ai",
      "permission": [...],
      "root": "kimi-k2-thinking",
      "parent": null
    }
  ]
}

Response Fields

Field            Type    Description
object           string  Always list
data             array   List of available models
data[].id        string  Model identifier to use in API requests
data[].object    string  Always model
data[].owned_by  string  Organization that owns the model
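
As an illustration of the fields above, the following sketch pulls the model identifiers out of an already-decoded /v1/models response. The function name model_ids is hypothetical, and the sample payload is trimmed to the documented fields.

```python
def model_ids(payload):
    """Return the id of every model entry in a /v1/models response."""
    if payload.get("object") != "list":
        raise ValueError("expected a list object")
    return [
        entry["id"]
        for entry in payload.get("data", [])
        if entry.get("object") == "model"
    ]


sample = {
    "object": "list",
    "data": [
        {"id": "kimi-k2", "object": "model", "owned_by": "moonshot-ai"},
        {"id": "kimi-k2-0905", "object": "model", "owned_by": "moonshot-ai"},
    ],
}
print(model_ids(sample))
```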

Chat Completions

The Chat Completions API generates model responses for conversations. This endpoint is compatible with OpenAI's API format.

Create Completion

POST /v1/chat/completions

Generates a model response for the given conversation.

Request Format

{
  "model": "kimi-k2-0905",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user", 
      "content": "Explain quantum computing"
    }
  ],
  "temperature": 0.7,
  "max_tokens": 2048,
  "top_p": 1.0,
  "frequency_penalty": 0,
  "presence_penalty": 0,
  "stream": false,
  "n": 1
}

Parameters

Parameter          Type          Required  Default  Description
model              string        Yes       -        Model identifier, for example kimi-k2 (see Available Models)
messages           array         Yes       -        Input messages. Each message has a role and content
temperature        number        No        0.6      Sampling temperature between 0 and 2. Lower values make output more deterministic
max_tokens         integer       No        1024     Maximum tokens to generate. Model maximum is 128000
top_p              number        No        1.0      Nucleus sampling threshold. Alternative to temperature
frequency_penalty  number        No        0        Penalize repeated tokens. Range: -2.0 to 2.0
presence_penalty   number        No        0        Penalize tokens based on presence. Range: -2.0 to 2.0
stream             boolean       No        false    Stream responses incrementally
n                  integer       No        1        Number of completions to generate
stop               string/array  No        null     Stop sequences (up to 4)
user               string        No        null     Unique identifier for end-user tracking

Message Object

Field    Type          Description
role     string        One of: system, user, assistant
content  string/array  Message content. An array of content parts is used for image input (see Kimi K2.6 Quick Start)

Response Format

{
  "id": "chatcmpl-9d4c2f68-5e3a-4b2f-a3c9-7d8e6f5c4b3a",
  "object": "chat.completion",
  "created": 1709125200,
  "model": "kimi-k2-0905",
  "system_fingerprint": "fp_a7c4d3e2",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Quantum computing leverages quantum mechanical phenomena..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 189,
    "total_tokens": 214
  }
}

Response Fields

Field    Type     Description
id       string   Unique request identifier
object   string   Object type: chat.completion
created  integer  Unix timestamp
model    string   Model used
choices  array    Generated completions
usage    object   Token usage statistics

Finish Reasons

Value   Description
stop    Natural end of message or stop sequence reached
length  Maximum token limit reached
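
A short sketch of reading a decoded chat completion: it returns the first choice's text and flags truncation when finish_reason is length. The helper name extract_reply is hypothetical.

```python
def extract_reply(response):
    """Return (text, truncated) for the first choice of a chat completion."""
    choice = response["choices"][0]
    text = choice["message"]["content"]
    # "length" means the token limit cut the answer short
    truncated = choice.get("finish_reason") == "length"
    return text, truncated


sample = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Quantum computing..."},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 25, "completion_tokens": 189, "total_tokens": 214},
}
text, truncated = extract_reply(sample)
print(text, truncated)
```

When truncated is true, a caller would typically raise max_tokens or ask the model to continue.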

Streaming

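When stream is true, the response is delivered as server-sent events. Each data line carries one JSON chunk, and the stream ends with data: [DONE]: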

data: {"id":"chatcmpl-...","choices":[{"delta":{"content":"Hello"},"index":0}]}

data: {"id":"chatcmpl-...","choices":[{"delta":{"content":" there"},"index":0}]}

data: [DONE]
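
The event format above can be consumed with a small parser. This sketch assumes each event arrives as a single data: line, as in the sample, and the function name collect_stream_text is hypothetical.

```python
import json


def collect_stream_text(lines):
    """Join the delta.content pieces from an iterable of SSE lines."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            parts.append(choice.get("delta", {}).get("content") or "")
    return "".join(parts)


events = [
    'data: {"id":"chatcmpl-1","choices":[{"delta":{"content":"Hello"},"index":0}]}',
    "",
    'data: {"id":"chatcmpl-1","choices":[{"delta":{"content":" there"},"index":0}]}',
    "",
    "data: [DONE]",
]
print(collect_stream_text(events))  # prints "Hello there"
```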

Messages

The Messages API provides Anthropic-compatible message generation.

Create Message

POST /v1/messages

Creates a model response using the Messages format.

Request Format

{
  "model": "kimi-k2-0905",
  "messages": [
    {
      "role": "user",
      "content": "What is the capital of France?"
    }
  ],
  "max_tokens": 1024,
  "system": "You are a knowledgeable geography assistant.",
  "temperature": 0.7,
  "top_p": 1.0,
  "stop_sequences": ["\n\nHuman:"]
}

Parameters

Parameter       Type     Required  Default  Description
model           string   Yes       -        Model identifier
messages        array    Yes       -        Conversation messages (user/assistant only)
max_tokens      integer  Yes       -        Maximum tokens to generate
system          string   No        null     System prompt for behavior guidance
temperature     number   No        0.6      Sampling temperature (0-1)
top_p           number   No        1.0      Nucleus sampling threshold
stop_sequences  array    No        null     Stop generation sequences (max 4)
stream          boolean  No        false    Enable streaming responses
metadata        object   No        null     Request metadata

Response Format

{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "The capital of France is Paris."
    }
  ],
  "model": "kimi-k2-0905",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 15,
    "output_tokens": 9
  }
}

Response Fields

Field        Type    Description
id           string  Unique message identifier
type         string  Object type: message
role         string  Always assistant
content      array   Message content blocks
model        string  Model used
stop_reason  string  Why generation stopped
usage        object  Token usage
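
Because content is an array of blocks rather than a single string, reading the reply takes one extra step. A minimal sketch, with the hypothetical helper name message_text:

```python
def message_text(response):
    """Concatenate the text blocks of a Messages API response."""
    return "".join(
        block.get("text", "")
        for block in response.get("content", [])
        if block.get("type") == "text"
    )


sample = {
    "type": "message",
    "role": "assistant",
    "content": [{"type": "text", "text": "The capital of France is Paris."}],
    "stop_reason": "end_turn",
}
print(message_text(sample))  # prints "The capital of France is Paris."
```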

System Prompts

System prompts in the Messages API are specified separately:

{
  "system": "You are a helpful assistant.",
  "messages": [
    {"role": "user", "content": "Hello"}
  ],
  "max_tokens": 1024
}
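
When porting a Chat Completions conversation to this endpoint, the system message has to move out of the messages array. The sketch below shows one way to do that; the function name to_messages_payload is hypothetical, and joining multiple system messages with blank lines is an assumption.

```python
def to_messages_payload(model, chat_messages, max_tokens=1024):
    """Convert OpenAI-style chat messages into a /v1/messages request body."""
    system_parts = [m["content"] for m in chat_messages if m["role"] == "system"]
    payload = {
        "model": model,
        "max_tokens": max_tokens,  # required by the Messages format
        "messages": [m for m in chat_messages if m["role"] != "system"],
    }
    if system_parts:
        # Assumption: several system messages are merged into one prompt
        payload["system"] = "\n\n".join(system_parts)
    return payload


chat = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello"},
]
print(to_messages_payload("kimi-k2-0905", chat))
```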

Models

The model IDs below are exposed by kimrel.com for supported model routes. Kimi K2 family models are developed by Moonshot AI; kimrel.com provides an independent access layer rather than the model developer's official product branding.

Available Models

Model ID          Context Window  Description
kimi-k2           128,000 tokens  Primary model for general chat completions
kimi-k2-0905      256,000 tokens  Enhanced model with extended context window
kimi-k2-thinking  256,000 tokens  Specialized model for deep reasoning, mathematical proofs, research analysis, and multi-step problem solving
kimi-k2.5         256,000 tokens  Native multimodal agentic MoE model (1T total / 32B active) with strong tool use and reasoning
kimi-k2.6         256,000 tokens  Newest K2.6 route for text + image input, thinking mode, and tool-enabled workflows on kimrel.com

Model Selection

Choose the appropriate model based on your use case:

  • kimi-k2: Best for general conversational AI, content generation, and standard tasks
  • kimi-k2-0905: Ideal for tasks requiring longer context (up to 256K tokens), such as analyzing entire documents or long conversations
  • kimi-k2-thinking: Optimized for complex reasoning tasks:
    • Mathematical proofs and competition-level math problems
    • Research analysis and literature review
    • Multi-step problem solving requiring logical reasoning
    • Advanced tool orchestration (200-300 sequential tool calls)
    • Frontend development with complex UI requirements
    • Agentic search tasks requiring autonomous navigation
  • kimi-k2.5: Native multimodal, agentic model optimized for tool use and reasoning with a 256K context window
  • kimi-k2.6: Recommended when you need the newest K2.6 route on kimrel.com with text + image understanding, thinking mode, and tool calling. This service currently supports image inputs for K2.6, but does not support video inputs.

Kimi-K2.5 key specs: MoE architecture (1T total parameters, 32B active), 256K context, and a MoonViT vision encoder.

The Thinking model shows its reasoning process step-by-step, making it ideal for educational contexts and applications where transparency is important.

Kimi K2.6 Quick Start

kimi-k2.6 is the newest K2.6 route exposed by kimrel.com. It is the best choice on this service when you want stronger reasoning, image understanding, and tool-enabled workflows without changing your existing OpenAI-compatible request structure. On kimrel.com, K2.6 supports a 256K context window, text + image inputs, tool calling, and thinking mode on the OpenAI-compatible /v1/chat/completions endpoint. For this service, video inputs are not supported.

When you send images to kimi-k2.6, you can either provide a data:image/...;base64,... URL directly or provide a remote http(s) image URL. kimrel.com will fetch the remote image, validate it, convert it to base64, and then forward the request upstream. This makes it easier to integrate screenshot understanding and image reasoning from existing pipelines without rewriting your client code.

OpenAI-compatible K2.6 with thinking

{
  "model": "kimi-k2.6",
  "messages": [
    {
      "role": "user",
      "content": "Think carefully about the trade-offs, then summarize the best migration plan."
    }
  ],
  "thinking": { "type": "enabled" },
  "max_completion_tokens": 2048
}

OpenAI-compatible K2.6 image understanding

{
  "model": "kimi-k2.6",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "image_url",
          "image_url": {
            "url": "https://example.com/demo-ui.png"
          }
        },
        {
          "type": "text",
          "text": "Describe the image and list the main UI elements."
        }
      ]
    }
  ],
  "max_completion_tokens": 2048
}

For kimi-k2.6, kimrel.com accepts remote http(s) image URLs and converts them to base64 before forwarding the request. If you already have a data:image/...;base64,... URL, you can send it directly. Video URLs are rejected by this service.

Image input limits on kimrel.com:

  • Maximum original image size: 6MB
  • Maximum encoded image payload after conversion: 8MB
  • Supported image types: PNG, JPEG/JPG, WEBP, GIF
  • Remote image URLs must use http(s) and are fetched server-side before forwarding
  • Video inputs are not supported
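
A client can check these limits before sending a local image. The sketch below builds a data URL and validates it first. It assumes 1MB means 1024 * 1024 bytes and that the encoded limit applies to the base64 string length; neither is stated above, and the helper name image_to_data_url is hypothetical.

```python
import base64

MAX_ORIGINAL_BYTES = 6 * 1024 * 1024  # assumption: 1MB = 1024 * 1024 bytes
MAX_ENCODED_BYTES = 8 * 1024 * 1024
SUPPORTED_TYPES = {"image/png", "image/jpeg", "image/webp", "image/gif"}


def image_to_data_url(data, mime_type):
    """Validate raw image bytes and return a data:image/...;base64,... URL."""
    if mime_type not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported image type: {mime_type}")
    if len(data) > MAX_ORIGINAL_BYTES:
        raise ValueError("image exceeds the 6MB original size limit")
    # base64 grows data by about 4/3, so 6MB of input encodes to roughly 8MB
    encoded = base64.b64encode(data).decode("ascii")
    if len(encoded) > MAX_ENCODED_BYTES:
        raise ValueError("encoded image exceeds the 8MB payload limit")
    return f"data:{mime_type};base64,{encoded}"


print(image_to_data_url(b"abc", "image/png"))  # data:image/png;base64,YWJj
```

The returned string can be used as the url value of an image_url content part.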

Anthropic-compatible K2.6 image input

{
  "model": "kimi-k2.6",
  "max_tokens": 2048,
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "image",
          "source": {
            "type": "url",
            "url": "https://example.com/demo-ui.png"
          }
        },
        {
          "type": "text",
          "text": "Describe the image content."
        }
      ]
    }
  ]
}

The Anthropic-compatible /v1/messages endpoint also accepts base64 image blocks. Remote image URL fetching on this endpoint is currently limited to kimi-k2.6.

K2.6 tool calling example

{
  "model": "kimi-k2.6",
  "messages": [
    {
      "role": "user",
      "content": "Use the weather tool to compare Beijing and Shanghai."
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get current weather for a city",
        "parameters": {
          "type": "object",
          "properties": {
            "city": { "type": "string" }
          },
          "required": ["city"]
        }
      }
    }
  ],
  "tool_choice": "auto",
  "max_completion_tokens": 1024
}
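
The request above only declares the tool; the client still has to run it when the model asks. The sketch below assumes the response follows the OpenAI tool_calls shape (an id plus a function name and JSON-encoded arguments), which this page does not show. The names run_tool_calls and the stub get_weather are hypothetical.

```python
import json


def run_tool_calls(assistant_message, tools):
    """Execute each requested tool and return the tool-role reply messages."""
    replies = []
    for call in assistant_message.get("tool_calls") or []:
        function = call["function"]
        # Arguments arrive as a JSON string (assumed OpenAI-compatible shape)
        arguments = json.loads(function["arguments"] or "{}")
        output = tools[function["name"]](**arguments)
        replies.append(
            {
                "role": "tool",
                "tool_call_id": call["id"],
                "content": json.dumps(output),
            }
        )
    return replies


def get_weather(city):
    # Hypothetical stub standing in for a real weather lookup
    return {"city": city, "condition": "sunny"}


message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {
            "id": "call_1",
            "type": "function",
            "function": {"name": "get_weather", "arguments": '{"city": "Beijing"}'},
        }
    ],
}
print(run_tool_calls(message, {"get_weather": get_weather}))
```

The returned messages would be appended to the conversation, after the assistant message, before the next request.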

Request Limits

Rate Limits

Rate limits are applied per API key based on credit balance:

Credit Balance  Requests/Minute  Requests/Hour  Requests/Day
1-100           20               600            5,000
101-1,000       60               2,000          20,000
1,001-10,000    200              6,000          50,000
10,000+         500              15,000         100,000
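
For client-side throttling, the tiers above can be encoded as a lookup. This is a sketch only: the function name rate_limits_for is hypothetical, a balance of exactly 10,000 is assumed to fall in the third tier, and a balance below 1 is assumed to have no tier.

```python
# (lowest balance, highest balance, (per minute, per hour, per day))
RATE_TIERS = [
    (1, 100, (20, 600, 5_000)),
    (101, 1_000, (60, 2_000, 20_000)),
    (1_001, 10_000, (200, 6_000, 50_000)),
]
TOP_TIER = (500, 15_000, 100_000)


def rate_limits_for(balance):
    """Return (per_minute, per_hour, per_day) for a credit balance, or None."""
    if balance < 1:
        return None  # assumption: no credits means no tier
    for low, high, limits in RATE_TIERS:
        if low <= balance <= high:
            return limits
    return TOP_TIER


print(rate_limits_for(500))  # (60, 2000, 20000)
```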

Rate limit headers:

X-RateLimit-Limit: 60
X-RateLimit-Remaining: 59
X-RateLimit-Reset: 1709125800
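
These headers let a client pause until the window resets instead of guessing. The sketch assumes X-RateLimit-Reset is a Unix timestamp in seconds, which the sample value suggests but this page does not state. The helper name seconds_until_reset is hypothetical.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how long to wait before the next request, in seconds."""
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0  # quota left in the current window
    # Assumption: the reset header is a Unix timestamp in seconds
    reset_at = int(headers["X-RateLimit-Reset"])
    current = time.time() if now is None else now
    return max(0.0, reset_at - current)


headers = {
    "X-RateLimit-Limit": "60",
    "X-RateLimit-Remaining": "0",
    "X-RateLimit-Reset": "1709125800",
}
print(seconds_until_reset(headers, now=1709125770))  # 30.0
```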

Token Limits

Limit Type             Value
Maximum input tokens   128,000
Maximum output tokens  8,192
Maximum total tokens   128,000

Timeout Settings

Timeout Type        Duration
Connection timeout  30 seconds
Read timeout        600 seconds
Stream timeout      600 seconds

Error Codes

HTTP Status Codes

Status  Meaning
200     Success
400     Bad Request - Invalid parameters
401     Unauthorized - Invalid or missing API key
403     Forbidden - Insufficient credits or permissions
404     Not Found - Invalid endpoint
429     Too Many Requests - Rate limit exceeded
500     Internal Server Error
503     Service Unavailable
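
One way to act on these codes is to separate transient failures from request errors. Retrying only 429, 500, and 503 is a common convention and an assumption here, not a documented kimrel.com policy; the name should_retry is hypothetical.

```python
# Assumption: only rate limiting and server-side failures are transient
RETRYABLE_STATUSES = {429, 500, 503}


def should_retry(status):
    """Return True when a request with this HTTP status is worth retrying."""
    return status in RETRYABLE_STATUSES


for status in (200, 401, 429, 503):
    print(status, should_retry(status))
```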

Error Types

OpenAI Format Errors

{
  "error": {
    "message": "Invalid API key provided",
    "type": "invalid_request_error",
    "code": "invalid_api_key"
  }
}

Error Code               Type                   Description
invalid_api_key          invalid_request_error  API key is invalid or malformed
insufficient_credits     insufficient_quota     Credit balance is insufficient
rate_limit_exceeded      rate_limit_error       Too many requests
invalid_request          invalid_request_error  Request validation failed
model_not_found          invalid_request_error  Specified model doesn't exist
context_length_exceeded  invalid_request_error  Input exceeds context window
encoded_image_too_large  invalid_request_error  Encoded image payload exceeds the service forwarding limit

Anthropic Format Errors

{
  "type": "error",
  "error": {
    "type": "authentication_error",
    "message": "Invalid API key"
  }
}

Error Type             Description
authentication_error   Authentication failed
invalid_request_error  Request validation failed
rate_limit_error       Rate limit exceeded
api_error              Server-side error
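
Since the two endpoint families return differently shaped error bodies, a client that uses both may want one normalized view. A minimal sketch, with the hypothetical helper name parse_error:

```python
def parse_error(body):
    """Normalize an OpenAI- or Anthropic-format error body into one dict."""
    error = body.get("error") or {}
    return {
        # Anthropic-format bodies carry a top-level "type": "error"
        "format": "anthropic" if body.get("type") == "error" else "openai",
        "type": error.get("type"),
        "code": error.get("code"),  # only present in the OpenAI format
        "message": error.get("message"),
    }


openai_body = {
    "error": {
        "message": "Invalid API key provided",
        "type": "invalid_request_error",
        "code": "invalid_api_key",
    }
}
anthropic_body = {
    "type": "error",
    "error": {"type": "authentication_error", "message": "Invalid API key"},
}
print(parse_error(openai_body))
print(parse_error(anthropic_body))
```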

Error Handling

Implement exponential backoff with jitter for retries:

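import random
import time

from openai import RateLimitError  # raised by the OpenAI SDK on HTTP 429


def retry_with_backoff(func, max_retries=3, base_delay=1, max_delay=60):
    """Call func(), retrying on rate-limit errors with exponential backoff."""
    for attempt in range(max_retries):
        try:
            return func()
        except RateLimitError:
            # Out of attempts: surface the error to the caller
            if attempt == max_retries - 1:
                raise

            # Double the delay on each attempt, add up to 1 second of
            # jitter, and cap the result at max_delay
            delay = min(
                base_delay * (2 ** attempt) + random.uniform(0, 1),
                max_delay,
            )
            time.sleep(delay)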

Client Libraries

Python

Installation

pip install openai
# or
pip install anthropic

OpenAI Client

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://kimrel.com/api/v1"
)

# List available models
models = client.models.list()
for model in models.data:
    print(f"Model ID: {model.id}")

# Create chat completion
response = client.chat.completions.create(
    model="kimi-k2",
    messages=[
        {"role": "user", "content": "Hello"}
    ]
)

Anthropic Client

from anthropic import Anthropic

client = Anthropic(
    api_key="YOUR_API_KEY",
    base_url="https://kimrel.com/api"  # the SDK appends /v1/messages itself
)

response = client.messages.create(
    model="kimi-k2",
    messages=[
        {"role": "user", "content": "Hello"}
    ],
    max_tokens=1024
)

Node.js

Installation

npm install openai
# or
npm install @anthropic-ai/sdk

OpenAI Client

import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: process.env.KIMI_API_KEY,
  baseURL: 'https://kimrel.com/api/v1',
});

// List available models
const models = await openai.models.list();
for (const model of models.data) {
  console.log(`Model ID: ${model.id}`);
}

// Create chat completion
const response = await openai.chat.completions.create({
  model: 'kimi-k2-0905',
  messages: [{ role: 'user', content: 'Hello' }],
});

Anthropic Client

import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic({
  apiKey: process.env.KIMI_API_KEY,
  baseURL: 'https://kimrel.com/api', // the SDK appends /v1/messages itself
});

const response = await anthropic.messages.create({
  model: 'kimi-k2-0905',
  messages: [{ role: 'user', content: 'Hello' }],
  max_tokens: 1024,
});

Go

Installation

go get github.com/sashabaranov/go-openai

Example

package main

import (
    "context"
    "fmt"
    openai "github.com/sashabaranov/go-openai"
)

func main() {
    config := openai.DefaultConfig("YOUR_API_KEY")
    config.BaseURL = "https://kimrel.com/api/v1"
    
    client := openai.NewClientWithConfig(config)
    
    resp, err := client.CreateChatCompletion(
        context.Background(),
        openai.ChatCompletionRequest{
            Model: "kimi-k2",
            Messages: []openai.ChatCompletionMessage{
                {
                    Role:    openai.ChatMessageRoleUser,
                    Content: "Hello",
                },
            },
        },
    )
    
    if err != nil {
        panic(err)
    }
    
    fmt.Println(resp.Choices[0].Message.Content)
}

REST API

Direct HTTP requests without client libraries:

cURL

curl -X POST https://kimrel.com/api/v1/chat/completions \
  -H "Authorization: Bearer $KIMI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kimi-k2",
    "messages": [
      {"role": "user", "content": "Hello"}
    ]
  }'

Python (requests)

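import os

import requests

# Read the key from the environment instead of hard-coding it
api_key = os.environ["KIMI_API_KEY"]

response = requests.post(
    "https://kimrel.com/api/v1/chat/completions",
    headers={
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    },
    json={
        "model": "kimi-k2",
        "messages": [{"role": "user", "content": "Hello"}]
    }
)
print(response.json())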

Node.js (fetch)

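// Read the key from the environment instead of hard-coding it
const apiKey = process.env.KIMI_API_KEY;

const response = await fetch('https://kimrel.com/api/v1/chat/completions', {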
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${apiKey}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'kimi-k2-0905',
    messages: [{ role: 'user', content: 'Hello' }],
  }),
});

Billing

Credit System

API usage is billed through a credit system:

  • Most models: 1 credit per API request
  • kimi-k2.5: 2 credits per API request
  • kimi-k2.6: 3 credits per API request
  • Credits are deducted upon successful completion
  • Failed requests (4xx errors) are not charged
  • Server errors (5xx) are not charged
  • New users receive 10 free credits upon registration
  • Invite rewards:
    • 50 credits when someone registers with your invite code
    • 500 credits when an invited user makes their first payment

Credit Packages

Package     Credits  Price          Per Credit  Validity
Starter     500      $4.99          $0.0099     No expiration
Standard    5,000    $29.99         $0.0060     1 month
Premium     20,000   $59.99         $0.0030     1 month
Enterprise  Custom   Contact sales  Custom      Custom

Usage Tracking

Monitor your usage through:

  1. Response headers: X-Credits-Remaining: 4523
  2. Dashboard: Real-time usage statistics at /my-credits
  3. API endpoint: GET /api/user/credits

Usage data includes:

  • Total credits consumed
  • Credits remaining
  • Usage by day/hour
  • Average tokens per request

Migration Guide

From OpenAI

Migrating from OpenAI API requires minimal changes:

  1. Update base URL:

    # From
    client = OpenAI(api_key="sk-...")
    
    # To
    client = OpenAI(
        api_key="sk-...",
        base_url="https://kimrel.com/api/v1"
    )
    
  2. Update model name:

    # From
    model="gpt-4"
    
    # To
    model="kimi-k2-0905"
    
  3. No other changes are required for standard chat completion requests, which use the same request and response format

From Anthropic

Migrating from Anthropic API:

  1. Update base URL:

    # From
    client = Anthropic(api_key="sk-ant-...")
    
    # To
    client = Anthropic(
        api_key="sk-...",
        base_url="https://kimrel.com/api"
    )
    
  2. Update authentication:

    • Generate an API key from your kimrel.com dashboard
    • Replace Anthropic API key
  3. Model compatibility:

    • Replace the Claude model name with a kimrel.com model ID such as kimi-k2 or kimi-k2-0905

Changelog

2025-11-10

  • Added kimi-k2-thinking model
  • Specialized model for complex reasoning tasks
  • Step-by-step reasoning process display
  • Support for mathematical proofs, research analysis, and multi-step problem solving
  • Advanced tool orchestration capabilities (200-300 sequential tool calls)

2025-09-05

  • 256K context window support
  • kimi-k2-0905 model support

2025-01-30

  • Added Anthropic Messages API compatibility
  • Introduced X-API-Key authentication method
  • Enhanced error response formats

2025-01-15

  • Initial API release
  • OpenAI Chat Completions compatibility
  • 128K context window support
  • Credit-based billing system