
# Chat & LLM

Two chat endpoints: RAG chat that searches your buckets and answers with sources, and an OpenAI-compatible completions proxy for direct LLM access.

## RAG Chat

`POST /v1/chat` searches a bucket for relevant context, then generates an answer with source citations. Supports streaming.

```bash
curl -X POST https://api.schift.io/v1/chat \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $SCHIFT_API_KEY" \
  -d '{
    "bucket_id": "abc123",
    "message": "What are the key findings?",
    "top_k": 5,
    "stream": true
  }'
```
| Field | Type | Default | Description |
|---|---|---|---|
| `bucket_id` | string | — | Bucket to search for context |
| `message` | string | — | User question |
| `history` | ChatMessage[] | `[]` | Previous conversation turns (max 10) |
| `model` | string | `gpt-4.1-nano` | LLM model for generation |
| `top_k` | integer | `5` | Number of sources to retrieve |
| `stream` | boolean | `true` | Enable SSE streaming |
| `system_prompt` | string | — | Custom system prompt (overrides default) |
| `temperature` | float | — | Sampling temperature |
| `max_tokens` | integer | — | Max output tokens |
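Since `history` accepts at most 10 previous turns, clients should trim older turns before each request. A minimal Python sketch of request-body construction; the `build_chat_body` helper is hypothetical, not part of any SDK:

```python
# Hypothetical helper: build a /v1/chat request body, keeping only the
# last 10 history turns (the documented maximum).
MAX_HISTORY_TURNS = 10

def build_chat_body(bucket_id, message, history=None, **options):
    history = (history or [])[-MAX_HISTORY_TURNS:]  # drop oldest turns first
    body = {"bucket_id": bucket_id, "message": message, "history": history}
    body.update(options)  # e.g. top_k, stream, model
    return body

body = build_chat_body(
    "abc123",
    "What are the key findings?",
    history=[{"role": "user", "content": f"turn {i}"} for i in range(25)],
    top_k=5,
    stream=True,
)
print(len(body["history"]))  # 10
```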

Non-streaming responses include `reply`, `sources` (each with `id`, `score`, `text`), and `model`. Streaming emits SSE events: `sources`, then `chunk` deltas, then `done`.
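A streaming client can consume this by splitting the SSE stream on blank lines and dispatching on event names. A minimal Python sketch; the sample payloads below are illustrative, not verbatim server output:

```python
# Minimal SSE parser: each event is a blank-line-separated block of
# "event:" and "data:" fields.
def parse_sse(raw: str):
    events = []
    for block in raw.strip().split("\n\n"):
        event, data = None, []
        for line in block.splitlines():
            if line.startswith("event:"):
                event = line[len("event:"):].strip()
            elif line.startswith("data:"):
                data.append(line[len("data:"):].strip())
        events.append((event, "\n".join(data)))
    return events

# Illustrative stream: sources first, then chunk deltas, then done.
sample = (
    "event: sources\n"
    'data: [{"id": "doc-1", "score": 0.92}]\n\n'
    "event: chunk\n"
    'data: {"delta": "The key finding is..."}\n\n'
    "event: done\n"
    "data: {}\n"
)
for event, data in parse_sse(sample):
    print(event, data)
```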

## Chat Completions (OpenAI-compatible)

`POST /v1/chat/completions` is a drop-in replacement for the OpenAI chat API. It proxies to multiple LLM providers through a single endpoint.

```bash
curl -X POST https://api.schift.io/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $SCHIFT_API_KEY" \
  -d '{
    "model": "gpt-4.1",
    "messages": [{"role": "user", "content": "Explain RAG in one paragraph."}]
  }'
```
| Field | Type | Default | Description |
|---|---|---|---|
| `model` | string | — | LLM model ID (e.g., `gpt-4.1`, `claude-sonnet-4-6`) |
| `messages` | object[] | — | OpenAI-format messages |
| `temperature` | float | — | Sampling temperature |
| `max_tokens` | integer | — | Max output tokens |
| `stream` | boolean | `false` | Enable SSE streaming |
| `stop` | string[] | — | Stop sequences |

Returns the standard OpenAI response format. Use `GET /v1/models` to list all available LLM models.
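Because the endpoint speaks the OpenAI wire format, any HTTP client can call it with a standard OpenAI-style payload. A stdlib-only Python sketch that builds (but does not send) such a request; the API key is a placeholder:

```python
import json
import urllib.request

# OpenAI-format payload, identical to what the OpenAI API itself accepts.
payload = {
    "model": "gpt-4.1",
    "messages": [{"role": "user", "content": "Explain RAG in one paragraph."}],
}

# Only the URL differs from a direct OpenAI call.
req = urllib.request.Request(
    "https://api.schift.io/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer $SCHIFT_API_KEY",  # placeholder key
    },
    method="POST",
)
# urllib.request.urlopen(req) would return the standard OpenAI response.
print(req.get_method())  # POST
```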

## When to use which

Use `/v1/chat` when you want Schift to search your documents and answer with sources. Use `/v1/chat/completions` when you want a plain LLM call without retrieval: it uses the same format as OpenAI and works with any OpenAI-compatible client library.