
Self-hosting

Schift agents can connect to any OpenAI-compatible LLM endpoint. This lets you use local models for development and self-hosted models in production.
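"OpenAI-compatible" means the server accepts the standard chat-completions request at `{baseUrl}/chat/completions`. A minimal sketch of that wire format (the base URL here assumes a local Ollama server; agents build this request for you):

```ts
// The request shape any OpenAI-compatible endpoint must accept at
// POST {baseUrl}/chat/completions
const baseUrl = "http://localhost:11434/v1"; // e.g. a local Ollama server

const request = {
  model: "llama3",
  messages: [{ role: "user", content: "Hello!" }],
};

const endpoint = `${baseUrl}/chat/completions`;
```

If an endpoint accepts this shape, an agent can use it by setting `baseUrl`.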

Install Ollama and pull a model:

```sh
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Pull a model
ollama pull llama3
```

Then point the agent at Ollama's OpenAI-compatible endpoint:

```ts
const agent = new Agent({
  name: "Local Agent",
  instructions: "You are a helpful assistant.",
  model: "llama3",
  baseUrl: "http://localhost:11434/v1",
});

const result = await agent.run("Hello!");
```

No API key needed. Ollama runs locally on your machine.

Serve a model with vLLM:

```sh
# Start vLLM server
python -m vllm.entrypoints.openai.api_server \
  --model mistralai/Mistral-7B-Instruct-v0.3 \
  --port 8000
```

```ts
const agent = new Agent({
  name: "vLLM Agent",
  instructions: "You are a helpful assistant.",
  model: "mistralai/Mistral-7B-Instruct-v0.3",
  baseUrl: "http://localhost:8000/v1",
});
```

LiteLLM provides an OpenAI-compatible proxy for 100+ LLM providers.

```sh
litellm --model ollama/llama3 --port 4000
```

```ts
const agent = new Agent({
  name: "LiteLLM Agent",
  instructions: "You are a helpful assistant.",
  model: "ollama/llama3",
  baseUrl: "http://localhost:4000/v1",
});
```
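To route more than one model, LiteLLM can also be started from a config file instead of CLI flags. A minimal sketch (the model names here are examples, not requirements):

```yaml
# config.yaml — map friendly model names to providers
model_list:
  - model_name: llama3
    litellm_params:
      model: ollama/llama3
  - model_name: claude
    litellm_params:
      model: anthropic/claude-sonnet-4-6
      api_key: os.environ/ANTHROPIC_API_KEY
```

Start the proxy with `litellm --config config.yaml --port 4000`, then set the agent's `model` to one of the `model_name` entries.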

Connect directly to cloud providers without Schift Cloud routing:

```ts
const agent = new Agent({
  name: "OpenAI Direct",
  instructions: "...",
  model: "gpt-4o-mini",
  baseUrl: "https://api.openai.com/v1",
  apiKey: process.env.OPENAI_API_KEY,
});
```
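Other hosted providers that expose OpenAI-compatible endpoints follow the same pattern; only `baseUrl` and `apiKey` change. A few commonly used base URLs (verify against each provider's current docs, as these can change):

```ts
// Common OpenAI-compatible base URLs (check your provider's documentation)
const baseUrls: Record<string, string> = {
  openai: "https://api.openai.com/v1",
  groq: "https://api.groq.com/openai/v1",
  mistral: "https://api.mistral.ai/v1",
  ollama: "http://localhost:11434/v1", // local
  vllm: "http://localhost:8000/v1",    // local, default port
};
```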

Anthropic’s API is not OpenAI-compatible. Use LiteLLM as a proxy:

```sh
litellm --model anthropic/claude-sonnet-4-6 --port 4000
```

```ts
const agent = new Agent({
  model: "anthropic/claude-sonnet-4-6",
  baseUrl: "http://localhost:4000/v1",
});
```

You can use a self-hosted LLM for the agent while still using Schift Cloud for RAG:

```ts
const schift = new Schift({ apiKey: "sch_..." });
const rag = new RAG({ bucket: "docs" }, schift.transport);

const agent = new Agent({
  name: "Hybrid Agent",
  instructions: "...",
  rag, // RAG via Schift Cloud
  model: "llama3", // LLM via Ollama
  baseUrl: "http://localhost:11434/v1",
});
```