
Self-hosting

Schift agents can connect to any OpenAI-compatible LLM endpoint. This lets you use local models for development and self-hosted models in production.
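"OpenAI-compatible" means the server accepts the standard chat-completions request at `{baseUrl}/chat/completions`. A minimal sketch of that wire format (the base URL here assumes a local Ollama server; agents build this request for you):

```ts
// The request shape any OpenAI-compatible endpoint must accept at
// POST {baseUrl}/chat/completions
const baseUrl = "http://localhost:11434/v1"; // e.g. a local Ollama server

const request = {
  model: "llama3",
  messages: [{ role: "user", content: "Hello!" }],
};

const endpoint = `${baseUrl}/chat/completions`;
```

If an endpoint accepts this shape, an agent can use it by setting `baseUrl`.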

Install Ollama and pull a model:

```sh
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Pull a model
ollama pull llama3
```

Then point the agent at Ollama's OpenAI-compatible endpoint:

```ts
const agent = new Agent({
  name: "Local Agent",
  instructions: "You are a helpful assistant.",
  model: "llama3",
  baseUrl: "http://localhost:11434/v1",
});

const result = await agent.run("Hello!");
```

No API key needed. Ollama runs locally on your machine.

Serve a model with vLLM:

```sh
# Start vLLM server
python -m vllm.entrypoints.openai.api_server \
  --model mistralai/Mistral-7B-Instruct-v0.3 \
  --port 8000
```

```ts
const agent = new Agent({
  name: "vLLM Agent",
  instructions: "You are a helpful assistant.",
  model: "mistralai/Mistral-7B-Instruct-v0.3",
  baseUrl: "http://localhost:8000/v1",
});
```

LiteLLM provides an OpenAI-compatible proxy for 100+ LLM providers.

```sh
litellm --model ollama/llama3 --port 4000
```

```ts
const agent = new Agent({
  name: "LiteLLM Agent",
  instructions: "You are a helpful assistant.",
  model: "ollama/llama3",
  baseUrl: "http://localhost:4000/v1",
});
```
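To route more than one model, LiteLLM can also be started from a config file instead of CLI flags. A minimal sketch (the model names here are examples, not requirements):

```yaml
# config.yaml — map friendly model names to providers
model_list:
  - model_name: llama3
    litellm_params:
      model: ollama/llama3
  - model_name: claude
    litellm_params:
      model: anthropic/claude-sonnet-4-6
      api_key: os.environ/ANTHROPIC_API_KEY
```

Start the proxy with `litellm --config config.yaml --port 4000`, then set the agent's `model` to one of the `model_name` entries.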

Connect directly to cloud providers without Schift Cloud routing:

```ts
const agent = new Agent({
  name: "OpenAI Direct",
  instructions: "...",
  model: "gpt-4o-mini",
  baseUrl: "https://api.openai.com/v1",
  apiKey: process.env.OPENAI_API_KEY,
});
```
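Other hosted providers that expose OpenAI-compatible endpoints follow the same pattern; only `baseUrl` and `apiKey` change. A few commonly used base URLs (verify against each provider's current docs, as these can change):

```ts
// Common OpenAI-compatible base URLs (check your provider's documentation)
const baseUrls: Record<string, string> = {
  openai: "https://api.openai.com/v1",
  groq: "https://api.groq.com/openai/v1",
  mistral: "https://api.mistral.ai/v1",
  ollama: "http://localhost:11434/v1", // local
  vllm: "http://localhost:8000/v1",    // local, default port
};
```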

Anthropic’s API is not OpenAI-compatible. Use LiteLLM as a proxy:

```sh
litellm --model anthropic/claude-sonnet-4-6 --port 4000
```

```ts
const agent = new Agent({
  model: "anthropic/claude-sonnet-4-6",
  baseUrl: "http://localhost:4000/v1",
});
```

You can use a self-hosted LLM for the agent while still using Schift Cloud for RAG:

```ts
const schift = new Schift({ apiKey: "sch_..." });
const rag = new RAG({ bucket: "docs" }, schift.transport);

const agent = new Agent({
  name: "Hybrid Agent",
  instructions: "...",
  rag, // RAG via Schift Cloud
  model: "llama3", // LLM via Ollama
  baseUrl: "http://localhost:11434/v1",
});
```