Embeddings
Schift provides a unified embedding API that proxies to multiple providers (OpenAI, Google, Voyage, Cloudflare, HuggingFace, and more) through a single set of endpoints. All embeddings pass through Schift’s Canonical Projection layer, so you can switch models or reduce dimensions without re-embedding your data.
Authentication
All embedding endpoints require a Bearer token in the `Authorization` header:

```
Authorization: Bearer sch_xxxxxxxxxxxxxxxxxxxx
```

API keys are managed in the Schift dashboard under Settings > API Keys.
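As a minimal sketch, the header can be built in Python as shown here. The helper name `auth_headers` is hypothetical; reading the key from the `SCHIFT_API_KEY` environment variable mirrors the curl examples on this page.

```python
import os
from typing import Dict, Optional


def auth_headers(api_key: Optional[str] = None) -> Dict[str, str]:
    """Return the headers sent with every embedding request (hypothetical helper)."""
    key = api_key or os.environ.get("SCHIFT_API_KEY", "")
    if not key:
        raise ValueError("no API key: pass api_key or set SCHIFT_API_KEY")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Failing early on a missing key avoids sending a request that would be rejected with a 401.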
Supported models
| Model | Provider | Default dimension | Max tokens | Variable dimensions |
|---|---|---|---|---|
| `schift-embed-1-small` | Schift (auto) | 1024 | 8,192 | Yes |
| `openai/text-embedding-3-small` | OpenAI | 1536 | 8,191 | Yes |
| `openai/text-embedding-3-large` | OpenAI | 3072 | 8,191 | Yes |
| `google/gemini-embedding-001` | Google | 3072 | 8,191 | Yes |
| `google/gemini-embedding-002` | Google | 3072 | 8,191 | Yes |
| `voyage/voyage-4-large` | Voyage AI | 1024 | 32,000 | Yes |
| `voyage/voyage-4` | Voyage AI | 1024 | 32,000 | Yes |
| `voyage/voyage-4-lite` | Voyage AI | 1024 | 32,000 | Yes |
| `dragonkue/bge-m3-ko` | HuggingFace | 1024 | 8,192 | No |
| `jinaai/jina-embeddings-v3` | HuggingFace | 1024 | 8,194 | Yes |
| `sbintuitions/sarashina-embedding-v2-1b` | HuggingFace | 1792 | 8,192 | No |
`schift-embed-1-small` is an auto-routed alias. Schift selects the best underlying model based on the input language and passes the result through the canonical projection layer. The resolved model is returned in the `model` field of every response.

Note: Internal backend aliases such as `@cf/qwen/qwen3-embedding-0.6b` may be returned as the resolved model when `schift-embed-1-small` is used.
Task types
Embedding endpoints optionally accept a `task_type` that tells Schift how the embedding will be used. The value is passed through to instruct-aware models or converted into an internal prefix for prefix-style models.

| `task_type` | Use case |
|---|---|
| `retrieval_query` | Search query embedding |
| `retrieval_document` | Document embedding |
| `semantic_similarity` | Semantic similarity comparison |
| `question_answering` | Question-answering retrieval |
| `clustering` | Clustering or topic grouping |
| `classification` | Classification |
| `code_retrieval` | Code search |
| `contradiction` | Contradiction or counter-evidence search |
| `factcheck` | Fact-checking evidence search |
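The sketch below shows one way to attach a `task_type` to a request payload and reject values that are not in the table. The helper `with_task_type` and the sample query text are illustrative assumptions. Pairing `retrieval_document` at indexing time with `retrieval_query` at search time is a common pattern, not something this page mandates.

```python
from typing import Dict

# Values copied from the task type table on this page.
TASK_TYPES = {
    "retrieval_query", "retrieval_document", "semantic_similarity",
    "question_answering", "clustering", "classification",
    "code_retrieval", "contradiction", "factcheck",
}


def with_task_type(payload: Dict[str, object], task_type: str) -> Dict[str, object]:
    """Return a copy of payload with a validated task_type (hypothetical helper)."""
    if task_type not in TASK_TYPES:
        raise ValueError(f"unknown task_type: {task_type!r}")
    return {**payload, "task_type": task_type}


# Documents are embedded once at indexing time, queries at search time.
doc_payload = with_task_type(
    {"text": "Photosynthesis converts light energy into chemical energy."},
    "retrieval_document",
)
query_payload = with_task_type(
    {"text": "how do plants make energy?"},
    "retrieval_query",
)
```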
POST /v1/embed
Embed a single text string.
Request body
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `text` | string | Yes | - | The text to embed. |
| `model` | string | No | Org routing config | Model ID from the catalog. |
| `dimensions` | integer | No | Model default | Output dimension. Only supported by models with variable dimensions. |
| `task_type` | string | No | - | Embedding intent. See Task types. |

Response fields
| Field | Type | Description |
|---|---|---|
| `embedding` | number[] | The embedding vector. |
| `model` | string | The model that was actually used. |
| `dimensions` | integer | The output dimension of the embedding. |
| `usage.tokens` | integer | Number of tokens consumed. |
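The request shape above can be sketched in Python with the standard library alone. The function names `build_embed_request` and `embed` are hypothetical, and optional parameters are omitted from the body when unset so the server-side defaults in the table apply.

```python
import json
import urllib.request

API_BASE = "https://api.schift.io"


def build_embed_request(text, api_key, model=None, dimensions=None, task_type=None):
    """Build (but do not send) a POST /v1/embed request."""
    body = {"text": text}
    optional = {"model": model, "dimensions": dimensions, "task_type": task_type}
    body.update({k: v for k, v in optional.items() if v is not None})
    return urllib.request.Request(
        f"{API_BASE}/v1/embed",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def embed(text, api_key, **options):
    """Send the request and return the parsed JSON response as a dict."""
    request = build_embed_request(text, api_key, **options)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A call such as `embed("hello", key, model="openai/text-embedding-3-large")["embedding"]` would then yield the vector, assuming the response fields listed above.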
Example request
```bash
curl https://api.schift.io/v1/embed \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $SCHIFT_API_KEY" \
  -d '{
    "text": "Schift enables seamless embedding model migration without re-embedding your data.",
    "model": "openai/text-embedding-3-large"
  }'
```

Example response

```json
{
  "embedding": [-0.01225, 0.00207, 0.03060],
  "model": "openai/text-embedding-3-large",
  "dimensions": 3072,
  "usage": {"tokens": 14}
}
```

POST /v1/embed/batch
Synchronous batch embedding. Processes up to 100 texts in a single request. For larger inputs, use `POST /v1/embed/jobs`.
Request body
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `texts` | string[] | Yes | - | List of texts to embed (max 100). |
| `model` | string | No | Org routing config | Model ID from the catalog. |
| `dimensions` | integer | No | Model default | Output dimension. |
| `task_type` | string | No | - | Embedding intent. |

Response fields
| Field | Type | Description |
|---|---|---|
| `embeddings` | number[][] | List of embedding vectors, one per input text. |
| `model` | string | The model that was used. |
| `dimensions` | integer | The output dimension. |
| `usage.tokens` | integer | Total tokens across all texts. |
| `usage.count` | integer | Number of texts embedded. |
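Because the response carries one vector per input text, a client typically needs to line vectors up with their inputs. The sketch below does that with a sanity check on `usage.count`. It assumes vectors come back in input order, which this page implies but does not state; `pair_texts_with_vectors` is a hypothetical helper.

```python
from typing import Dict, List, Tuple


def pair_texts_with_vectors(texts: List[str], response: Dict) -> List[Tuple[str, List[float]]]:
    """Zip inputs with returned vectors, assuming input order is preserved."""
    vectors = response["embeddings"]
    if len(vectors) != len(texts) or response["usage"]["count"] != len(texts):
        raise ValueError("response does not contain one vector per input text")
    return list(zip(texts, vectors))


# Sample data shaped like a batch response (values are illustrative).
sample_texts = ["first text", "second text"]
sample_response = {
    "embeddings": [[-0.01225, 0.00207], [0.00898, -0.00298]],
    "model": "voyage/voyage-4-large",
    "dimensions": 1024,
    "usage": {"tokens": 4, "count": 2},
}
pairs = pair_texts_with_vectors(sample_texts, sample_response)
```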
Example request
```bash
curl https://api.schift.io/v1/embed/batch \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $SCHIFT_API_KEY" \
  -d '{
    "texts": [
      "The Mediterranean diet emphasizes fish, olive oil, and vegetables.",
      "Photosynthesis converts light energy into chemical energy.",
      "Shakespeare wrote Hamlet and A Midsummer Night'"'"'s Dream."
    ],
    "model": "voyage/voyage-4-large"
  }'
```

Example response

```json
{
  "embeddings": [
    [-0.01225, 0.00207, 0.03060],
    [0.00898, -0.00298, 0.01546],
    [-0.06297, -0.04777, -0.10113]
  ],
  "model": "voyage/voyage-4-large",
  "dimensions": 1024,
  "usage": {"tokens": 38, "count": 3}
}
```

POST /v1/embed/jobs
Create an asynchronous bulk embedding job. Use this endpoint for 101 to 10,000 texts.
Request body
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `texts` | string[] | Yes | - | List of texts to embed (max 10,000). |
| `model` | string | No | Org routing config | Model ID from the catalog. |
| `dimensions` | integer | No | Model default | Output dimension. |
| `task_type` | string | No | - | Embedding intent. |

Response fields
| Field | Type | Description |
|---|---|---|
| `id` | string | Bulk embed job ID. |
| `status` | string | Initial status (`queued`). |
| `model` | string | Model selected for the job. |
| `usage.tokens` | integer | Total validated tokens across all texts. |
| `usage.count` | integer | Number of texts submitted. |
| `chunk_size` | integer | Worker chunk size (currently 100). |
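The size limits of the synchronous and asynchronous endpoints suggest a simple client-side rule for picking one. The sketch below encodes the limits stated on this page; `choose_endpoint` is a hypothetical helper, and preferring the synchronous endpoint for small inputs is a design choice rather than a requirement.

```python
SYNC_MAX = 100     # limit for POST /v1/embed/batch, per this page
JOB_MAX = 10_000   # limit for POST /v1/embed/jobs, per this page


def choose_endpoint(text_count: int) -> str:
    """Pick the batch endpoint for small inputs and the jobs endpoint for large ones."""
    if text_count < 1:
        raise ValueError("at least one text is required")
    if text_count <= SYNC_MAX:
        return "/v1/embed/batch"
    if text_count <= JOB_MAX:
        return "/v1/embed/jobs"
    raise ValueError("more than 10,000 texts: split the input across several jobs")
```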
Example request
```bash
curl https://api.schift.io/v1/embed/jobs \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $SCHIFT_API_KEY" \
  -d '{
    "texts": ["doc 1", "doc 2", "doc 3"],
    "model": "schift-embed-1-small",
    "task_type": "retrieval_document"
  }'
```

Example response

```json
{
  "id": "job_123",
  "status": "queued",
  "model": "schift-embed-1-small",
  "usage": {"tokens": 6, "count": 3},
  "chunk_size": 100
}
```

GET /v1/embed/jobs/{job_id}
Retrieve the status and metadata of a bulk embed job.
Path parameters
| Parameter | Type | Description |
|---|---|---|
| `job_id` | string | The bulk embed job ID. |

Job status lifecycle
| Status | Meaning |
|---|---|
| `queued` | Job created and waiting for a worker. |
| `embedding` | Worker is embedding text chunks. |
| `indexing` | Results are being prepared and stored. |
| `ready` | Results are available for retrieval. |
| `failed` | Job failed. |
| `cancelled` | Job was cancelled before processing. |
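The lifecycle above lends itself to a polling loop that stops on `ready`, `failed`, or `cancelled`. The sketch below takes the status lookup as a callable so it stays independent of any HTTP client. The name `wait_for_job`, the polling interval, and the timeout are assumptions, not documented values.

```python
import time
from typing import Callable

# Statuses after which the job will not change again, per the table above.
TERMINAL_STATUSES = {"ready", "failed", "cancelled"}


def wait_for_job(
    fetch_status: Callable[[], str],
    interval: float = 2.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_status() until the job reaches a terminal status."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout} seconds")
        sleep(interval)
```

In practice `fetch_status` would call `GET /v1/embed/jobs/{job_id}` and return the `status` field of the response.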
Example request
```bash
curl https://api.schift.io/v1/embed/jobs/job_123 \
  -H "Authorization: Bearer $SCHIFT_API_KEY"
```

Example response

```json
{
  "id": "job_123",
  "status": "ready",
  "org_id": "org_abc",
  "metadata": {
    "mode": "embed_bulk",
    "model": "schift-embed-1-small",
    "dimensions": null,
    "task_type": "retrieval_document",
    "input_count": 3,
    "chunk_size": 100,
    "token_count": 6,
    "completed_count": 3
  }
}
```

POST /v1/embed/jobs/{job_id}/cancel
Cancel a queued bulk embed job. Only jobs in `queued` status can be cancelled.
Path parameters
| Parameter | Type | Description |
|---|---|---|
| `job_id` | string | The bulk embed job ID. |
Example request
```bash
curl -X POST https://api.schift.io/v1/embed/jobs/job_123/cancel \
  -H "Authorization: Bearer $SCHIFT_API_KEY"
```

Example response

```json
{
  "status": "cancelled"
}
```

GET /v1/embed/jobs/{job_id}/result
Retrieve the paginated results of a completed bulk embed job.
Path parameters
| Parameter | Type | Description |
|---|---|---|
| `job_id` | string | The bulk embed job ID. |

Query parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `offset` | integer | 0 | Number of results to skip. |
| `limit` | integer | 100 | Maximum number of results to return (max 1,000). |

Response fields
| Field | Type | Description |
|---|---|---|
| `object` | string | Always `list`. |
| `data` | object[] | Paginated embedding result items. |
| `model` | string | Model used for the result. |
| `dimensions` | integer | Output dimension. |
| `usage.count` | integer | Total embedded item count. |
| `pagination.offset` | integer | Requested offset. |
| `pagination.limit` | integer | Requested limit. |
| `pagination.returned` | integer | Number of items returned in this page. |
| `pagination.total` | integer | Total result item count. |
| `pagination.has_more` | boolean | Whether more pages remain. |
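The `offset`, `limit`, and `pagination.has_more` fields above are enough to walk every page of a result. The sketch below keeps the HTTP call abstract: `fetch_page(offset, limit)` is a hypothetical callable that returns one parsed response, and `iter_results` is an illustrative name.

```python
from typing import Callable, Dict, Iterator


def iter_results(fetch_page: Callable[[int, int], Dict], limit: int = 100) -> Iterator[Dict]:
    """Yield every result item, advancing offset until has_more is false."""
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        yield from page["data"]
        pagination = page["pagination"]
        # Stop on the last page; also stop on an empty page to avoid looping forever.
        if not pagination["has_more"] or pagination["returned"] == 0:
            break
        offset += pagination["returned"]
```

Advancing by `pagination.returned` rather than by `limit` keeps the offset correct even if the server returns fewer items than requested.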
Example request
```bash
curl "https://api.schift.io/v1/embed/jobs/job_123/result?offset=0&limit=100" \
  -H "Authorization: Bearer $SCHIFT_API_KEY"
```

Example response

```json
{
  "object": "list",
  "data": [
    {"object": "embedding", "index": 0, "embedding": [-0.01225, 0.00207]},
    {"object": "embedding", "index": 1, "embedding": [0.00898, -0.00298]},
    {"object": "embedding", "index": 2, "embedding": [-0.06297, -0.04777]}
  ],
  "model": "schift-embed-1-small",
  "dimensions": 1024,
  "usage": {"count": 3},
  "pagination": {
    "offset": 0,
    "limit": 100,
    "returned": 3,
    "total": 3,
    "has_more": false
  }
}
```

POST /v1/embed/image
Embed images by first extracting text with a vision-language model, then embedding the extracted text. Accepts up to 20 base64-encoded images per request.
Request body
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `images` | string[] | Yes | - | Base64-encoded image data. |
| `model` | string | No | Org routing config | Model ID from the catalog. |
| `dimensions` | integer | No | Model default | Output dimension. |

Response fields
| Field | Type | Description |
|---|---|---|
| `embeddings` | number[][] | List of embedding vectors, one per image. |
| `model` | string | The model that was used. |
| `dimensions` | integer | The output dimension. |
| `usage.image_count` | integer | Number of images processed. |
| `usage.tokens` | integer | Total tokens consumed. |
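Preparing the `images` array means base64-encoding the raw image bytes. The sketch below assumes plain standard base64 with no `data:` URI prefix, which matches the shape of the example string on this page but is not stated explicitly; `image_payload` is a hypothetical helper.

```python
import base64
from typing import Dict, List, Optional

MAX_IMAGES = 20  # per-request limit stated on this page


def image_payload(images: List[bytes], model: Optional[str] = None) -> Dict[str, object]:
    """Build a POST /v1/embed/image body from raw image bytes."""
    if not 1 <= len(images) <= MAX_IMAGES:
        raise ValueError(f"expected 1 to {MAX_IMAGES} images, got {len(images)}")
    payload: Dict[str, object] = {
        "images": [base64.b64encode(raw).decode("ascii") for raw in images]
    }
    if model is not None:
        payload["model"] = model
    return payload
```

With a file on disk, the bytes would come from something like `open(path, "rb").read()`.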
Example request
```bash
curl https://api.schift.io/v1/embed/image \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $SCHIFT_API_KEY" \
  -d '{
    "images": ["iVBORw0KGgoAAAANSUhEUg..."],
    "model": "openai/text-embedding-3-large"
  }'
```

Example response

```json
{
  "embeddings": [[-0.01225, 0.00207, 0.03060]],
  "model": "openai/text-embedding-3-large",
  "dimensions": 3072,
  "usage": {"image_count": 1, "tokens": 14}
}
```

POST /v1/embeddings
OpenAI-compatible embeddings endpoint. Accepts the same input shape as OpenAI’s `/v1/embeddings` and returns an OpenAI-compatible response shape. This endpoint is a thin wrapper over the same logic as `/v1/embed` and `/v1/embed/batch`.

Note: For more than 100 inputs, use `POST /v1/embed/jobs` instead.
Request body
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `input` | string \| string[] | Yes | - | Text or list of texts to embed. |
| `model` | string | No | Org routing config | Model ID from the catalog. |
| `dimensions` | integer | No | Model default | Output dimension. |
| `task_type` | string | No | - | Schift extension: embedding intent. |
| `encoding_format` | string | No | - | Accepted for compatibility, currently unused. |
| `user` | string | No | - | Accepted for compatibility, currently unused. |

Response fields
| Field | Type | Description |
|---|---|---|
| `object` | string | Always `list`. |
| `data` | object[] | Embedding objects with `object`, `index`, and `embedding`. |
| `model` | string | The model that was used. |
| `usage.prompt_tokens` | integer | Total tokens consumed. |
| `usage.total_tokens` | integer | Total tokens consumed. |
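Since each item in `data` carries its own `index`, a client can recover the vectors in input order without relying on array order. A minimal sketch, with `vectors_from_openai_response` as a hypothetical helper name and illustrative sample values:

```python
from typing import Dict, List


def vectors_from_openai_response(response: Dict) -> List[List[float]]:
    """Return embedding vectors ordered by their index field."""
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


# Sample data shaped like the response described above, deliberately out of order.
sample = {
    "object": "list",
    "data": [
        {"object": "embedding", "index": 1, "embedding": [0.00898, -0.00298]},
        {"object": "embedding", "index": 0, "embedding": [-0.01225, 0.00207]},
    ],
    "model": "openai/text-embedding-3-large",
    "usage": {"prompt_tokens": 8, "total_tokens": 8},
}
vectors = vectors_from_openai_response(sample)
```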
Example request
```bash
curl https://api.schift.io/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $SCHIFT_API_KEY" \
  -d '{
    "input": "Schift enables seamless embedding model migration.",
    "model": "openai/text-embedding-3-large"
  }'
```

Example response

```json
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [-0.01225, 0.00207, 0.03060]
    }
  ],
  "model": "openai/text-embedding-3-large",
  "usage": {
    "prompt_tokens": 8,
    "total_tokens": 8
  }
}
```

Canonical projection
Every embedding returned by Schift is projected into a shared canonical latent space. This enables:
- Model switching without re-embedding. Move from OpenAI to Voyage without touching your vector store.
- Cross-model search. Query vectors from one model can retrieve documents embedded with another.
- Dimension reduction. Request any output dimension regardless of the source model’s native dimension.
The projection happens transparently. Call the API normally and Schift handles the rest.
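To illustrate what cross-model search means for a client, the sketch below ranks document vectors against a query vector once both are in the same space and have the same dimension. Cosine similarity is an assumption here, since this page does not name a recommended metric, and the function names and sample vectors are illustrative.

```python
import math
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors of equal dimension."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm


def rank(query: List[float], docs: Dict[str, List[float]]) -> List[str]:
    """Return document IDs ordered from most to least similar to the query."""
    return sorted(docs, key=lambda doc_id: cosine_similarity(query, docs[doc_id]), reverse=True)
```

The dimension check matters when mixing models: requesting the same `dimensions` value for queries and documents keeps the vectors comparable.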
Routing and failover
Your organization can configure a default embedding model and optional fallback. When failover mode is enabled, Schift automatically retries with the fallback model if the primary provider is unavailable.
Use the routing endpoints to configure defaults:
```bash
curl -X PUT https://api.schift.io/v1/routing \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $SCHIFT_API_KEY" \
  -d '{
    "primary": "openai/text-embedding-3-large",
    "fallback": "voyage/voyage-4-large",
    "mode": "fallback"
  }'
```

The response from every embed endpoint includes the `model` field so you always know which model was used.
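One use of the `model` field is to detect when a request was served by the fallback. The sketch below compares it against a routing configuration shaped like the request body above. `served_by` is a hypothetical helper, and the `"other"` branch covers cases such as a resolved alias, where the returned model matches neither configured name.

```python
from typing import Dict


def served_by(response: Dict, routing: Dict) -> str:
    """Classify a response as served by the primary, the fallback, or another model."""
    model = response["model"]
    if model == routing["primary"]:
        return "primary"
    if model == routing.get("fallback"):
        return "fallback"
    return "other"


routing = {
    "primary": "openai/text-embedding-3-large",
    "fallback": "voyage/voyage-4-large",
    "mode": "fallback",
}
```

Logging the result per request gives a rough failover rate without any extra API calls.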
Rate limits and quotas
Rate limits are enforced per organization. Usage is tracked in real time. The following axes are checked for embedding endpoints:

| Endpoint | Quota axis |
|---|---|
| `POST /v1/embed` | `embed` |
| `POST /v1/embed/batch` | `embed_batch` |
| `POST /v1/embed/image` | `embed_image` |
| `POST /v1/embed/jobs` | `embed_batch` |
| `POST /v1/embeddings` | `embed` or `embed_batch` |
Retrieve current usage with:
```bash
curl https://api.schift.io/v1/usage/summary \
  -H "Authorization: Bearer $SCHIFT_API_KEY"
```

Bulk embed jobs are assigned a priority based on your organization’s plan:
| Plan | Priority |
|---|---|
| Enterprise | 0 |
| Business | 1 |
| Pro / Paid / Starter | 2 |
| Free | 3 |
Lower numbers are processed first.
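The effect of these priorities can be pictured with a small priority queue. This is only an illustration of "lower numbers first", not a description of Schift's scheduler: the `JobQueue` class is hypothetical, and first-in, first-out ordering within a priority level is an assumption.

```python
import heapq
import itertools

# Priorities copied from the plan table on this page.
PLAN_PRIORITY = {"enterprise": 0, "business": 1, "pro": 2, "paid": 2, "starter": 2, "free": 3}


class JobQueue:
    """Toy queue that pops jobs by plan priority, then by arrival order."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker within a priority level

    def push(self, job_id: str, plan: str) -> None:
        heapq.heappush(self._heap, (PLAN_PRIORITY[plan], next(self._arrival), job_id))

    def pop(self) -> str:
        return heapq.heappop(self._heap)[2]
```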
Error codes
Common errors
| Status | Meaning |
|---|---|
| 400 | Bad request: unknown model, invalid dimensions, empty input, or token limit exceeded. |
| 401 | Invalid or expired API key. |
| 402 | Insufficient credits or quota exceeded. |
| 403 | Feature not available on your plan or entitlement limit reached. |
| 502 | Upstream embedding provider failed (both primary and fallback). |
Bulk job errors
| Endpoint | Status | Meaning |
|---|---|---|
| `POST /v1/embed/batch` | 400 | More than 100 texts in a sync request. Use `POST /v1/embed/jobs`. |
| `POST /v1/embed/jobs` | 400 | Empty texts, more than 10,000 texts, or a text exceeds the model’s `max_tokens`. |
| `GET /v1/embed/jobs/{job_id}` | 404 | Job not found or belongs to another organization. |
| `POST /v1/embed/jobs/{job_id}/cancel` | 400 | Job is already processing or completed. |
| `POST /v1/embed/jobs/{job_id}/cancel` | 409 | Job is no longer queued. |
| `GET /v1/embed/jobs/{job_id}/result` | 404 | Job or result not found. |
| `GET /v1/embed/jobs/{job_id}/result` | 409 | Result not ready yet. |
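A client might turn these status codes into a small decision rule. The sketch below is one possible policy, not documented behaviour: treating 502 as retryable and a 409 from the result endpoint as "keep polling" are assumptions, and `next_step` is a hypothetical helper.

```python
def next_step(status: int, path: str = "") -> str:
    """Map an HTTP status to 'wait', 'retry', 'fix', or 'unknown'."""
    if status == 409 and path.endswith("/result"):
        return "wait"    # result not ready yet: poll the job status and try again
    if status == 502:
        return "retry"   # assumption: an upstream provider failure may be transient
    if status in (400, 401, 402, 403, 404, 409):
        return "fix"     # the request, credentials, plan, or job state must change first
    return "unknown"     # status code not covered by the tables on this page
```

Separating "wait" from "retry" keeps a polling client from backing off as if it had hit a failure when the job is simply still running.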