Embeddings

Schift provides a unified embedding API that proxies to multiple providers (OpenAI, Google, Voyage, Cloudflare, HuggingFace, and more) through a single set of endpoints. All embeddings pass through Schift’s Canonical Projection layer, so you can switch models or reduce dimensions without re-embedding your data.

Authentication

All embedding endpoints require a Bearer token in the Authorization header:

Authorization: Bearer sch_xxxxxxxxxxxxxxxxxxxx

API keys are managed in the Schift dashboard under Settings > API Keys.

Supported models

Model	Provider	Default dimension	Max tokens	Variable dimensions
`schift-embed-1-small`	Schift (auto)	1024	8,192	Yes
`openai/text-embedding-3-small`	OpenAI	1536	8,191	Yes
`openai/text-embedding-3-large`	OpenAI	3072	8,191	Yes
`google/gemini-embedding-001`	Google	3072	8,191	Yes
`google/gemini-embedding-002`	Google	3072	8,191	Yes
`voyage/voyage-4-large`	Voyage AI	1024	32,000	Yes
`voyage/voyage-4`	Voyage AI	1024	32,000	Yes
`voyage/voyage-4-lite`	Voyage AI	1024	32,000	Yes
`dragonkue/bge-m3-ko`	HuggingFace	1024	8,192	No
`jinaai/jina-embeddings-v3`	HuggingFace	1024	8,192	Yes
`sbintuitions/sarashina-embedding-v2-1b`	HuggingFace	1792	8,192	No

schift-embed-1-small is an auto-routed alias. Schift selects the best underlying model based on the input language and passes the result through the canonical projection layer. The resolved model is returned in the model field of every response.

Note: Internal backend aliases such as @cf/qwen/qwen3-embedding-0.6b may be returned as the resolved model when schift-embed-1-small is used.

Task types

Embedding endpoints optionally accept a task_type that tells Schift how the embedding will be used. The value is passed through to instruct-aware models or converted into an internal prefix for prefix-style models.

`task_type`	Use case
`retrieval_query`	Search query embedding
`retrieval_document`	Document embedding
`semantic_similarity`	Semantic similarity comparison
`question_answering`	Question-answering retrieval
`clustering`	Clustering or topic grouping
`classification`	Classification
`code_retrieval`	Code search
`contradiction`	Contradiction or counter-evidence search
`factcheck`	Fact-checking evidence search
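
As a rough illustration of how task_type might be paired with queries and documents, the sketch below builds request bodies for POST /v1/embed. The helper name embed_payload is hypothetical and not part of any Schift SDK; the field names and allowed values are taken from the tables in this document.

```python
from typing import Optional

# Allowed values, copied from the task type table above.
TASK_TYPES = frozenset({
    "retrieval_query", "retrieval_document", "semantic_similarity",
    "question_answering", "clustering", "classification",
    "code_retrieval", "contradiction", "factcheck",
})


def embed_payload(text: str, task_type: Optional[str] = None,
                  model: Optional[str] = None,
                  dimensions: Optional[int] = None) -> dict:
    """Build a JSON body for POST /v1/embed (hypothetical helper)."""
    if not text:
        raise ValueError("text must be non-empty")
    if task_type is not None and task_type not in TASK_TYPES:
        raise ValueError(f"unknown task_type: {task_type!r}")
    payload = {"text": text}
    # Optional fields are left out so the server-side defaults apply.
    if model is not None:
        payload["model"] = model
    if dimensions is not None:
        payload["dimensions"] = dimensions
    if task_type is not None:
        payload["task_type"] = task_type
    return payload


# Queries and documents get different intents for retrieval use cases.
query_body = embed_payload("how do I rotate an API key?",
                           task_type="retrieval_query")
doc_body = embed_payload("API keys are managed under Settings > API Keys.",
                         task_type="retrieval_document")
```

Validating task_type on the client is a design choice made here to fail early; the API itself is documented to return 400 for bad requests.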

POST /v1/embed

Embed a single text string.

Request body

Parameter	Type	Required	Default	Description
`text`	string	Yes	—	The text to embed.
`model`	string	No	Org routing config	Model ID from the catalog.
`dimensions`	integer	No	Model default	Output dimension. Only supported by models with variable dimensions.
`task_type`	string	No	—	Embedding intent. See Task types.

Response fields

Field	Type	Description
`embedding`	`number[]`	The embedding vector.
`model`	string	The model that was actually used.
`dimensions`	integer	The output dimension of the embedding.
`usage.tokens`	integer	Number of tokens consumed.

Example request

curl https://api.schift.io/v1/embed \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $SCHIFT_API_KEY" \
  -d '{
    "text": "Schift enables seamless embedding model migration without re-embedding your data.",
    "model": "openai/text-embedding-3-large"
  }'

Example response

{
  "embedding": [-0.01225, 0.00207, 0.03060],
  "model": "openai/text-embedding-3-large",
  "dimensions": 3072,
  "usage": {"tokens": 14}
}
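
The same call can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the function names are hypothetical, and the URL, headers, and field names are the ones shown in the curl example and tables above.

```python
import json
import urllib.request
from typing import List, Optional, Tuple

API_BASE = "https://api.schift.io"


def build_embed_request(api_key: str, text: str,
                        model: Optional[str] = None) -> urllib.request.Request:
    """Prepare (but do not send) a POST /v1/embed request."""
    body = {"text": text}
    if model is not None:
        body["model"] = model
    return urllib.request.Request(
        f"{API_BASE}/v1/embed",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def parse_embed_response(raw: bytes) -> Tuple[List[float], str, int]:
    """Return (embedding, resolved model, tokens consumed)."""
    data = json.loads(raw)
    return data["embedding"], data["model"], data["usage"]["tokens"]


def embed(api_key: str, text: str, model: Optional[str] = None):
    """Send the request. This performs a real network call."""
    with urllib.request.urlopen(build_embed_request(api_key, text, model)) as resp:
        return parse_embed_response(resp.read())
```

Calling embed(os.environ["SCHIFT_API_KEY"], "some text", "openai/text-embedding-3-large") would issue the request; the returned model string is worth logging, since it names the model that actually produced the vector.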

POST /v1/embed/batch

Synchronous batch embedding. Processes up to 100 texts in a single request. For larger inputs, use POST /v1/embed/jobs.

Request body

Parameter	Type	Required	Default	Description
`texts`	`string[]`	Yes	—	List of texts to embed (max 100).
`model`	string	No	Org routing config	Model ID from the catalog.
`dimensions`	integer	No	Model default	Output dimension.
`task_type`	string	No	—	Embedding intent.

Response fields

Field	Type	Description
`embeddings`	`number[][]`	List of embedding vectors, one per input text.
`model`	string	The model that was used.
`dimensions`	integer	The output dimension.
`usage.tokens`	integer	Total tokens across all texts.
`usage.count`	integer	Number of texts embedded.

Example request

curl https://api.schift.io/v1/embed/batch \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $SCHIFT_API_KEY" \
  -d '{
    "texts": [
      "The Mediterranean diet emphasizes fish, olive oil, and vegetables.",
      "Photosynthesis converts light energy into chemical energy.",
      "Shakespeare wrote Hamlet and A Midsummer Night'"'"'s Dream."
    ],
    "model": "voyage/voyage-4-large"
  }'

Example response

{
  "embeddings": [
    [-0.01225, 0.00207, 0.03060],
    [0.00898, -0.00298, 0.01546],
    [-0.06297, -0.04777, -0.10113]
  ],
  "model": "voyage/voyage-4-large",
  "dimensions": 1024,
  "usage": {"tokens": 38, "count": 3}
}
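
When a caller holds more than 100 texts but still wants synchronous calls, one option is to split the input on the client. The sketch below shows only the splitting step; batches and batch_payloads are hypothetical helper names, and the limit of 100 is the one stated for this endpoint.

```python
from typing import Iterator, List, Optional, Sequence

MAX_SYNC_BATCH = 100  # limit stated for POST /v1/embed/batch


def batches(texts: Sequence[str],
            size: int = MAX_SYNC_BATCH) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` texts, preserving order."""
    if not 1 <= size <= MAX_SYNC_BATCH:
        raise ValueError(f"size must be between 1 and {MAX_SYNC_BATCH}")
    for start in range(0, len(texts), size):
        yield list(texts[start:start + size])


def batch_payloads(texts: Sequence[str],
                   model: Optional[str] = None) -> List[dict]:
    """Build one request body per slice."""
    payloads = []
    for chunk in batches(texts):
        body = {"texts": chunk}
        if model is not None:
            body["model"] = model
        payloads.append(body)
    return payloads
```

Because each slice keeps input order, concatenating the embeddings arrays of the responses in the same order lines the vectors up with the original texts. For large inputs the asynchronous jobs endpoint is the documented alternative.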

POST /v1/embed/jobs

Create an asynchronous bulk embedding job. Use this endpoint for 101 to 10,000 texts.

Request body

Parameter	Type	Required	Default	Description
`texts`	`string[]`	Yes	—	List of texts to embed (max 10,000).
`model`	string	No	Org routing config	Model ID from the catalog.
`dimensions`	integer	No	Model default	Output dimension.
`task_type`	string	No	—	Embedding intent.

Response fields

Field	Type	Description
`id`	string	Bulk embed job ID.
`status`	string	Initial status (`queued`).
`model`	string	Model selected for the job.
`usage.tokens`	integer	Total validated tokens across all texts.
`usage.count`	integer	Number of texts submitted.
`chunk_size`	integer	Worker chunk size (currently 100).

Example request

curl https://api.schift.io/v1/embed/jobs \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $SCHIFT_API_KEY" \
  -d '{
    "texts": ["doc 1", "doc 2", "doc 3"],
    "model": "schift-embed-1-small",
    "task_type": "retrieval_document"
  }'

Example response

{
  "id": "job_123",
  "status": "queued",
  "model": "schift-embed-1-small",
  "usage": {"tokens": 6, "count": 3},
  "chunk_size": 100
}
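
The size limits of the three text endpoints suggest a simple dispatch rule. The following sketch encodes it; pick_endpoint is a hypothetical helper, and sending a single text to /v1/embed rather than /v1/embed/batch is a choice made here, not a requirement stated in this document.

```python
def pick_endpoint(text_count: int) -> str:
    """Choose an endpoint path from the number of texts to embed."""
    if text_count < 1:
        raise ValueError("at least one text is required")
    if text_count == 1:
        return "/v1/embed"
    if text_count <= 100:
        return "/v1/embed/batch"   # synchronous, max 100 texts
    if text_count <= 10_000:
        return "/v1/embed/jobs"    # asynchronous, max 10,000 texts
    raise ValueError("more than 10,000 texts: split into several jobs")
```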

GET /v1/embed/jobs/{job_id}

Retrieve the status and metadata of a bulk embed job.

Path parameters

Parameter	Type	Description
`job_id`	string	The bulk embed job ID.

Job status lifecycle

Status	Meaning
`queued`	Job created and waiting for a worker.
`embedding`	Worker is embedding text chunks.
`indexing`	Results are being prepared and stored.
`ready`	Results are available for retrieval.
`failed`	Job failed.
`cancelled`	Job was cancelled before processing.

Example request

curl https://api.schift.io/v1/embed/jobs/job_123 \
  -H "Authorization: Bearer $SCHIFT_API_KEY"

Example response

{
  "id": "job_123",
  "status": "ready",
  "org_id": "org_abc",
  "metadata": {
    "mode": "embed_bulk",
    "model": "schift-embed-1-small",
    "dimensions": null,
    "task_type": "retrieval_document",
    "input_count": 3,
    "chunk_size": 100,
    "token_count": 6,
    "completed_count": 3
  }
}
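
A client typically polls this endpoint until the job reaches a terminal status. The sketch below shows one way to do that, assuming a caller-supplied get_job function that performs the GET request and returns the parsed JSON. The polling interval and timeout are arbitrary illustrative values, not documented recommendations.

```python
import time
from typing import Callable

# Statuses from the lifecycle table after which the job no longer changes.
TERMINAL_STATUSES = frozenset({"ready", "failed", "cancelled"})


def wait_for_job(get_job: Callable[[], dict],
                 interval: float = 2.0,
                 timeout: float = 600.0,
                 sleep: Callable[[float], None] = time.sleep,
                 clock: Callable[[], float] = time.monotonic) -> dict:
    """Poll until the job is ready, failed, or cancelled.

    `get_job` is a hypothetical callable wrapping
    GET /v1/embed/jobs/{job_id}; `sleep` and `clock` are injectable
    so the loop can be exercised without waiting.
    """
    deadline = clock() + timeout
    while True:
        job = get_job()
        if job["status"] in TERMINAL_STATUSES:
            return job
        if clock() >= deadline:
            raise TimeoutError(f"job still {job['status']!r} after {timeout}s")
        sleep(interval)
```

The caller still has to inspect the returned status: only ready means results can be fetched.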

POST /v1/embed/jobs/{job_id}/cancel

Cancel a queued bulk embed job. Only jobs in queued status can be cancelled.

Path parameters

Parameter	Type	Description
`job_id`	string	The bulk embed job ID.

Example request

curl -X POST https://api.schift.io/v1/embed/jobs/job_123/cancel \
  -H "Authorization: Bearer $SCHIFT_API_KEY"

Example response

{
  "status": "cancelled"
}

GET /v1/embed/jobs/{job_id}/result

Retrieve the paginated results of a completed bulk embed job.

Path parameters

Parameter	Type	Description
`job_id`	string	The bulk embed job ID.

Query parameters

Parameter	Type	Default	Description
`offset`	integer	0	Number of results to skip.
`limit`	integer	100	Maximum number of results to return (max 1,000).

Response fields

Field	Type	Description
`object`	string	Always `list`.
`data`	`object[]`	Paginated embedding result items.
`model`	string	Model used for the result.
`dimensions`	integer	Output dimension.
`usage.count`	integer	Total embedded item count.
`pagination.offset`	integer	Requested offset.
`pagination.limit`	integer	Requested limit.
`pagination.returned`	integer	Number of items returned in this page.
`pagination.total`	integer	Total result item count.
`pagination.has_more`	boolean	Whether more pages remain.

Example request

curl "https://api.schift.io/v1/embed/jobs/job_123/result?offset=0&limit=100" \
  -H "Authorization: Bearer $SCHIFT_API_KEY"

Example response

{
  "object": "list",
  "data": [
    {"object": "embedding", "index": 0, "embedding": [-0.01225, 0.00207]},
    {"object": "embedding", "index": 1, "embedding": [0.00898, -0.00298]},
    {"object": "embedding", "index": 2, "embedding": [-0.06297, -0.04777]}
  ],
  "model": "schift-embed-1-small",
  "dimensions": 1024,
  "usage": {"count": 3},
  "pagination": {
    "offset": 0,
    "limit": 100,
    "returned": 3,
    "total": 3,
    "has_more": false
  }
}
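
Walking all pages can be sketched as a generator driven by pagination.has_more and pagination.returned. The fetch_page argument is a hypothetical callable that performs the GET request with the given offset and limit and returns the parsed JSON; it is an assumption of this sketch rather than part of the API.

```python
from typing import Callable, Iterator

MAX_LIMIT = 1000  # maximum page size stated in the query parameter table


def iter_results(fetch_page: Callable[[int, int], dict],
                 limit: int = 100) -> Iterator[dict]:
    """Yield every result item of a job, page by page."""
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        yield from page["data"]
        pagination = page["pagination"]
        # Stop on the last page, or if the server returned nothing
        # (guards against looping forever on an empty page).
        if not pagination["has_more"] or pagination["returned"] == 0:
            break
        offset += pagination["returned"]
```

Each yielded item carries its own index, so the caller can map vectors back to the submitted texts even when pages are processed lazily.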

POST /v1/embed/image

Embed images by first extracting text with a vision-language model, then embedding the extracted text. Accepts up to 20 base64-encoded images per request.

Request body

Parameter	Type	Required	Default	Description
`images`	`string[]`	Yes	—	Base64-encoded image data.
`model`	string	No	Org routing config	Model ID from the catalog.
`dimensions`	integer	No	Model default	Output dimension.

Response fields

Field	Type	Description
`embeddings`	`number[][]`	List of embedding vectors, one per image.
`model`	string	The model that was used.
`dimensions`	integer	The output dimension.
`usage.image_count`	integer	Number of images processed.
`usage.tokens`	integer	Total tokens consumed.

Example request

curl https://api.schift.io/v1/embed/image \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $SCHIFT_API_KEY" \
  -d '{
    "images": ["iVBORw0KGgoAAAANSUhEUg..."],
    "model": "openai/text-embedding-3-large"
  }'

Example response

{
  "embeddings": [[-0.01225, 0.00207, 0.03060]],
  "model": "openai/text-embedding-3-large",
  "dimensions": 3072,
  "usage": {"image_count": 1, "tokens": 14}
}
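
Preparing the images array can be sketched as follows. The sketch assumes that the API expects plain standard base64 without a data: URI prefix, which matches the shape of the example request but is not stated explicitly in this document. image_payloads is a hypothetical helper name.

```python
import base64
from typing import List, Optional, Sequence

MAX_IMAGES_PER_REQUEST = 20  # limit stated for POST /v1/embed/image


def image_payloads(images: Sequence[bytes],
                   model: Optional[str] = None) -> List[dict]:
    """Base64-encode raw image bytes and group them into request bodies."""
    payloads = []
    for start in range(0, len(images), MAX_IMAGES_PER_REQUEST):
        chunk = images[start:start + MAX_IMAGES_PER_REQUEST]
        body = {
            # Assumption: standard base64, ASCII-decoded, no data: prefix.
            "images": [base64.b64encode(img).decode("ascii") for img in chunk],
        }
        if model is not None:
            body["model"] = model
        payloads.append(body)
    return payloads
```

In practice the bytes would come from reading image files in binary mode, for example with pathlib.Path(path).read_bytes().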

POST /v1/embeddings

OpenAI-compatible embeddings endpoint. Accepts the same input shape as OpenAI’s /v1/embeddings and returns an OpenAI-compatible response shape. This endpoint is a thin wrapper over the same logic as /v1/embed and /v1/embed/batch.

Note: For more than 100 inputs, use POST /v1/embed/jobs instead.

Request body

Parameter	Type	Required	Default	Description
`input`	`string | string[]`	Yes	—	Text or list of texts to embed.
`model`	string	No	Org routing config	Model ID from the catalog.
`dimensions`	integer	No	Model default	Output dimension.
`task_type`	string	No	—	Schift extension: embedding intent.
`encoding_format`	string	No	—	Accepted for compatibility, currently unused.
`user`	string	No	—	Accepted for compatibility, currently unused.

Response fields

Field	Type	Description
`object`	string	Always `list`.
`data`	`object[]`	Embedding objects with `object`, `index`, and `embedding`.
`model`	string	The model that was used.
`usage.prompt_tokens`	integer	Total tokens consumed.
`usage.total_tokens`	integer	Total tokens consumed.

Example request

curl https://api.schift.io/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $SCHIFT_API_KEY" \
  -d '{
    "input": "Schift enables seamless embedding model migration.",
    "model": "openai/text-embedding-3-large"
  }'

Example response

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [-0.01225, 0.00207, 0.03060]
    }
  ],
  "model": "openai/text-embedding-3-large",
  "usage": {
    "prompt_tokens": 8,
    "total_tokens": 8
  }
}
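
Since each item in data carries an index, a client can recover the vectors in input order without relying on the order of the array. A minimal sketch, with a hypothetical helper name and made-up vectors:

```python
from typing import List


def vectors_in_input_order(response: dict) -> List[List[float]]:
    """Extract embeddings from an OpenAI-style response, sorted by index."""
    if response.get("object") != "list":
        raise ValueError("unexpected response shape")
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


# Illustrative response with items deliberately out of order.
sample = {
    "object": "list",
    "data": [
        {"object": "embedding", "index": 1, "embedding": [0.3, 0.4]},
        {"object": "embedding", "index": 0, "embedding": [0.1, 0.2]},
    ],
    "model": "openai/text-embedding-3-large",
    "usage": {"prompt_tokens": 8, "total_tokens": 8},
}
vectors = vectors_in_input_order(sample)
```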

Canonical projection

Every embedding returned by Schift is projected into a shared canonical latent space. This enables:

Model switching without re-embedding. Move from OpenAI to Voyage without touching your vector store.
Cross-model search. Query vectors from one model can retrieve documents embedded with another.
Dimension reduction. Request any output dimension regardless of the source model’s native dimension.

The projection happens transparently. Call the API normally and Schift handles the rest.
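
To make the cross-model search claim concrete, the sketch below ranks document vectors against a query vector. It assumes that both sides were returned by Schift at the same output dimension, and it uses cosine similarity, which is a choice made for this illustration; this document does not specify a similarity metric. The vectors are made up.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must share the same dimension")
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("zero vector has no direction")
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def rank(query: Sequence[float],
         docs: Dict[str, Sequence[float]]) -> List[str]:
    """Return document IDs ordered from most to least similar."""
    return sorted(docs, key=lambda doc_id: cosine(query, docs[doc_id]),
                  reverse=True)


# Hypothetical case: the query was embedded with one model and the
# documents with another, all at the same requested dimension.
query_vec = [1.0, 0.0]
doc_vecs = {"a": [0.0, 1.0], "b": [1.0, 0.1], "c": [-1.0, 0.0]}
ranking = rank(query_vec, doc_vecs)
```

The dimension check matters in practice: if the stored documents use 1024 dimensions, the query would need to be requested with the same dimensions value.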

Routing and failover

Your organization can configure a default embedding model and optional fallback. When failover mode is enabled, Schift automatically retries with the fallback model if the primary provider is unavailable.

Use the routing endpoints to configure defaults:

curl -X PUT https://api.schift.io/v1/routing \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $SCHIFT_API_KEY" \
  -d '{
    "primary": "openai/text-embedding-3-large",
    "fallback": "voyage/voyage-4-large",
    "mode": "fallback"
  }'

The response from every embed endpoint includes the model field so you always know which model was used.
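
One way to use the model field is to compare it with the model that was requested. The sketch below treats a mismatch as a sign that a different model served the request. It skips the auto-routed alias, whose resolved model is expected to differ; the helper name is hypothetical and the interpretation is an assumption of this sketch.

```python
# Aliases whose response model is expected to differ from the request.
AUTO_ROUTED_ALIASES = frozenset({"schift-embed-1-small"})


def served_by_other_model(requested: str, response: dict) -> bool:
    """True when an explicitly requested model was not the one used."""
    if requested in AUTO_ROUTED_ALIASES:
        # The alias resolves to an underlying model by design,
        # so a mismatch says nothing about failover.
        return False
    return response["model"] != requested
```

A caller might log or count such mismatches to see how often the fallback is being used.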

Rate limits and quotas

Rate limits are enforced per organization. Usage is tracked in real time. The following axes are checked for embedding endpoints:

Endpoint	Quota axis
`POST /v1/embed`	`embed`
`POST /v1/embed/batch`	`embed_batch`
`POST /v1/embed/image`	`embed_image`
`POST /v1/embed/jobs`	`embed_batch`
`POST /v1/embeddings`	`embed` or `embed_batch`

Retrieve current usage with:

curl https://api.schift.io/v1/usage/summary \
  -H "Authorization: Bearer $SCHIFT_API_KEY"

Bulk embed jobs are assigned a priority based on your organization’s plan:

Plan	Priority
Enterprise	0
Business	1
Pro / Paid / Starter	2
Free	3

Lower numbers are processed first.

Error codes

Common errors

Status	Meaning
`400`	Bad request — unknown model, invalid dimensions, empty input, or token limit exceeded.
`401`	Invalid or expired API key.
`402`	Insufficient credits or quota exceeded.
`403`	Feature not available on your plan or entitlement limit reached.
`502`	Upstream embedding provider failed (both primary and fallback).

Bulk job errors

Endpoint	Status	Meaning
`POST /v1/embed/batch`	`400`	More than 100 texts in a sync request. Use `POST /v1/embed/jobs`.
`POST /v1/embed/jobs`	`400`	Empty texts, more than 10,000 texts, or a text exceeds the model’s `max_tokens`.
`GET /v1/embed/jobs/{job_id}`	`404`	Job not found or belongs to another organization.
`POST /v1/embed/jobs/{job_id}/cancel`	`400`	Job is already processing or completed.
`POST /v1/embed/jobs/{job_id}/cancel`	`409`	Job is no longer queued.
`GET /v1/embed/jobs/{job_id}/result`	`404`	Job or result not found.
`GET /v1/embed/jobs/{job_id}/result`	`409`	Result not ready yet.
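
A client can turn these status codes into a small decision function. The mapping below is one possible policy, not documented behavior: it retries only on 502, treats 409 from the result endpoint as "not ready yet", and fails on everything else. The action labels are hypothetical.

```python
def next_action(status: int, fetching_result: bool = False) -> str:
    """Map an HTTP status to 'ok', 'retry', 'poll_again', or 'fail'."""
    if 200 <= status < 300:
        return "ok"
    if status == 502:
        # Upstream provider failure: may succeed on a later attempt.
        return "retry"
    if status == 409 and fetching_result:
        # Result not ready yet: check the job status again later.
        return "poll_again"
    # 400, 401, 402, 403, 404 and other codes need a change on the
    # caller's side, so repeating the same request is unlikely to help.
    return "fail"
```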
