Projection

Projection is the process that maps an embedding from a source model’s vector space into Schift’s shared canonical latent space. Every embedding that passes through Schift is projected before it is stored or returned, which lets you change models, mix models, and change output dimensions without re-embedding your data.

Embedding models from different providers produce vectors with different sizes, scales, and semantic geometries. A vector from openai/text-embedding-3-large (3072 dimensions) and a vector from voyage/voyage-4 (1024 dimensions) are not directly comparable, and they are not interchangeable in the same vector store.

Schift solves this by projecting every embedding into a single canonical space. This gives you three practical benefits:

  • Switch models without re-embedding. You can move from one provider to another without rebuilding your vector store.
  • Cross-model retrieval. A query embedded with one model can retrieve documents embedded with a different model.
  • Flexible output dimensions. You can request any supported output dimension regardless of the source model’s native size.

Schift’s projection layer is implemented as an adaptive, mixture-of-experts projector. The flow has two stages:

  1. to_canonical — map a source-model embedding into the canonical space.
  2. from_canonical — map the canonical embedding to the output dimension you requested.

The canonical space has a fixed dimension of 1024. This dimension is large enough to preserve semantic signal across a wide range of source models while staying small enough to keep storage and search costs predictable.

For each source model, Schift keeps a learned base projection matrix W_base. This matrix is trained with ridge regression to map embeddings from the source model’s native space into the 1024-dimensional canonical space. Base matrices are shared across all organizations.
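
As a reference point, the standard closed-form ridge regression solution for a linear map of this shape is written out below. The symbols are assumptions made for illustration and do not come from Schift's documentation: X stacks n source embeddings as rows, Y stacks the paired canonical targets, d_src is the source model's native dimension, and λ is a regularization strength. Schift's actual training setup may differ.

```latex
% Assumed setup: n paired examples; rows of X are source embeddings,
% rows of Y are the matching 1024-dimensional canonical targets.
\[
X \in \mathbb{R}^{n \times d_{\text{src}}}, \qquad
Y \in \mathbb{R}^{n \times 1024}, \qquad
W \in \mathbb{R}^{1024 \times d_{\text{src}}}
\]

% Ridge objective and its closed-form minimizer (lambda > 0).
\[
W_{\text{base}}
  = \arg\min_{W} \; \lVert X W^{\top} - Y \rVert_F^2 + \lambda \lVert W \rVert_F^2
  = Y^{\top} X \left( X^{\top} X + \lambda I \right)^{-1}
\]
```

With this convention, a single source embedding x (a column vector) is projected as W_base x. The λ term keeps the inverse well conditioned even when the number of paired examples is small relative to d_src.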

In addition to the shared base matrix, Schift supports domain-specific low-rank corrections called experts. Each expert is a small ΔW matrix that adapts the base projection for a particular domain, corpus, or task.
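
The sketch below shows one common way a low-rank correction can be applied on top of a base matrix. It assumes that ΔW is stored in factored form as A·B with a small inner rank r; that factorization, the function names, and the `weight` parameter are all hypothetical and are not taken from Schift's implementation.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def matvec(m: Matrix, x: Sequence[float]) -> List[float]:
    """Multiply a row-major matrix by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]


def project_with_expert(
    x: Sequence[float],
    w_base: Matrix,       # shape (d_out, d_in): shared base projection
    a: Matrix,            # shape (d_out, r): assumed left factor of delta W
    b: Matrix,            # shape (r, d_in): assumed right factor of delta W
    weight: float = 1.0,  # hypothetical blend weight for this expert
) -> List[float]:
    """Compute (W_base + weight * A @ B) @ x without materializing delta W."""
    base = matvec(w_base, x)
    # Apply B first, then A, so the cost scales with r instead of d_out * d_in.
    correction = matvec(a, matvec(b, x))
    return [p + weight * q for p, q in zip(base, correction)]


# Toy dimensions: d_in = 3, d_out = 2, r = 1.
w_base = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
a = [[0.5], [0.0]]
b = [[0.0, 0.0, 1.0]]
projected = project_with_expert([1.0, 2.0, 4.0], w_base, a, b)
```

Keeping the correction factored means an expert adds only r·(d_in + d_out) parameters instead of a full d_out × d_in matrix, which is what makes per-domain or per-organization experts cheap to store.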

Experts can be:

  • Global — shared across all organizations for common domains.
  • Organization-specific — trained on a single organization’s corpus.
  • Task-specific — matched to embedding task types such as retrieval_query or classification.

When an embedding is projected, Schift routes to the right mix of experts by comparing centroids. It computes the cosine similarity between the centroid of the incoming batch and each expert’s centroid, then applies a softmax-weighted blend of the matching expert corrections. If a task_type is provided, task-matched experts are preferred; otherwise the router falls back to universal experts.
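
The routing step described above can be sketched in a few lines. This is a minimal illustration, not Schift's router: the function names, the `temperature` parameter, and the expert names are hypothetical, and task-type preference is left out for brevity.

```python
import math
from typing import Dict, List, Sequence


def centroid(batch: Sequence[Sequence[float]]) -> List[float]:
    """Mean vector of a non-empty batch of equal-length embeddings."""
    n = len(batch)
    return [sum(col) / n for col in zip(*batch)]


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def route(
    batch: Sequence[Sequence[float]],
    expert_centroids: Dict[str, Sequence[float]],
    temperature: float = 1.0,  # assumed knob; lower values sharpen the blend
) -> Dict[str, float]:
    """Return softmax blend weights over experts for this batch."""
    if not expert_centroids:
        return {}
    c = centroid(batch)
    scores = {
        name: cosine(c, ec) / temperature
        for name, ec in expert_centroids.items()
    }
    # Subtract the max score before exponentiating for numerical stability.
    top = max(scores.values())
    exps = {name: math.exp(s - top) for name, s in scores.items()}
    total = sum(exps.values())
    return {name: e / total for name, e in exps.items()}


weights = route(
    batch=[[1.0, 0.0], [0.8, 0.2]],
    expert_centroids={"legal": [1.0, 0.0], "medical": [0.0, 1.0]},
)
```

In this toy example the batch centroid points mostly along the first axis, so the hypothetical "legal" expert receives the larger weight. The weights always sum to 1, so the blended correction stays on the same scale as a single expert's correction.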

For the auto-routed schift-embed-1-small model, the source embedding is already produced in the canonical space, so the projector skips the expert correction step and uses only the base projection.

After an embedding is in canonical form, from_canonical adapts it to the requested output dimension:

  • If the requested dimension is smaller than 1024, the vector is truncated to the leading dimensions and re-normalized.
  • If the requested dimension is larger than 1024, the vector is padded with zeros.
  • If the requested dimension is exactly 1024, the canonical vector is returned unchanged.

This makes it possible to request, for example, 512-dimensional embeddings from a model whose native output is 3072 dimensions.
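
The three cases above can be expressed as a short function. This is a sketch under stated assumptions: the function body is illustrative rather than Schift's code, and "re-normalized" is taken to mean rescaling to unit L2 norm, which the documentation does not spell out.

```python
import math
from typing import List, Sequence

CANONICAL_DIM = 1024  # fixed canonical dimension stated in the text


def from_canonical(vec: Sequence[float], out_dim: int) -> List[float]:
    """Adapt a canonical vector to the requested output dimension."""
    if len(vec) != CANONICAL_DIM:
        raise ValueError(f"expected {CANONICAL_DIM} dimensions, got {len(vec)}")
    if out_dim <= 0:
        raise ValueError("out_dim must be positive")

    if out_dim == CANONICAL_DIM:
        # Exact match: return the canonical vector unchanged.
        return list(vec)

    if out_dim > CANONICAL_DIM:
        # Larger target: pad with zeros.
        return list(vec) + [0.0] * (out_dim - CANONICAL_DIM)

    # Smaller target: keep the leading dimensions, then re-normalize
    # (assumed here to mean unit L2 norm).
    head = list(vec[:out_dim])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head


canonical = [1.0] * CANONICAL_DIM
small = from_canonical(canonical, 512)   # truncated and re-normalized
large = from_canonical(canonical, 2048)  # zero-padded
```

Zero-padding does not change a vector's norm or its dot product with another padded vector, so similarity scores between padded vectors match those computed in the 1024-dimensional canonical space.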

Projection is transparent to API callers. When you call POST /v1/embed, POST /v1/embed/batch, or submit a bulk embedding job, Schift:

  1. Sends the text to the selected provider.
  2. Receives the raw embedding.
  3. Calls to_canonical with the source model and your organization context.
  4. Calls from_canonical with your requested dimensions.
  5. Returns the final embedding in the response.

The response always includes the model field so you know which source model was actually used, even when you rely on auto-routing or failover.

Note: Projection happens after provider failover. If Schift falls back from your primary model to a backup model, the fallback embedding is projected into the same canonical space as embeddings from the primary model, so the returned vectors remain compatible.

Base matrices and expert weights are loaded from Schift’s internal projection store, either at startup or lazily. The training pipeline supports:

  • Training a new W_base from paired source and target embeddings.
  • Training organization-specific or global experts as low-rank ΔW corrections.
  • Registering experts with task tags for task-aware routing.

In most cases you do not need to train projection weights yourself. Schift maintains the shared base matrices, and organization-specific experts are created through internal tooling or managed services.