Core Concepts
Schift separates the default Schift Embed 1 path from migration-first vector conversion so each surface stays explicit.
Embedding vectors from different models are not natively compatible. Even when dimensions match, the geometry changes with the model.
Two product surfaces
| Surface | Primary objects | Use it for |
|---|---|---|
| Schift Embed 1 default path | Schift, catalog, embed, db, query, usage | New embedding calls, hosted collections, and canonical-space retrieval |
| Migration-first runtime | Client, Projection, migrate(), adapters | Learning a projection matrix and rewriting stored vectors without raw text |
Projection matrix
```text
source vectors (N, src_dim)
  -> learned matrix W (src_dim, tgt_dim)
  -> projected vectors (N, tgt_dim)
```

Schift learns W from paired examples embedded by both models. At runtime, migration is just a matrix multiply on the existing vectors.
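The runtime step can be sketched in plain Python to make the shapes concrete. This is an illustration only, not Schift's implementation: the `project` helper and the hand-written toy matrix `W` are hypothetical, and a real `W` would be learned from paired examples rather than typed in.

```python
from typing import List

Matrix = List[List[float]]


def project(vectors: Matrix, W: Matrix) -> Matrix:
    """Multiply (N, src_dim) vectors by a (src_dim, tgt_dim) matrix W."""
    src_dim = len(W)
    tgt_dim = len(W[0])
    projected = []
    for v in vectors:
        if len(v) != src_dim:
            raise ValueError(f"expected {src_dim} dimensions, got {len(v)}")
        # Each output component is the dot product of v with one column of W.
        projected.append(
            [sum(v[i] * W[i][j] for i in range(src_dim)) for j in range(tgt_dim)]
        )
    return projected


# Toy example (hypothetical values): N=2 stored vectors, src_dim=3, tgt_dim=2.
W = [
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
]
stored = [[1.0, 2.0, 3.0], [0.0, 1.0, 0.0]]
migrated = project(stored, W)
print(migrated)  # [[4.0, 5.0], [0.0, 1.0]]
```

Only the stored vectors and `W` are needed here, which is why the migration path does not require the raw text.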
Adapter pipeline
```text
Adapter(source store)
  -> Projection.transform(batch.embeddings)
  -> Adapter(sink store)
```

| Property | Typical range |
|---|---|
| Paired samples needed | 500-2,000 |
| Supported migration shapes | 1536->768, 3072->1024, and more |
| Runtime cost | Sub-millisecond per vector |
| Bulk raw text required | No |
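The read, transform, write flow of the adapter pipeline can be sketched with in-memory stand-ins. Everything below is hypothetical and is not the Schift SDK: `InMemoryAdapter`, `ToyProjection`, `Batch`, and `run_migration` are names invented for illustration, and the real `Adapter` and `Projection` interfaces may differ.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List


@dataclass
class Batch:
    ids: List[str]
    embeddings: List[List[float]]


@dataclass
class InMemoryAdapter:
    """Hypothetical stand-in for a vector store adapter."""
    rows: Dict[str, List[float]] = field(default_factory=dict)

    def read_batches(self, size: int) -> Iterator[Batch]:
        ids = list(self.rows)
        for start in range(0, len(ids), size):
            chunk = ids[start:start + size]
            yield Batch(chunk, [self.rows[i] for i in chunk])

    def write_batch(self, batch: Batch) -> None:
        for id_, emb in zip(batch.ids, batch.embeddings):
            self.rows[id_] = emb


class ToyProjection:
    """Hypothetical projection holding a (src_dim, tgt_dim) matrix W."""

    def __init__(self, W: List[List[float]]) -> None:
        self.W = W

    def transform(self, embeddings: List[List[float]]) -> List[List[float]]:
        # zip(*W) iterates over the columns of W; each output component
        # is the dot product of a vector with one column.
        return [
            [sum(x * w for x, w in zip(v, col)) for col in zip(*self.W)]
            for v in embeddings
        ]


def run_migration(source, projection, sink, batch_size: int = 2) -> int:
    """Stream batches from source, project them, and write them to sink."""
    moved = 0
    for batch in source.read_batches(batch_size):
        projected = projection.transform(batch.embeddings)
        sink.write_batch(Batch(batch.ids, projected))
        moved += len(batch.ids)
    return moved


source = InMemoryAdapter({"a": [1.0, 2.0, 3.0], "b": [0.0, 1.0, 0.0], "c": [2.0, 0.0, 0.0]})
sink = InMemoryAdapter()
projection = ToyProjection([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
moved = run_migration(source, projection, sink)
```

Processing in batches keeps memory bounded, and the IDs pass through unchanged, so only the embeddings are rewritten in the sink store.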
Operational note
Projection is the brownfield path. It preserves retrieval quality during provider changes, while Schift Embed 1 is the greenfield default path into the same space.
Temporal Constraints
Every vector can carry an event_time metadata field (epoch milliseconds). The temporal query parameter filters results by time, enabling point-in-time retrieval and time-series search.
| Mode | Behavior | Required Fields |
|---|---|---|
| before | event_time < temporal_start | temporal_start |
| after | event_time > temporal_start | temporal_start |
| between | temporal_start <= event_time <= temporal_end | temporal_start, temporal_end |
| as_of | event_time <= temporal_start (snapshot at a point in time) | temporal_start |
| latest | Sort by most recent event_time | None |
Vectors without an event_time field are excluded from all temporal queries. The latest mode over-fetches candidates by 20x before sorting by event_time, so the most recent results are still drawn from a pool of relevant matches.
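The mode semantics in the table can be restated as a small local reference filter. This is a sketch for reasoning about the comparisons, not the server implementation: `to_epoch_ms` and `temporal_filter` are hypothetical helpers, records are assumed to be plain dicts, and the 20x over-fetch of the latest mode is not modeled.

```python
from datetime import datetime, timezone
from typing import Dict, List, Optional


def to_epoch_ms(dt: datetime) -> int:
    """Convert a timezone-aware datetime to epoch milliseconds."""
    if dt.tzinfo is None:
        raise ValueError("use a timezone-aware datetime")
    return int(dt.timestamp() * 1000)


def temporal_filter(
    records: List[Dict],
    mode: str,
    temporal_start: Optional[int] = None,
    temporal_end: Optional[int] = None,
) -> List[Dict]:
    # Records without event_time never match a temporal query.
    timed = [r for r in records if r.get("event_time") is not None]
    if mode == "latest":
        return sorted(timed, key=lambda r: r["event_time"], reverse=True)

    predicates = {
        "before": lambda t: t < temporal_start,
        "after": lambda t: t > temporal_start,
        "between": lambda t: temporal_start <= t <= temporal_end,
        "as_of": lambda t: t <= temporal_start,
    }
    if mode not in predicates:
        raise ValueError(f"unknown temporal mode: {mode}")
    if temporal_start is None:
        raise ValueError(f"{mode} requires temporal_start")
    if mode == "between" and temporal_end is None:
        raise ValueError("between requires temporal_end")
    return [r for r in timed if predicates[mode](r["event_time"])]


records = [
    {"id": "a", "event_time": 100},
    {"id": "b", "event_time": 200},
    {"id": "c", "event_time": 300},
    {"id": "d"},  # no event_time: excluded from every mode
]
as_of_ids = [r["id"] for r in temporal_filter(records, "as_of", temporal_start=200)]
print(as_of_ids)  # ['a', 'b']
print(to_epoch_ms(datetime(2026, 1, 1, tzinfo=timezone.utc)))  # 1767225600000
```

Note that before and after are strict comparisons, while as_of and between include their boundary values.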
```python
import time

from schift import Schift

client = Schift(api_key="sch_xxx")

# Documents from the last 24 hours
one_day_ago = int((time.time() - 86400) * 1000)
results = client.collections.search(
    "news", "latest AI developments",
    temporal="after", temporal_start=one_day_ago,
)

# Snapshot: documents with event_time on or before Jan 1, 2026 (00:00 UTC)
results = client.collections.search(
    "contracts", "termination clause",
    temporal="as_of", temporal_start=1767225600000,
)
```

Edges & Graph
Schift can use document relationships as part of the server-side retrieval pipeline. New integrations should consume the v2 knowledge search result instead of managing graph edges directly.
| Relation | Use Case |
|---|---|
| related_to | General association (default) |
| supersedes | Newer version replaces older document |
| contradicts | Two documents conflict |
| caused_by | Causal chain between events or issues |
| is_a | Taxonomy / classification hierarchy |
| has_child | Parent-child tree structure |
| follows | Sequential ordering (e.g., legal articles) |
Relationship expansion is treated as an internal retrieval concern for new products. The client receives packed context, citations, and warnings from one search call.
```bash
# Ask through the v2 knowledge-search pipeline
# (replace {bucket_id} with your bucket ID)
curl -X POST https://api.schift.io/v2/buckets/{bucket_id}/search \
  -H "Authorization: Bearer $SCHIFT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "Which article supersedes article 42?", "top_k": 8}'
```
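For callers working in Python, the same request can be assembled with the standard library. This is a minimal sketch that mirrors the curl call above: `build_search_request` is a hypothetical helper, and `"my-bucket"` and `"sch_xxx"` are placeholder values. The response schema is not shown in this section, so the sketch stops at building the request.

```python
import json
import urllib.request
from urllib.parse import quote


def build_search_request(
    bucket_id: str, query: str, top_k: int, api_key: str
) -> urllib.request.Request:
    """Build (but do not send) a v2 knowledge-search request."""
    # Percent-encode the bucket ID so it is safe to place in the URL path.
    url = f"https://api.schift.io/v2/buckets/{quote(bucket_id, safe='')}/search"
    body = json.dumps({"query": query, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


request = build_search_request(
    "my-bucket", "Which article supersedes article 42?", 8, "sch_xxx"
)
# To send it: urllib.request.urlopen(request), then parse the JSON response.
```

Keeping request construction separate from sending makes the payload easy to inspect or unit test without network access.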