Core Concepts
Schift separates the default Schift Embed 1 path from migration-first vector conversion so each surface stays explicit.
Embedding vectors from different models are not natively compatible. Even when dimensions match, the geometry changes with the model.
Two product surfaces
| Surface | Primary objects | Use it for |
|---|---|---|
| Schift Embed 1 default path | Schift, catalog, embed, db, query, usage | New embedding calls, hosted collections, and canonical-space retrieval |
| Migration-first runtime | Client, Projection, migrate(), adapters | Learning a projection matrix and rewriting stored vectors without raw text |
Projection matrix
```text
source vectors (N, src_dim)
  -> learned matrix W (src_dim, tgt_dim)
  -> projected vectors (N, tgt_dim)
```

Schift learns W from paired examples embedded by both models. At runtime, migration is just a matrix multiply on the existing vectors.
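To make the shapes concrete, here is a minimal NumPy sketch of fitting and applying a projection. It assumes an ordinary least-squares fit, which is an illustrative choice and not necessarily how Schift learns W. The helper names `fit_projection` and `project` are hypothetical, and the paired data is synthetic.

```python
import numpy as np


def fit_projection(src: np.ndarray, tgt: np.ndarray) -> np.ndarray:
    """Return W of shape (src_dim, tgt_dim) minimizing ||src @ W - tgt||.

    Least squares is an assumption made for this sketch only.
    """
    W, *_ = np.linalg.lstsq(src, tgt, rcond=None)
    return W


def project(vectors: np.ndarray, W: np.ndarray) -> np.ndarray:
    """Migrate stored vectors with a single matrix multiply."""
    return vectors @ W


rng = np.random.default_rng(0)
src_dim, tgt_dim, n_pairs = 16, 8, 500

# Synthetic stand-ins for the same texts embedded by both models.
true_map = rng.normal(size=(src_dim, tgt_dim))
paired_src = rng.normal(size=(n_pairs, src_dim))
paired_tgt = paired_src @ true_map

W = fit_projection(paired_src, paired_tgt)

# Existing vectors are converted without access to the raw text.
stored = rng.normal(size=(1000, src_dim))
migrated = project(stored, W)
```

Real embedding pairs are not an exact linear map of each other, so a fit like this should be evaluated on held-out pairs before any stored vectors are rewritten.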
Adapter pipeline
```text
Adapter(source store)
  -> Projection.transform(batch.embeddings)
  -> Adapter(sink store)
```

| Property | Typical range |
|---|---|
| Paired samples needed | 500-2,000 |
| Supported migration shapes | 1536->768, 3072->1024, and more |
| Runtime cost | Sub-millisecond per vector |
| Bulk raw text required | No |
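The source-to-sink flow above can be sketched as a small batch loop. Everything in this block is a hypothetical stand-in written for illustration: `Batch`, `InMemoryAdapter`, its `read_batches` and `write` methods, and this `Projection` and `migrate` are not the Schift SDK classes, whose real signatures may differ.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List

import numpy as np


@dataclass
class Batch:
    ids: List[str]
    embeddings: np.ndarray  # shape (batch_size, dim)


class InMemoryAdapter:
    """Hypothetical adapter that keeps vectors in a dict."""

    def __init__(self) -> None:
        self.rows: Dict[str, np.ndarray] = {}

    def write(self, batch: Batch) -> None:
        for doc_id, vec in zip(batch.ids, batch.embeddings):
            self.rows[doc_id] = vec

    def read_batches(self, size: int) -> Iterator[Batch]:
        ids = list(self.rows)
        for start in range(0, len(ids), size):
            chunk = ids[start:start + size]
            yield Batch(chunk, np.stack([self.rows[i] for i in chunk]))


class Projection:
    """Hypothetical projection wrapping a learned matrix W."""

    def __init__(self, W: np.ndarray) -> None:
        self.W = W

    def transform(self, embeddings: np.ndarray) -> np.ndarray:
        return embeddings @ self.W


def migrate(source: InMemoryAdapter, projection: Projection,
            sink: InMemoryAdapter, batch_size: int = 256) -> int:
    """Stream batches from source, project them, write to sink."""
    moved = 0
    for batch in source.read_batches(batch_size):
        projected = projection.transform(batch.embeddings)
        sink.write(Batch(batch.ids, projected))
        moved += len(batch.ids)
    return moved


source, sink = InMemoryAdapter(), InMemoryAdapter()
source.write(Batch([f"doc-{i}" for i in range(5)], np.ones((5, 4))))
moved = migrate(source, Projection(np.ones((4, 2))), sink, batch_size=2)
```

Only ids and embeddings pass through the loop, which is why this path needs no bulk raw text.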
Operational note
Projection is the brownfield path. It preserves retrieval quality during provider changes, while Schift Embed 1 is the greenfield default path into the same space.
Temporal Constraints
Every vector can carry an event_time metadata field (epoch milliseconds). The temporal query parameter filters results by time, enabling point-in-time retrieval and time-series search.
| Mode | Behavior | Required Fields |
|---|---|---|
| before | event_time < temporal_start | temporal_start |
| after | event_time > temporal_start | temporal_start |
| between | temporal_start <= event_time <= temporal_end | temporal_start, temporal_end |
| as_of | event_time <= temporal_start (snapshot at a point in time) | temporal_start |
| latest | Sort by most recent event_time | None |
Vectors without an event_time field are excluded from all temporal queries. The latest mode over-fetches candidates by 20x before sorting to ensure quality results.
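The mode semantics in the table can be sketched as a plain-Python filter. This is an illustration of the documented comparisons only, not Schift's server-side implementation. The function name `apply_temporal` and the dict-shaped hits are assumptions.

```python
from typing import Dict, List, Optional


def apply_temporal(hits: List[Dict], mode: str,
                   temporal_start: Optional[int] = None,
                   temporal_end: Optional[int] = None) -> List[Dict]:
    """Filter (or sort) hits by event_time, in epoch milliseconds."""
    # Vectors without event_time never match a temporal query.
    timed = [h for h in hits if h.get("event_time") is not None]

    if mode == "before":
        return [h for h in timed if h["event_time"] < temporal_start]
    if mode == "after":
        return [h for h in timed if h["event_time"] > temporal_start]
    if mode == "between":
        return [h for h in timed
                if temporal_start <= h["event_time"] <= temporal_end]
    if mode == "as_of":
        return [h for h in timed if h["event_time"] <= temporal_start]
    if mode == "latest":
        # No bounds required: most recent first.
        return sorted(timed, key=lambda h: h["event_time"], reverse=True)
    raise ValueError(f"unknown temporal mode: {mode}")


hits = [
    {"id": "a", "event_time": 100},
    {"id": "b", "event_time": 200},
    {"id": "c"},  # no event_time, so always excluded
]
after_100 = [h["id"] for h in apply_temporal(hits, "after", 100)]
newest_first = [h["id"] for h in apply_temporal(hits, "latest")]
```

Note that `before` and `after` are strict comparisons, while `between` and `as_of` include the boundary.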
```python
import time

from schift import Schift

client = Schift(api_key="sch_xxx")

# Documents from the last 24 hours (event_time is in epoch milliseconds)
one_day_ago = int((time.time() - 86400) * 1000)
results = client.collections.search(
    "news", "latest AI developments",
    temporal="after", temporal_start=one_day_ago,
)

# Snapshot: documents with event_time on or before Jan 1, 2025 00:00 UTC
results = client.collections.search(
    "contracts", "termination clause",
    temporal="as_of", temporal_start=1735689600000,
)
```

Edges & Graph
Schift buckets support typed, weighted edges between vectors, turning a flat vector store into a knowledge graph. Edges encode domain relationships like supersedes, contradicts, or caused_by.
| Relation | Use Case |
|---|---|
| related_to | General association (default) |
| supersedes | Newer version replaces older document |
| contradicts | Two documents conflict |
| caused_by | Causal chain between events or issues |
| is_a | Taxonomy / classification hierarchy |
| has_child | Parent-child tree structure |
| follows | Sequential ordering (e.g., legal articles) |
Edges are stored in a compact compressed sparse row (CSR) format, with durability backed by a write-ahead log (WAL). Each edge carries a weight (0.0-1.0) and can be queried by direction (outgoing, incoming, or both) and relation type.
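For readers unfamiliar with CSR, the sketch below shows how typed, weighted edges can be packed into offset and target arrays and queried by direction and relation. It is a toy in-memory model under stated assumptions: the class `CsrGraph` and its methods are hypothetical, and it says nothing about Schift's actual on-disk layout or its WAL.

```python
from typing import List, Optional, Tuple

Edge = Tuple[str, str, str, float]  # (source, target, relation, weight)


class CsrGraph:
    """Toy CSR adjacency: row i owns slots offsets[i]..offsets[i+1]."""

    def __init__(self, edges: List[Edge]) -> None:
        self.nodes = sorted({n for s, t, _, _ in edges for n in (s, t)})
        self.index = {n: i for i, n in enumerate(self.nodes)}
        ordered = sorted(edges, key=lambda e: self.index[e[0]])

        # Count edges per source row, then prefix-sum into offsets.
        self.offsets = [0] * (len(self.nodes) + 1)
        for s, _, _, _ in ordered:
            self.offsets[self.index[s] + 1] += 1
        for i in range(len(self.nodes)):
            self.offsets[i + 1] += self.offsets[i]

        self.targets = [self.index[t] for _, t, _, _ in ordered]
        self.relations = [r for _, _, r, _ in ordered]
        self.weights = [w for _, _, _, w in ordered]

    def outgoing(self, node: str, relation: Optional[str] = None):
        i = self.index[node]
        return [
            (self.nodes[self.targets[k]], self.relations[k], self.weights[k])
            for k in range(self.offsets[i], self.offsets[i + 1])
            if relation is None or self.relations[k] == relation
        ]

    def incoming(self, node: str, relation: Optional[str] = None):
        # Full scan for simplicity; a second, transposed CSR would avoid it.
        i = self.index[node]
        return [
            (self.nodes[src], self.relations[k], self.weights[k])
            for src in range(len(self.nodes))
            for k in range(self.offsets[src], self.offsets[src + 1])
            if self.targets[k] == i
            and (relation is None or self.relations[k] == relation)
        ]


graph = CsrGraph([
    ("article-301", "article-42", "supersedes", 1.0),
    ("ruling-2024-001", "article-301", "caused_by", 0.8),
])
out_edges = graph.outgoing("article-301")
in_edges = graph.incoming("article-301", relation="caused_by")
```

Outgoing lookups touch only one contiguous slice of the arrays, which is the main reason CSR suits read-heavy graph queries.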
```bash
# Add citation edges between legal documents
curl -X POST "https://api.schift.io/v1/buckets/{bucket_id}/edges" \
  -H "Authorization: Bearer $SCHIFT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"edges": [
    {"source": "article-301", "target": "article-42", "relation": "supersedes", "weight": 1.0},
    {"source": "ruling-2024-001", "target": "article-301", "relation": "caused_by", "weight": 0.8}
  ]}'

# Query outgoing edges from a node
curl "https://api.schift.io/v1/buckets/{bucket_id}/edges/article-301?direction=outgoing" \
  -H "Authorization: Bearer $SCHIFT_API_KEY"
```