TRUEREF-0020 — Embedding Profiles, Default Local Embeddings, and Version-Scoped Semantic Retrieval
Priority: P1 Status: Pending Depends On: TRUEREF-0007, TRUEREF-0008, TRUEREF-0009, TRUEREF-0010, TRUEREF-0011, TRUEREF-0012, TRUEREF-0014, TRUEREF-0018 Blocks: TRUEREF-0019
Overview
TrueRef already has the main ingredients for embeddings and hybrid search, but the current design is still centered on a single hard-coded provider configuration and does not guarantee version-safe semantic retrieval at query time. This feature formalizes the full provider-registry approach and makes semantic retrieval production-ready for both the REST API and MCP surfaces.
The scope is intentionally narrow:
- Introduce first-class embedding profiles so custom AI providers can be registered without hard-coding provider names throughout the API, UI, and runtime.
- Enable embeddings by default using the local `@xenova/transformers` model so a fresh install provides semantic retrieval out of the box.
- Make semantic and hybrid retrieval version-scoped, so a query for a specific library and version only searches snippets indexed for that exact version.
- Extend the API and MCP `query-docs` path to use the active embedding profile at query time.
Out of scope:
- semantic repository discovery or reranking for `libs/search`
- inferring the repository from the query text
- adding multi-tenant provider isolation
Consumers are expected to pass an exact library or repository identifier and the needed version when they want version-specific retrieval.
Problem Statement
Current semantic search support has four structural gaps:
- Query-time semantic retrieval is not reliably wired to the configured provider.
- The embedding configuration shape is fixed to `openai | local | none`, which does not scale to custom provider adapters.
- Stored embeddings are keyed too narrowly to support multiple profiles or safe provider migration.
- The vector search path does not enforce version scoping as strongly as the keyword search path.
That leaves TrueRef in a state where embeddings may be generated at indexing time, but retrieval behavior, provider flexibility, and version guarantees are still weaker than required.
Goals
- Make semantic retrieval work by default on a fresh install.
- Keep the default self-hosted path fully local.
- Support custom AI providers through a provider registry plus profile system.
- Keep the API as the source of truth for retrieval behavior.
- Keep MCP as a thin compatibility layer over the API.
- Guarantee version-scoped hybrid retrieval when a versioned library ID is provided.
Non-Goals
- semantic repository search
- automatic repo selection from free-text intent
- remote provider secrets management beyond the current settings persistence model
- support for non-embedding rerankers in this ticket
Default Local Embeddings
Embeddings should be enabled by default with the local model path instead of shipping in FTS-only mode.
Default Runtime Behavior
- Install `@xenova/transformers` as a normal runtime dependency rather than treating it as optional for the default setup.
- Seed the default embedding profile to the local provider.
- Default model: `Xenova/all-MiniLM-L6-v2`
- Default dimensions: `384`
- New repositories index snippets with embeddings automatically unless the user explicitly disables embeddings.
- Query-time retrieval uses hybrid mode automatically when the active profile is healthy.
- If the local model cannot be loaded, the system should surface a clear startup or settings error instead of silently pretending semantic search is enabled.
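The fallback and error rules above can be sketched as a small mode resolver. This is a sketch only; `resolveSearchMode` and its health flags are illustrative names, not part of the current codebase:

```ts
type SearchMode = 'auto' | 'keyword' | 'semantic' | 'hybrid';

// Hypothetical resolver: decides the effective retrieval mode from the
// requested mode and the state of the active embedding profile.
function resolveSearchMode(
  requested: SearchMode,
  profileEnabled: boolean,
  profileHealthy: boolean,
): SearchMode {
  // Explicit keyword requests and disabled embeddings always mean keyword-only.
  if (requested === 'keyword' || !profileEnabled) return 'keyword';
  // A broken local model surfaces as an error rather than a silent fallback.
  if (!profileHealthy) {
    throw new Error('Embedding profile is enabled but unhealthy; fix or disable it.');
  }
  // 'auto' upgrades to hybrid when the active profile is healthy.
  return requested === 'auto' ? 'hybrid' : requested;
}
```

The key property is that a healthy default profile yields hybrid retrieval with no configuration, while an unhealthy one fails loudly.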
Acceptance Criteria
- `@xenova/transformers` is installed by default for production/runtime use
- Fresh installations default to an active local embedding profile
- No manual provider configuration is required to get semantic search on a clean setup
- The settings UI shows local embeddings as the default active profile
- Disabling embeddings remains possible from settings
Embedding Profile Registry
Replace the single enum-style config with a registry-oriented model.
Core Concepts
Provider Adapter
A provider adapter is code registered in the server runtime that knows how to validate config and generate embeddings for one provider kind.
Examples:
- `local-transformers`
- `openai-compatible`
- future custom adapters added in code without redesigning the API contract
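The adapter contract described above can be sketched as a minimal interface; method names and the stub below are illustrative assumptions, not the shipped API:

```ts
// Sketch of the adapter contract; names are illustrative.
interface EmbeddingProviderAdapter {
  readonly kind: string; // e.g. 'local-transformers', 'openai-compatible'
  validateConfig(config: Record<string, unknown>): string[]; // validation error messages
  embed(texts: string[]): Promise<number[][]>; // one vector per input text
}

// Minimal stub used here only to make the contract concrete:
// returns zero vectors with the default 384 dimensions.
const stubAdapter: EmbeddingProviderAdapter = {
  kind: 'local-transformers',
  validateConfig: () => [],
  embed: async (texts) => texts.map(() => new Array(384).fill(0)),
};
```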
Embedding Profile
An embedding profile is persisted configuration selecting one provider adapter plus its runtime settings.
```ts
interface EmbeddingProfile {
  id: string;
  providerKind: string;
  title: string;
  enabled: boolean;
  isDefault: boolean;
  config: Record<string, unknown>;
  model: string;
  dimensions: number;
  createdAt: number;
  updatedAt: number;
}
```
Registry Responsibilities
- create provider instance from profile
- validate profile config
- expose provider metadata to the settings API and UI
- allow future custom providers without widening TypeScript unions across the app
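The registry responsibilities above can be sketched as a `Map` keyed by `providerKind`. This is a sketch under assumed names (`EmbeddingProviderRegistry`, `AdapterFactory`); the real module layout may differ:

```ts
type Embedder = (texts: string[]) => Promise<number[][]>;

interface AdapterFactory {
  create(config: Record<string, unknown>): Embedder;
  validate(config: Record<string, unknown>): string[];
}

// Hypothetical registry: adapters are registered in code, so adding a new
// provider kind never widens a TypeScript union across the app.
class EmbeddingProviderRegistry {
  private adapters = new Map<string, AdapterFactory>();

  register(kind: string, factory: AdapterFactory): void {
    this.adapters.set(kind, factory);
  }

  // Resolve a provider instance from a persisted profile record.
  createFromProfile(profile: { providerKind: string; config: Record<string, unknown> }): Embedder {
    const factory = this.adapters.get(profile.providerKind);
    if (!factory) throw new Error(`Unknown provider kind: ${profile.providerKind}`);
    return factory.create(profile.config);
  }

  // Metadata for the settings API/UI: which kinds can be configured.
  availableKinds(): string[] {
    return [...this.adapters.keys()];
  }
}
```

Because resolution happens per call, a profile change takes effect on the next request without a server restart.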
Acceptance Criteria
- Provider selection is no longer hard-coded to `openai | local | none`
- Providers are instantiated through a registry keyed by `providerKind`
- Profiles are stored as first-class records rather than a single settings blob
- One profile can be marked as the default active profile for indexing and retrieval
- Settings endpoints return profile data and provider metadata cleanly
Data Model Changes
The current `snippet_embeddings` shape is insufficient for multiple profiles because it allows only one embedding row per snippet.
New Tables / Changes
embedding_profiles
```ts
embeddingProfiles {
  id: text('id').primaryKey(),
  providerKind: text('provider_kind').notNull(),
  title: text('title').notNull(),
  enabled: integer('enabled', { mode: 'boolean' }).notNull().default(true),
  isDefault: integer('is_default', { mode: 'boolean' }).notNull().default(false),
  model: text('model').notNull(),
  dimensions: integer('dimensions').notNull(),
  config: text('config', { mode: 'json' }).notNull(),
  createdAt: integer('created_at').notNull(),
  updatedAt: integer('updated_at').notNull(),
}
```
snippet_embeddings
Add `profile_id` and replace the single-row-per-snippet constraint with a composite key or unique index on `(snippet_id, profile_id)`.
```ts
snippetEmbeddings {
  snippetId: text('snippet_id').notNull(),
  profileId: text('profile_id').notNull(),
  model: text('model').notNull(),
  dimensions: integer('dimensions').notNull(),
  embedding: blob('embedding').notNull(),
  createdAt: integer('created_at').notNull(),
}
```
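Spelled out in full Drizzle syntax, the composite key could look like the sketch below. Column names follow the schema above; the extra-config callback with `primaryKey` is standard Drizzle SQLite usage, but the exact shape in the repository may differ:

```ts
import { sqliteTable, text, integer, blob, primaryKey } from 'drizzle-orm/sqlite-core';

// Sketch: one embedding row per (snippet, profile) pair.
export const snippetEmbeddings = sqliteTable(
  'snippet_embeddings',
  {
    snippetId: text('snippet_id').notNull(),
    profileId: text('profile_id').notNull(),
    model: text('model').notNull(),
    dimensions: integer('dimensions').notNull(),
    embedding: blob('embedding').notNull(),
    createdAt: integer('created_at').notNull(),
  },
  (t) => ({
    pk: primaryKey({ columns: [t.snippetId, t.profileId] }),
  }),
);
```

The composite primary key is what lets two profiles (for example, during a provider migration) hold embeddings for the same snippet side by side.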
Migration Requirements
- migration adds `embedding_profiles`
- migration updates `snippet_embeddings` for profile scoping
- migration seeds a default local profile using `Xenova/all-MiniLM-L6-v2`
- migration safely maps existing single-provider configs into one default profile when upgrading
Query-Time Semantic Retrieval
The API must resolve the active embedding profile at request time instead of baking provider selection into startup-only flows.
API Behavior
GET /api/v1/context
- keeps `libraryId`, `query`, `tokens`, and `type`
- adds optional `searchMode=auto|keyword|semantic|hybrid`
- adds optional `alpha` for hybrid blending
- uses the default active embedding profile when `searchMode` is `auto`, `semantic`, or `hybrid`
- falls back to keyword mode only when embeddings are disabled or the caller explicitly requests keyword mode
Version-Scoped Retrieval Rules
- when `libraryId` includes a version, both FTS and vector retrieval must filter to the resolved `versionId`
- re-fetching snippets after ranking must also preserve `versionId`
- default-branch snippets must not bleed into versioned queries
- one version's embeddings must not be compared against another version's snippets for the same repository
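The rules above reduce to "filter first, then blend". The sketch below assumes normalized scores and the common convention that `alpha` near 1 favors the vector score; the ticket does not fix the exact formula, so treat this as illustrative:

```ts
interface RankedSnippet {
  snippetId: string;
  versionId: string;
  keywordScore: number; // normalized FTS score in [0, 1]
  vectorScore: number;  // normalized cosine similarity in [0, 1]
}

// Hypothetical hybrid ranking: restrict candidates to the resolved version
// before blending, so other versions can never leak into the results.
function rankHybrid(
  candidates: RankedSnippet[],
  resolvedVersionId: string | null,
  alpha: number,
): RankedSnippet[] {
  const inScope = resolvedVersionId
    ? candidates.filter((c) => c.versionId === resolvedVersionId)
    : candidates;
  const blended = (s: RankedSnippet) => alpha * s.vectorScore + (1 - alpha) * s.keywordScore;
  return [...inScope].sort((a, b) => blended(b) - blended(a));
}
```

Filtering before ranking (rather than after) is what guarantees the ranked list can never contain a default-branch or wrong-version snippet.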
Acceptance Criteria
- `/api/v1/context` loads the active embedding profile at request time
- hybrid retrieval works without restarting the server after profile changes
- `searchMode` is supported for context queries
- versioned `libraryId` queries enforce version filters in both FTS and vector phases
- JSON responses can include retrieval metadata such as mode, profile ID, model, and alpha
MCP Surface
MCP should stay thin and inherit semantic behavior from the API.
query-docs
Extend the MCP tool schema to support:
- `searchMode?: 'auto' | 'keyword' | 'semantic' | 'hybrid'`
- `alpha?: number`
The MCP server should forward these options directly to `/api/v1/context`.
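Forwarding can be a simple query-string pass-through. The helper below is a sketch (`buildContextUrl` is a hypothetical name); the parameter names mirror the API surface described above:

```ts
// Hypothetical forwarding helper for the MCP query-docs tool.
function buildContextUrl(
  baseUrl: string,
  params: {
    libraryId: string;
    query: string;
    searchMode?: 'auto' | 'keyword' | 'semantic' | 'hybrid';
    alpha?: number;
  },
): string {
  const url = new URL('/api/v1/context', baseUrl);
  url.searchParams.set('libraryId', params.libraryId);
  url.searchParams.set('query', params.query);
  // Omitted options are simply not sent, which keeps older servers compatible.
  if (params.searchMode) url.searchParams.set('searchMode', params.searchMode);
  if (params.alpha !== undefined) url.searchParams.set('alpha', String(params.alpha));
  return url.toString();
}
```

Because absent fields produce no query parameters, existing callers that never set `searchMode` or `alpha` see unchanged behavior.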
Explicitly Out of Scope
- semantic reranking for `resolve-library-id`
- automatic library detection from the query text
Acceptance Criteria
- MCP `query-docs` supports the same retrieval mode controls as the API
- MCP stdio and HTTP transports both preserve the new options
- MCP remains backward compatible when the new fields are omitted
Settings and Profile Management
The existing settings page must evolve from a single provider switcher into profile management for the supported provider kinds.
Required UX Changes
- show the default local profile as the initial active profile
- allow enabling/disabling embeddings globally
- allow creating additional custom profiles for supported provider adapters
- allow selecting exactly one default profile
- show provider health and profile test results
- warn when changing the default profile requires re-embedding to preserve semantic quality
Acceptance Criteria
- `/settings` supports profile-based embedding configuration
- users can create an `openai-compatible` custom profile with arbitrary base URL and model
- the local default profile is visible and editable
- switching the default profile triggers a re-embedding workflow or explicit warning state
Indexing and Re-Embedding
Indexing must embed snippets against the default active profile, and profile changes must be operationally explicit.
Required Behavior
- new indexing jobs use the current default profile
- re-indexing stores embeddings under that profile ID
- changing the default profile does not silently reuse embeddings from another profile
- if a profile is changed in a way that invalidates stored embeddings, affected repositories must be marked as needing re-embedding or re-indexing
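The "no silent reuse" rule amounts to a staleness check per embedding row. A minimal sketch, assuming row metadata matches the `snippet_embeddings` columns above (`needsReembedding` is an illustrative name):

```ts
interface StoredEmbeddingMeta {
  profileId: string;
  model: string;
  dimensions: number;
}

interface ActiveProfile {
  id: string;
  model: string;
  dimensions: number;
}

// Hypothetical check: a stored embedding is reusable only if it was produced
// by the active profile with the same model and dimensions.
function needsReembedding(
  stored: StoredEmbeddingMeta | undefined,
  active: ActiveProfile,
): boolean {
  if (!stored) return true; // never embedded under any profile
  return (
    stored.profileId !== active.id ||
    stored.model !== active.model ||
    stored.dimensions !== active.dimensions
  );
}
```

Running this check across a repository's snippets after a default-profile change yields the set that must be marked for re-embedding.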
Acceptance Criteria
- indexing records which profile produced each embedding row
- re-embedding can be triggered after default-profile changes
- no cross-profile embedding reuse occurs
Test Coverage
- migration tests for `embedding_profiles` and `snippet_embeddings`
- unit tests for provider registry resolution
- unit tests for version-scoped vector search
- unit tests for hybrid retrieval with explicit `searchMode`
- API tests covering default local profile behavior on fresh setup
- MCP tests covering `query-docs` semantic and hybrid forwarding
Files to Modify
- `package.json`: install `@xenova/transformers` as a runtime dependency
- `src/lib/server/db/schema.ts`
- `src/lib/server/db/migrations/*`
- `src/lib/server/embeddings/provider.ts`
- `src/lib/server/embeddings/local.provider.ts`
- `src/lib/server/embeddings/openai.provider.ts`
- `src/lib/server/embeddings/factory.ts` or replacement registry module
- `src/lib/server/embeddings/embedding.service.ts`
- `src/lib/server/search/vector.search.ts`
- `src/lib/server/search/hybrid.search.service.ts`
- `src/routes/api/v1/context/+server.ts`
- `src/routes/api/v1/settings/embedding/+server.ts`
- `src/routes/api/v1/settings/embedding/test/+server.ts`
- `src/routes/settings/+page.svelte`
- `src/mcp/client.ts`
- `src/mcp/tools/query-docs.ts`
- `src/mcp/index.ts`
Notes
This ticket intentionally leaves `libs/search` as keyword-only. The caller is expected to identify the target library and, when needed, pass a version-qualified library ID such as `/owner/repo/v1.2.3` before requesting semantic retrieval.
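For illustration, parsing the version-qualified ID form from the note could look like the sketch below. The `/owner/repo/v1.2.3` shape is taken from this ticket; the actual ID grammar in TrueRef may be stricter, and `parseLibraryId` is a hypothetical helper:

```ts
// Sketch: split '/owner/repo' or '/owner/repo/v1.2.3' into its parts.
function parseLibraryId(
  id: string,
): { owner: string; repo: string; version?: string } | null {
  const match = id.match(/^\/([^/]+)\/([^/]+)(?:\/(v[^/]+))?$/);
  if (!match) return null;
  const [, owner, repo, version] = match;
  return version ? { owner, repo, version } : { owner, repo };
}
```

A `null` result would map naturally to a 400-style validation error rather than a silent fallback to unversioned retrieval.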