# TRUEREF-0020 — Embedding Profiles, Default Local Embeddings, and Version-Scoped Semantic Retrieval

**Priority:** P1
**Status:** Pending
**Depends On:** TRUEREF-0007, TRUEREF-0008, TRUEREF-0009, TRUEREF-0010, TRUEREF-0011, TRUEREF-0012, TRUEREF-0014, TRUEREF-0018
**Blocks:** TRUEREF-0019

---

## Overview

TrueRef already has the main ingredients for embeddings and hybrid search, but the current design is still centered on a single hard-coded provider configuration and does not guarantee version-safe semantic retrieval at query time. This feature formalizes the full provider-registry approach and makes semantic retrieval production-ready for both the REST API and MCP surfaces.

The scope is intentionally narrow:

1. Introduce first-class embedding profiles so custom AI providers can be registered without hard-coding provider names throughout the API, UI, and runtime.
2. Enable embeddings by default using the local `@xenova/transformers` model so a fresh install provides semantic retrieval out of the box.
3. Make semantic and hybrid retrieval version-scoped, so a query for a specific library and version only searches snippets indexed for that exact version.
4. Extend the API and MCP `query-docs` path to use the active embedding profile at query time.

Out of scope:

- semantic repository discovery or reranking for `libs/search`
- inferring the repository from the query text
- adding multi-tenant provider isolation

Consumers are expected to pass an exact library or repository identifier and the needed version when they want version-specific retrieval.

---

## Problem Statement

Current semantic search support has four structural gaps:

1. Query-time semantic retrieval is not reliably wired to the configured provider.
2. The embedding configuration shape is fixed to `openai | local | none`, which does not scale to custom provider adapters.
3. Stored embeddings are keyed too narrowly to support multiple profiles or safe provider migration.
4. The vector search path does not enforce version scoping as strongly as the keyword search path.

That leaves TrueRef in a state where embeddings may be generated at indexing time, but retrieval behavior, provider flexibility, and version guarantees are still weaker than required.

---

## Goals

- Make semantic retrieval work by default on a fresh install.
- Keep the default self-hosted path fully local.
- Support custom AI providers through a provider registry plus profile system.
- Keep the API as the source of truth for retrieval behavior.
- Keep MCP as a thin compatibility layer over the API.
- Guarantee version-scoped hybrid retrieval when a versioned library ID is provided.

---

## Non-Goals

- semantic repository search
- automatic repo selection from free-text intent
- remote provider secrets management beyond the current settings persistence model
- support for non-embedding rerankers in this ticket

---

## Default Local Embeddings

Embeddings should be enabled by default with the local model path instead of shipping in FTS-only mode.

### Default Runtime Behavior

- Install `@xenova/transformers` as a normal runtime dependency rather than treating it as optional for the default setup.
- Seed the default embedding profile to the local provider.
- Default model: `Xenova/all-MiniLM-L6-v2`
- Default dimensions: `384`
- New repositories index snippets with embeddings automatically unless the user explicitly disables embeddings.
- Query-time retrieval uses hybrid mode automatically when the active profile is healthy.
- If the local model cannot be loaded, the system should surface a clear startup or settings error instead of silently pretending semantic search is enabled.
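
The seeding rule above can be sketched as a pure function. This is a minimal illustration rather than TrueRef's actual migration code: the `seedDefaultProfile` helper and the `local-default` ID are hypothetical, and the `local-transformers` provider kind is borrowed from the adapter examples in this ticket.

```typescript
// Shape mirrors the EmbeddingProfile interface proposed in this ticket.
interface EmbeddingProfile {
  id: string;
  providerKind: string;
  title: string;
  enabled: boolean;
  isDefault: boolean;
  config: Record<string, unknown>;
  model: string;
  dimensions: number;
  createdAt: number;
  updatedAt: number;
}

// Hypothetical helper: returns the profile list a fresh install should start with.
function seedDefaultProfile(existing: EmbeddingProfile[], now: number): EmbeddingProfile[] {
  // Only seed when no profiles exist, so an upgrade never overwrites user choices.
  if (existing.length > 0) return existing;
  return [
    {
      id: 'local-default', // hypothetical ID
      providerKind: 'local-transformers',
      title: 'Local (Xenova/all-MiniLM-L6-v2)',
      enabled: true,
      isDefault: true,
      config: {},
      model: 'Xenova/all-MiniLM-L6-v2',
      dimensions: 384,
      createdAt: now,
      updatedAt: now
    }
  ];
}
```

Keeping the rule idempotent means it can run on every startup as well as inside a migration.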

### Acceptance Criteria

- [ ] `@xenova/transformers` is installed by default for production/runtime use
- [ ] Fresh installations default to an active local embedding profile
- [ ] No manual provider configuration is required to get semantic search on a clean setup
- [ ] The settings UI shows local embeddings as the default active profile
- [ ] Disabling embeddings remains possible from settings

---

## Embedding Profile Registry

Replace the single enum-style config with a registry-oriented model.

### Core Concepts

#### Provider Adapter

A provider adapter is code registered in the server runtime that knows how to validate config and generate embeddings for one provider kind.

Examples:

- `local-transformers`
- `openai-compatible`
- future custom adapters added in code without redesigning the API contract

#### Embedding Profile

An embedding profile is persisted configuration selecting one provider adapter plus its runtime settings.

```typescript
interface EmbeddingProfile {
  id: string;
  providerKind: string;
  title: string;
  enabled: boolean;
  isDefault: boolean;
  config: Record<string, unknown>;
  model: string;
  dimensions: number;
  createdAt: number;
  updatedAt: number;
}
```

### Registry Responsibilities

- create provider instance from profile
- validate profile config
- expose provider metadata to the settings API and UI
- allow future custom providers without widening TypeScript unions across the app
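
A registry covering these responsibilities might look roughly like the sketch below. All names here (`EmbeddingProviderRegistry`, `ProviderAdapter`, `EmbeddingProvider`, `ProfileLike`) are hypothetical and chosen for illustration; the real module layout and method signatures are left to the implementation.

```typescript
// Only the profile fields the registry needs for this sketch.
interface ProfileLike {
  providerKind: string;
  model: string;
  dimensions: number;
  config: Record<string, unknown>;
}

// Hypothetical contract for a provider instance produced by an adapter.
interface EmbeddingProvider {
  embed(texts: string[]): Promise<number[][]>;
}

// Hypothetical adapter contract: one adapter per provider kind.
interface ProviderAdapter {
  kind: string;
  displayName: string;
  // Returns human-readable problems; an empty array means the config is valid.
  validate(config: Record<string, unknown>): string[];
  create(profile: ProfileLike): EmbeddingProvider;
}

class EmbeddingProviderRegistry {
  private adapters = new Map<string, ProviderAdapter>();

  register(adapter: ProviderAdapter): void {
    if (this.adapters.has(adapter.kind)) {
      throw new Error(`Provider kind already registered: ${adapter.kind}`);
    }
    this.adapters.set(adapter.kind, adapter);
  }

  // Provider metadata for the settings API and UI.
  list(): Array<{ kind: string; displayName: string }> {
    return Array.from(this.adapters.values()).map((adapter) => ({
      kind: adapter.kind,
      displayName: adapter.displayName
    }));
  }

  validate(profile: ProfileLike): string[] {
    const adapter = this.adapters.get(profile.providerKind);
    if (!adapter) return [`Unknown provider kind: ${profile.providerKind}`];
    return adapter.validate(profile.config);
  }

  create(profile: ProfileLike): EmbeddingProvider {
    const errors = this.validate(profile);
    if (errors.length > 0) throw new Error(errors.join('; '));
    // validate() has already confirmed that the adapter exists.
    return this.adapters.get(profile.providerKind)!.create(profile);
  }
}
```

Because `providerKind` is a plain string key, adding an adapter is a single `register` call and no shared union type has to change.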

### Acceptance Criteria

- [ ] Provider selection is no longer hard-coded to `openai | local | none`
- [ ] Providers are instantiated through a registry keyed by `providerKind`
- [ ] Profiles are stored as first-class records rather than a single settings blob
- [ ] One profile can be marked as the default active profile for indexing and retrieval
- [ ] Settings endpoints return profile data and provider metadata cleanly

---

## Data Model Changes

The current `snippet_embeddings` shape is insufficient for multiple profiles because it allows only one embedding row per snippet.

### New Tables / Changes

#### `embedding_profiles`

```typescript
import { integer, sqliteTable, text } from 'drizzle-orm/sqlite-core';

export const embeddingProfiles = sqliteTable('embedding_profiles', {
  id: text('id').primaryKey(),
  providerKind: text('provider_kind').notNull(),
  title: text('title').notNull(),
  enabled: integer('enabled', { mode: 'boolean' }).notNull().default(true),
  isDefault: integer('is_default', { mode: 'boolean' }).notNull().default(false),
  model: text('model').notNull(),
  dimensions: integer('dimensions').notNull(),
  // Provider-specific settings, stored as JSON text.
  config: text('config', { mode: 'json' }).notNull(),
  createdAt: integer('created_at').notNull(),
  updatedAt: integer('updated_at').notNull()
});
```

#### `snippet_embeddings`

Add `profile_id` and replace the single-row-per-snippet constraint with a composite key or unique index on `(snippet_id, profile_id)`.
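
```typescript
import { blob, integer, primaryKey, sqliteTable, text } from 'drizzle-orm/sqlite-core';

export const snippetEmbeddings = sqliteTable(
  'snippet_embeddings',
  {
    snippetId: text('snippet_id').notNull(),
    profileId: text('profile_id').notNull(),
    model: text('model').notNull(),
    dimensions: integer('dimensions').notNull(),
    embedding: blob('embedding').notNull(),
    createdAt: integer('created_at').notNull()
  },
  // One embedding row per (snippet, profile) pair.
  (table) => ({
    pk: primaryKey({ columns: [table.snippetId, table.profileId] })
  })
);
```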
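
Since `embedding` is a blob column, vectors need a byte encoding. The sketch below assumes raw platform-native float32 bytes, which is a common choice but is not confirmed as TrueRef's actual format; `encodeEmbedding` and `decodeEmbedding` are hypothetical helper names. Checking the byte length against the row's `dimensions` value catches rows written by a profile with a different vector size.

```typescript
// Serialize a vector into raw bytes suitable for a BLOB column.
function encodeEmbedding(vector: Float32Array): Uint8Array {
  // slice() copies, so the result does not alias the caller's buffer.
  return new Uint8Array(vector.buffer, vector.byteOffset, vector.byteLength).slice();
}

// Deserialize bytes read from the BLOB column, validating the expected size.
function decodeEmbedding(bytes: Uint8Array, expectedDimensions: number): Float32Array {
  const expectedBytes = expectedDimensions * Float32Array.BYTES_PER_ELEMENT;
  if (bytes.byteLength !== expectedBytes) {
    throw new Error(`Embedding has ${bytes.byteLength} bytes, expected ${expectedBytes}`);
  }
  // Copy first: a Float32Array view needs a 4-byte-aligned offset, and the
  // bytes handed back by a database driver may sit at any offset.
  return new Float32Array(bytes.slice().buffer);
}
```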

### Migration Requirements

- [ ] migration adds `embedding_profiles`
- [ ] migration updates `snippet_embeddings` for profile scoping
- [ ] migration seeds a default local profile using `Xenova/all-MiniLM-L6-v2`
- [ ] migration safely maps existing single-provider configs into one default profile when upgrading
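
The last requirement, mapping an existing single-provider config into one default profile, could be approached as follows. The `LegacyEmbeddingConfig` shape is an assumption made for this sketch (the ticket only states that the provider is `openai | local | none`), and the profile IDs are hypothetical. Treating a previous `none` as a disabled local profile is likewise an assumption about the intended upgrade behavior.

```typescript
// ASSUMPTION: hypothetical legacy settings shape; the real settings blob may differ.
interface LegacyEmbeddingConfig {
  provider: 'openai' | 'local' | 'none';
  openai?: { baseUrl: string; apiKey: string; model: string; dimensions: number };
}

// Subset of the profile record that the migration has to fill in.
interface MigratedProfile {
  id: string;
  providerKind: string;
  title: string;
  enabled: boolean;
  isDefault: boolean;
  model: string;
  dimensions: number;
  config: Record<string, unknown>;
}

function migrateLegacyConfig(legacy: LegacyEmbeddingConfig): MigratedProfile {
  if (legacy.provider === 'openai' && legacy.openai) {
    const { baseUrl, apiKey, model, dimensions } = legacy.openai;
    return {
      id: 'migrated-openai', // hypothetical ID
      providerKind: 'openai-compatible',
      title: 'OpenAI-compatible (migrated)',
      enabled: true,
      isDefault: true,
      model,
      dimensions,
      config: { baseUrl, apiKey }
    };
  }
  // 'local', 'none', and incomplete 'openai' configs all land on the local profile.
  // ASSUMPTION: a previous 'none' stays disabled to respect the explicit opt-out.
  return {
    id: 'local-default', // hypothetical ID
    providerKind: 'local-transformers',
    title: 'Local (Xenova/all-MiniLM-L6-v2)',
    enabled: legacy.provider !== 'none',
    isDefault: true,
    model: 'Xenova/all-MiniLM-L6-v2',
    dimensions: 384,
    config: {}
  };
}
```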

---

## Query-Time Semantic Retrieval

The API must resolve the active embedding profile at request time instead of baking provider selection into startup-only flows.

### API Behavior

`GET /api/v1/context`

- keeps `libraryId`, `query`, `tokens`, and `type`
- adds optional `searchMode=auto|keyword|semantic|hybrid`
- adds optional `alpha` for hybrid blending
- uses the default active embedding profile when `searchMode` is `auto`, `semantic`, or `hybrid`
- falls back to keyword mode only when embeddings are disabled or the caller explicitly requests keyword mode
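
The mode rules above reduce to a small decision function. The sketch below is illustrative only: the helper names are hypothetical, and treating `alpha` as a weight in `[0, 1]` with a default of `0.5` is an assumption, since this ticket does not define its range or default.

```typescript
type SearchMode = 'auto' | 'keyword' | 'semantic' | 'hybrid';
type EffectiveMode = 'keyword' | 'semantic' | 'hybrid';

const SEARCH_MODES: readonly string[] = ['auto', 'keyword', 'semantic', 'hybrid'];

// A missing or empty parameter means the caller did not choose a mode.
function parseSearchMode(raw: string | null): SearchMode {
  if (raw === null || raw === '') return 'auto';
  if (SEARCH_MODES.indexOf(raw) === -1) throw new Error(`Invalid searchMode: ${raw}`);
  return raw as SearchMode;
}

// ASSUMPTION: alpha is a blend weight in [0, 1]; 0.5 is a placeholder default.
function parseAlpha(raw: string | null, fallback = 0.5): number {
  if (raw === null || raw === '') return fallback;
  const alpha = Number(raw);
  if (!isFinite(alpha) || alpha < 0 || alpha > 1) throw new Error(`Invalid alpha: ${raw}`);
  return alpha;
}

// Keyword mode is used only when explicitly requested or when embeddings are disabled.
function resolveEffectiveMode(requested: SearchMode, embeddingsEnabled: boolean): EffectiveMode {
  if (requested === 'keyword' || !embeddingsEnabled) return 'keyword';
  return requested === 'auto' ? 'hybrid' : requested;
}
```

In a route handler, the raw values would typically come from `url.searchParams.get('searchMode')` and `url.searchParams.get('alpha')`, with `embeddingsEnabled` derived from the active profile loaded for that request.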

### Version-Scoped Retrieval Rules

- when `libraryId` includes a version, both FTS and vector retrieval must filter to the resolved `versionId`
- re-fetching snippets after ranking must also preserve `versionId`
- default-branch snippets must not bleed into versioned queries
- one version's embeddings must not be compared against another version's snippets for the same repository
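
The vector-phase rule can be illustrated with an in-memory sketch. A real implementation would push the `versionId` and `profileId` filters into the SQL query rather than filtering rows in application code. The row shape, the function names, and the use of `null` to represent default-branch snippets are assumptions made for this example.

```typescript
interface StoredEmbedding {
  snippetId: string;
  versionId: string | null; // ASSUMPTION: null marks default-branch snippets
  profileId: string;
  vector: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function versionScopedVectorSearch(
  rows: StoredEmbedding[],
  queryVector: number[],
  scope: { versionId: string | null; profileId: string },
  limit: number
): Array<{ snippetId: string; score: number }> {
  return rows
    // Exact-match scoping: other versions, the default branch, and other
    // profiles never reach the similarity computation.
    .filter(
      (row) =>
        row.versionId === scope.versionId &&
        row.profileId === scope.profileId &&
        row.vector.length === queryVector.length
    )
    .map((row) => ({ snippetId: row.snippetId, score: cosineSimilarity(queryVector, row.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}
```

Filtering before scoring, rather than after ranking, is what prevents a high-scoring snippet from another version from displacing in-scope results.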

### Acceptance Criteria

- [ ] `/api/v1/context` loads the active embedding profile at request time
- [ ] hybrid retrieval works without restarting the server after profile changes
- [ ] `searchMode` is supported for context queries
- [ ] versioned `libraryId` queries enforce version filters in both FTS and vector phases
- [ ] JSON responses can include retrieval metadata such as mode, profile ID, model, and alpha

---

## MCP Surface

MCP should stay thin and inherit semantic behavior from the API.

### `query-docs`

Extend the MCP tool schema to support:

- `searchMode?: 'auto' | 'keyword' | 'semantic' | 'hybrid'`
- `alpha?: number`

The MCP server should forward these options directly to `/api/v1/context`.
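
Forwarding can stay a pure URL-building step, which also keeps the tool backward compatible: omitted fields are simply left out of the query string. In the sketch below, `buildContextUrl` and the `QueryDocsArgs` shape are hypothetical; the parameter names follow the `/api/v1/context` parameters listed in this ticket.

```typescript
// ASSUMPTION: hypothetical argument shape for the query-docs tool.
interface QueryDocsArgs {
  libraryId: string;
  query: string;
  tokens?: number;
  searchMode?: 'auto' | 'keyword' | 'semantic' | 'hybrid';
  alpha?: number;
}

function buildContextUrl(baseUrl: string, args: QueryDocsArgs): string {
  const pairs: Array<[string, string]> = [
    ['libraryId', args.libraryId],
    ['query', args.query]
  ];
  // Optional fields are forwarded only when the caller supplied them.
  if (args.tokens !== undefined) pairs.push(['tokens', String(args.tokens)]);
  if (args.searchMode !== undefined) pairs.push(['searchMode', args.searchMode]);
  if (args.alpha !== undefined) pairs.push(['alpha', String(args.alpha)]);

  const queryString = pairs
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join('&');
  // Strip trailing slashes so the path is not doubled.
  return `${baseUrl.replace(/\/+$/, '')}/api/v1/context?${queryString}`;
}
```

Because the MCP layer does no interpretation of `searchMode` or `alpha`, validation and fallback behavior remain defined in one place, the API.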

### Explicitly Out of Scope

- semantic reranking for `resolve-library-id`
- automatic library detection from the query text

### Acceptance Criteria

- [ ] MCP `query-docs` supports the same retrieval mode controls as the API
- [ ] MCP stdio and HTTP transports both preserve the new options
- [ ] MCP remains backward compatible when the new fields are omitted

---

## Settings and Profile Management

The existing settings page must evolve from a single provider switcher into profile management for the supported provider kinds.

### Required UX Changes

- show the default local profile as the initial active profile
- allow enabling/disabling embeddings globally
- allow creating additional custom profiles for supported provider adapters
- allow selecting exactly one default profile
- show provider health and profile test results
- warn when changing the default profile requires re-embedding to preserve semantic quality
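
The "exactly one default" rule is easiest to keep if the switch is a single operation that rewrites the flag on every profile. The helper below is a hypothetical sketch; its `changed` flag is one way the settings API could tell the UI to show the re-embedding warning.

```typescript
interface ProfileFlags {
  id: string;
  isDefault: boolean;
}

// Hypothetical helper: returns a new list in which only `id` is the default.
function setDefaultProfile(
  profiles: ProfileFlags[],
  id: string
): { profiles: ProfileFlags[]; changed: boolean } {
  const target = profiles.filter((profile) => profile.id === id)[0];
  if (target === undefined) throw new Error(`Unknown profile: ${id}`);
  return {
    // Rewriting every row guarantees there is never more than one default.
    profiles: profiles.map((profile) => ({ ...profile, isDefault: profile.id === id })),
    // True when the default actually moved, which is when a warning is relevant.
    changed: !target.isDefault
  };
}
```

In a database-backed implementation the same two steps (clear all flags, set one) would normally run inside one transaction.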

### Acceptance Criteria

- [ ] `/settings` supports profile-based embedding configuration
- [ ] users can create an `openai-compatible` custom profile with arbitrary base URL and model
- [ ] the local default profile is visible and editable
- [ ] switching the default profile triggers a re-embedding workflow or explicit warning state

---

## Indexing and Re-Embedding

Indexing must embed snippets against the default active profile, and profile changes must be operationally explicit.

### Required Behavior

- new indexing jobs use the current default profile
- re-indexing stores embeddings under that profile ID
- changing the default profile does not silently reuse embeddings from another profile
- if a profile is changed in a way that invalidates stored embeddings, affected repositories must be marked as needing re-embedding or re-indexing
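
Two checks follow from these rules: whether an edit to a profile invalidates its stored vectors, and which repositories lack embeddings for the current default profile. The sketch below assumes that `providerKind`, `model`, and `dimensions` are the fields that change the vector space; both function names and the `RepoEmbeddingState` shape are hypothetical.

```typescript
interface ProfileSnapshot {
  id: string;
  providerKind: string;
  model: string;
  dimensions: number;
}

// ASSUMPTION: only these three fields affect the vectors a profile produces.
function invalidatesEmbeddings(before: ProfileSnapshot, after: ProfileSnapshot): boolean {
  return (
    before.providerKind !== after.providerKind ||
    before.model !== after.model ||
    before.dimensions !== after.dimensions
  );
}

// Hypothetical per-repository summary of which profiles have stored embeddings.
interface RepoEmbeddingState {
  repositoryId: string;
  embeddedProfileIds: string[];
}

// Repositories with no embeddings under the default profile need re-embedding;
// embeddings stored under other profiles are never counted as a substitute.
function repositoriesNeedingReembedding(
  repos: RepoEmbeddingState[],
  defaultProfileId: string
): string[] {
  return repos
    .filter((repo) => repo.embeddedProfileIds.indexOf(defaultProfileId) === -1)
    .map((repo) => repo.repositoryId);
}
```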

### Acceptance Criteria

- [ ] indexing records which profile produced each embedding row
- [ ] re-embedding can be triggered after default-profile changes
- [ ] no cross-profile embedding reuse occurs

---

## Test Coverage

- [ ] migration tests for `embedding_profiles` and `snippet_embeddings`
- [ ] unit tests for provider registry resolution
- [ ] unit tests for version-scoped vector search
- [ ] unit tests for hybrid retrieval with explicit `searchMode`
- [ ] API tests covering default local profile behavior on fresh setup
- [ ] MCP tests covering `query-docs` semantic and hybrid forwarding

---

## Files to Modify

- `package.json`: install `@xenova/transformers` as a runtime dependency
- `src/lib/server/db/schema.ts`
- `src/lib/server/db/migrations/*`
- `src/lib/server/embeddings/provider.ts`
- `src/lib/server/embeddings/local.provider.ts`
- `src/lib/server/embeddings/openai.provider.ts`
- `src/lib/server/embeddings/factory.ts` or replacement registry module
- `src/lib/server/embeddings/embedding.service.ts`
- `src/lib/server/search/vector.search.ts`
- `src/lib/server/search/hybrid.search.service.ts`
- `src/routes/api/v1/context/+server.ts`
- `src/routes/api/v1/settings/embedding/+server.ts`
- `src/routes/api/v1/settings/embedding/test/+server.ts`
- `src/routes/settings/+page.svelte`
- `src/mcp/client.ts`
- `src/mcp/tools/query-docs.ts`
- `src/mcp/index.ts`

---

## Notes

This ticket intentionally leaves `libs/search` as keyword-only. The caller is expected to identify the target library and, when needed, pass a version-qualified library ID such as `/owner/repo/v1.2.3` before requesting semantic retrieval.