docs: update docs, add new features

2026-03-25 15:11:01 +01:00
parent 59628dd408
commit b9d52405fa
4 changed files with 376 additions and 19 deletions
--- a/docs/features/TRUEREF-0020.md
+++ b/docs/features/TRUEREF-0020.md
@@ -0,0 +1,320 @@
+# TRUEREF-0020 — Embedding Profiles, Default Local Embeddings, and Version-Scoped Semantic Retrieval
+
+**Priority:** P1
+**Status:** Pending
+**Depends On:** TRUEREF-0007, TRUEREF-0008, TRUEREF-0009, TRUEREF-0010, TRUEREF-0011, TRUEREF-0012, TRUEREF-0014, TRUEREF-0018
+**Blocks:** TRUEREF-0019
+
+---
+
+## Overview
+
+TrueRef already has the main ingredients for embeddings and hybrid search, but the current design is still centered on a single hard-coded provider configuration and does not guarantee version-safe semantic retrieval at query time. This feature formalizes the full provider-registry approach and makes semantic retrieval production-ready for both the REST API and MCP surfaces.
+
+The scope is intentionally narrow:
+
+1. Introduce first-class embedding profiles so custom AI providers can be registered without hard-coding provider names throughout the API, UI, and runtime.
+2. Enable embeddings by default using the local `@xenova/transformers` model so a fresh install provides semantic retrieval out of the box.
+3. Make semantic and hybrid retrieval version-scoped, so a query for a specific library and version only searches snippets indexed for that exact version.
+4. Extend the API and MCP `query-docs` path to use the active embedding profile at query time.
+
+Out of scope:
+
+- semantic repository discovery or reranking for `libs/search`
+- inferring the repository from the query text
+- adding multi-tenant provider isolation
+
+Consumers are expected to pass an exact library or repository identifier, plus the required version when they want version-specific retrieval.
+
+---
+
+## Problem Statement
+
+Current semantic search support has four structural gaps:
+
+1. Query-time semantic retrieval is not reliably wired to the configured provider.
+2. The embedding configuration shape is fixed to `openai | local | none`, which does not scale to custom provider adapters.
+3. Stored embeddings are keyed too narrowly to support multiple profiles or safe provider migration.
+4. The vector search path does not enforce version scoping as strictly as the keyword search path.
+
+That leaves TrueRef in a state where embeddings may be generated at indexing time, but retrieval behavior, provider flexibility, and version guarantees are still weaker than required.
+
+---
+
+## Goals
+
+- Make semantic retrieval work by default on a fresh install.
+- Keep the default self-hosted path fully local.
+- Support custom AI providers through a provider registry plus profile system.
+- Keep the API as the source of truth for retrieval behavior.
+- Keep MCP as a thin compatibility layer over the API.
+- Guarantee version-scoped hybrid retrieval when a versioned library ID is provided.
+
+---
+
+## Non-Goals
+
+- semantic repository search
+- automatic repo selection from free-text intent
+- remote provider secrets management beyond the current settings persistence model
+- support for non-embedding rerankers in this ticket
+
+---
+
+## Default Local Embeddings
+
+Embeddings should be enabled by default using the local model, rather than shipping in full-text-search-only (FTS-only) mode.
+
+### Default Runtime Behavior
+
+- Install `@xenova/transformers` as a normal runtime dependency rather than treating it as optional for the default setup.
+- Seed the default embedding profile to the local provider.
+- Default model: `Xenova/all-MiniLM-L6-v2`
+- Default dimensions: `384`
+- New repositories index snippets with embeddings automatically unless the user explicitly disables embeddings.
+- Query-time retrieval uses hybrid mode automatically when the active profile is healthy.
+- If the local model cannot be loaded, the system should surface a clear startup or settings error instead of silently pretending semantic search is enabled.
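+
+As a rough illustration of the defaults listed above, the seeded profile and a load check could look like the sketch below. Only the provider kind, model name, and dimensions come from this ticket; the `id`, `title`, empty `config`, and the `checkProfileHealth` helper are hypothetical names chosen for the example, and the `embed` callback stands in for whatever the local provider actually exposes.
+
+```typescript
+// Hypothetical seed record for the default local profile.
+// Only providerKind, model, and dimensions are taken from this ticket.
+interface SeedProfile {
+  id: string;
+  providerKind: string;
+  title: string;
+  enabled: boolean;
+  isDefault: boolean;
+  config: Record<string, unknown>;
+  model: string;
+  dimensions: number;
+}
+
+const DEFAULT_LOCAL_PROFILE: SeedProfile = {
+  id: 'local-default', // assumed identifier
+  providerKind: 'local-transformers',
+  title: 'Local embeddings', // assumed display title
+  enabled: true,
+  isDefault: true,
+  config: {},
+  model: 'Xenova/all-MiniLM-L6-v2',
+  dimensions: 384
+};
+
+// Report a load or shape failure explicitly instead of silently
+// behaving as if semantic search were available.
+async function checkProfileHealth(
+  profile: SeedProfile,
+  embed: (text: string) => Promise<number[]>
+): Promise<{ healthy: boolean; error?: string }> {
+  try {
+    const vector = await embed('health check');
+    if (vector.length !== profile.dimensions) {
+      return {
+        healthy: false,
+        error: `expected ${profile.dimensions} dimensions, got ${vector.length}`
+      };
+    }
+    return { healthy: true };
+  } catch (err) {
+    return { healthy: false, error: err instanceof Error ? err.message : String(err) };
+  }
+}
+```
+
+A startup hook or the settings test endpoint could call `checkProfileHealth` and show the returned `error` to the user when `healthy` is `false`.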
+
+### Acceptance Criteria
+
+- [ ] `@xenova/transformers` is installed by default for production/runtime use
+- [ ] Fresh installations default to an active local embedding profile
+- [ ] No manual provider configuration is required to get semantic search on a clean setup
+- [ ] The settings UI shows local embeddings as the default active profile
+- [ ] Disabling embeddings remains possible from settings
+
+---
+
+## Embedding Profile Registry
+
+Replace the single enum-style config with a registry-oriented model.
+
+### Core Concepts
+
+#### Provider Adapter
+
+A provider adapter is code registered in the server runtime that knows how to validate config and generate embeddings for one provider kind.
+
+Examples:
+
+- `local-transformers`
+- `openai-compatible`
+- future custom adapters added in code without redesigning the API contract
+
+#### Embedding Profile
+
+An embedding profile is a persisted configuration record that selects one provider adapter and holds its runtime settings.
+
+```typescript
+interface EmbeddingProfile {
+  id: string;
+  providerKind: string;
+  title: string;
+  enabled: boolean;
+  isDefault: boolean;
+  config: Record<string, unknown>;
+  model: string;
+  dimensions: number;
+  createdAt: number;
+  updatedAt: number;
+}
+```
+
+### Registry Responsibilities
+
+- create provider instance from profile
+- validate profile config
+- expose provider metadata to the settings API and UI
+- allow future custom providers without widening TypeScript unions across the app
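+
+The responsibilities above can be sketched as a small registry keyed by `providerKind`. This is a minimal sketch, not the intended implementation: `ProviderAdapter`, `EmbeddingProvider`, `ProfileLike`, and `ProviderRegistry` are hypothetical names, and the real adapter contract may carry more metadata.
+
+```typescript
+// Illustrative types only; none of these names are defined by this ticket.
+interface EmbeddingProvider {
+  embed(texts: string[]): Promise<number[][]>;
+}
+
+interface ProfileLike {
+  id: string;
+  providerKind: string;
+  model: string;
+  dimensions: number;
+  config: Record<string, unknown>;
+}
+
+interface ProviderAdapter {
+  kind: string;
+  // Returns human-readable problems; an empty list means the config is valid.
+  validate(config: Record<string, unknown>): string[];
+  create(profile: ProfileLike): EmbeddingProvider;
+}
+
+class ProviderRegistry {
+  private readonly adapters = new Map<string, ProviderAdapter>();
+
+  register(adapter: ProviderAdapter): void {
+    if (this.adapters.has(adapter.kind)) {
+      throw new Error(`Provider kind already registered: ${adapter.kind}`);
+    }
+    this.adapters.set(adapter.kind, adapter);
+  }
+
+  // Registered kinds, for the settings API and UI to list.
+  kinds(): string[] {
+    return Array.from(this.adapters.keys());
+  }
+
+  // Validates the profile config, then builds a provider instance from it.
+  createProvider(profile: ProfileLike): EmbeddingProvider {
+    const adapter = this.adapters.get(profile.providerKind);
+    if (!adapter) {
+      throw new Error(`Unknown provider kind: ${profile.providerKind}`);
+    }
+    const problems = adapter.validate(profile.config);
+    if (problems.length > 0) {
+      throw new Error(`Invalid config for profile ${profile.id}: ${problems.join('; ')}`);
+    }
+    return adapter.create(profile);
+  }
+}
+```
+
+Because `providerKind` is a plain string looked up in a map, adding a custom adapter means one `register` call rather than widening a union type across the API, UI, and runtime.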
+
+### Acceptance Criteria
+
+- [ ] Provider selection is no longer hard-coded to `openai | local | none`
+- [ ] Providers are instantiated through a registry keyed by `providerKind`
+- [ ] Profiles are stored as first-class records rather than a single settings blob
+- [ ] One profile can be marked as the default active profile for indexing and retrieval
+- [ ] Settings endpoints return profile data and provider metadata cleanly
+
+---
+
+## Data Model Changes
+
+The current `snippet_embeddings` shape is insufficient for multiple profiles because it allows only one embedding row per snippet.
+
+### New Tables / Changes
+
+#### `embedding_profiles`
+
+```typescript
+// Assumes Drizzle's SQLite schema builders, which the column helpers below imply.
+import { sqliteTable, text, integer } from 'drizzle-orm/sqlite-core';
+
+export const embeddingProfiles = sqliteTable('embedding_profiles', {
+  id: text('id').primaryKey(),
+  providerKind: text('provider_kind').notNull(),
+  title: text('title').notNull(),
+  enabled: integer('enabled', { mode: 'boolean' }).notNull().default(true),
+  isDefault: integer('is_default', { mode: 'boolean' }).notNull().default(false),
+  model: text('model').notNull(),
+  dimensions: integer('dimensions').notNull(),
+  config: text('config', { mode: 'json' }).notNull(),
+  createdAt: integer('created_at').notNull(),
+  updatedAt: integer('updated_at').notNull()
+});
+```
+
+#### `snippet_embeddings`
+
+Add `profile_id` and replace the single-row-per-snippet constraint with a composite key or unique index on `(snippet_id, profile_id)`.
+
+```typescript
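+// Assumes Drizzle's SQLite schema builders, which the column helpers below imply.
+import { sqliteTable, text, integer, blob, primaryKey } from 'drizzle-orm/sqlite-core';
+
+export const snippetEmbeddings = sqliteTable(
+  'snippet_embeddings',
+  {
+    snippetId: text('snippet_id').notNull(),
+    profileId: text('profile_id').notNull(),
+    model: text('model').notNull(),
+    dimensions: integer('dimensions').notNull(),
+    embedding: blob('embedding').notNull(),
+    createdAt: integer('created_at').notNull()
+  },
+  // Composite-key variant: one embedding row per (snippet, profile) pair.
+  (table) => ({ pk: primaryKey({ columns: [table.snippetId, table.profileId] }) })
+);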
+```
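+
+This ticket does not specify how vectors are serialized into the `embedding` blob. One plausible encoding, shown here purely as an assumption, is raw 32-bit floats, with the stored `dimensions` value used as a sanity check on both write and read. `encodeEmbedding` and `decodeEmbedding` are hypothetical helper names.
+
+```typescript
+// Assumed blob layout: consecutive 32-bit floats in platform byte order.
+function encodeEmbedding(vector: number[], dimensions: number): Uint8Array {
+  if (vector.length !== dimensions) {
+    throw new Error(`expected ${dimensions} values, got ${vector.length}`);
+  }
+  const floats = Float32Array.from(vector);
+  return new Uint8Array(floats.buffer);
+}
+
+function decodeEmbedding(bytes: Uint8Array, dimensions: number): Float32Array {
+  if (bytes.byteLength !== dimensions * 4) {
+    throw new Error(`expected ${dimensions * 4} bytes, got ${bytes.byteLength}`);
+  }
+  // Copy first so the float view starts at offset 0 of its own buffer,
+  // which avoids alignment errors when `bytes` is a view into a larger buffer.
+  const copy = bytes.slice();
+  return new Float32Array(copy.buffer);
+}
+```
+
+Checking the byte length against `dimensions` on read is a cheap guard against comparing vectors that were produced by a different model.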
+
+### Migration Requirements
+
+- [ ] migration adds `embedding_profiles`
+- [ ] migration updates `snippet_embeddings` for profile scoping
+- [ ] migration seeds a default local profile using `Xenova/all-MiniLM-L6-v2`
+- [ ] migration safely maps existing single-provider configs into one default profile when upgrading
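+
+The upgrade mapping in the last requirement might look roughly like the sketch below. The `LegacyEmbeddingConfig` shape is a guess based on the existing `openai | local | none` switch and may not match what is actually persisted. Seeding a disabled local profile for `none` is one possible way to respect an earlier opt-out, not a decision made by this ticket. Credentials are left out of the sketch entirely.
+
+```typescript
+// Hypothetical pre-upgrade settings shape; the real stored value may differ.
+type LegacyEmbeddingConfig =
+  | { provider: 'none' }
+  | { provider: 'local' }
+  | { provider: 'openai'; baseUrl: string; model: string; dimensions: number };
+
+interface MigratedProfile {
+  providerKind: string;
+  enabled: boolean;
+  isDefault: boolean;
+  model: string;
+  dimensions: number;
+  config: Record<string, unknown>;
+}
+
+const LOCAL_MODEL = 'Xenova/all-MiniLM-L6-v2';
+const LOCAL_DIMENSIONS = 384;
+
+function localProfile(enabled: boolean): MigratedProfile {
+  return {
+    providerKind: 'local-transformers',
+    enabled,
+    isDefault: true,
+    model: LOCAL_MODEL,
+    dimensions: LOCAL_DIMENSIONS,
+    config: {}
+  };
+}
+
+function migrateLegacyConfig(legacy: LegacyEmbeddingConfig): MigratedProfile {
+  switch (legacy.provider) {
+    case 'openai':
+      return {
+        providerKind: 'openai-compatible',
+        enabled: true,
+        isDefault: true,
+        model: legacy.model,
+        dimensions: legacy.dimensions,
+        config: { baseUrl: legacy.baseUrl }
+      };
+    case 'local':
+      return localProfile(true);
+    case 'none':
+      // Assumption: keep the earlier opt-out by seeding the profile as disabled.
+      return localProfile(false);
+  }
+}
+```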
+
+---
+
+## Query-Time Semantic Retrieval
+
+The API must resolve the active embedding profile at request time instead of baking provider selection into startup-only flows.
+
+### API Behavior
+
+`GET /api/v1/context`
+
+- keeps `libraryId`, `query`, `tokens`, and `type`
+- adds optional `searchMode=auto|keyword|semantic|hybrid`
+- adds optional `alpha` for hybrid blending
+- uses the default active embedding profile when `searchMode` is `auto`, `semantic`, or `hybrid`
+- falls back to keyword mode only when embeddings are disabled or the caller explicitly requests keyword mode
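+
+The parameter handling above could be sketched as follows. The resolution rule follows this ticket: `auto` becomes hybrid when embeddings are enabled, and keyword mode is used when embeddings are disabled or explicitly requested. The `[0, 1]` range for `alpha` and the blend formula, in which `alpha` weights the vector score, are assumptions, since the ticket does not define them. All function names are hypothetical.
+
+```typescript
+type SearchMode = 'auto' | 'keyword' | 'semantic' | 'hybrid';
+type EffectiveMode = 'keyword' | 'semantic' | 'hybrid';
+
+// Parses the optional `searchMode` query parameter; a missing value means `auto`.
+function parseSearchMode(raw: string | null): SearchMode {
+  if (raw === null || raw === '') return 'auto';
+  if (raw === 'auto' || raw === 'keyword' || raw === 'semantic' || raw === 'hybrid') {
+    return raw;
+  }
+  throw new Error(`Unsupported searchMode: ${raw}`);
+}
+
+// Keyword mode is used only when embeddings are off or keyword is requested.
+function resolveEffectiveMode(requested: SearchMode, embeddingsEnabled: boolean): EffectiveMode {
+  if (requested === 'keyword' || !embeddingsEnabled) return 'keyword';
+  return requested === 'auto' ? 'hybrid' : requested;
+}
+
+// Assumed range: 0 (keyword only) to 1 (vector only). Omitted means "use the server default".
+function parseAlpha(raw: string | null): number | undefined {
+  if (raw === null || raw === '') return undefined;
+  const alpha = Number(raw);
+  if (!Number.isFinite(alpha) || alpha < 0 || alpha > 1) {
+    throw new Error(`alpha must be between 0 and 1, got: ${raw}`);
+  }
+  return alpha;
+}
+
+// Assumed blend: linear interpolation between normalized keyword and vector scores.
+function blendScores(keywordScore: number, vectorScore: number, alpha: number): number {
+  return alpha * vectorScore + (1 - alpha) * keywordScore;
+}
+```
+
+In a route handler these would typically be fed from `url.searchParams.get('searchMode')` and `url.searchParams.get('alpha')`, with a thrown error translated into a 400 response.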
+
+### Version-Scoped Retrieval Rules
+
+- when `libraryId` includes a version, both FTS and vector retrieval must filter to the resolved `versionId`
+- re-fetching snippets after ranking must also preserve `versionId`
+- default-branch snippets must not bleed into versioned queries
+- one version's embeddings must not be compared against another version's snippets for the same repository
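+
+The rules above amount to filtering candidates by `versionId` (and by profile) before any similarity is computed. The in-memory sketch below illustrates that ordering. The real implementation would apply the same filter in SQL, and `StoredEmbedding`, `cosineSimilarity`, and `versionScopedVectorSearch` are hypothetical names. Representing default-branch snippets with a `null` version is also an assumption.
+
+```typescript
+interface StoredEmbedding {
+  snippetId: string;
+  versionId: string | null; // null marks default-branch snippets in this sketch
+  profileId: string;
+  vector: number[];
+}
+
+function cosineSimilarity(a: number[], b: number[]): number {
+  if (a.length !== b.length) {
+    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
+  }
+  let dot = 0;
+  let normA = 0;
+  let normB = 0;
+  for (let i = 0; i < a.length; i++) {
+    dot += a[i] * b[i];
+    normA += a[i] * a[i];
+    normB += b[i] * b[i];
+  }
+  if (normA === 0 || normB === 0) return 0;
+  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
+}
+
+// Filter first, rank second: rows from other versions or profiles are never scored.
+function versionScopedVectorSearch(
+  rows: StoredEmbedding[],
+  queryVector: number[],
+  versionId: string | null,
+  profileId: string,
+  limit: number
+): { snippetId: string; score: number }[] {
+  return rows
+    .filter((row) => row.versionId === versionId && row.profileId === profileId)
+    .map((row) => ({ snippetId: row.snippetId, score: cosineSimilarity(queryVector, row.vector) }))
+    .sort((x, y) => y.score - x.score)
+    .slice(0, limit);
+}
+```
+
+Filtering before ranking matters: a post-ranking filter could fill the top results with snippets from another version and return too few matches for the requested one.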
+
+### Acceptance Criteria
+
+- [ ] `/api/v1/context` loads the active embedding profile at request time
+- [ ] hybrid retrieval works without restarting the server after profile changes
+- [ ] `searchMode` is supported for context queries
+- [ ] versioned `libraryId` queries enforce version filters in both FTS and vector phases
+- [ ] JSON responses can include retrieval metadata such as mode, profile ID, model, and alpha
+
+---
+
+## MCP Surface
+
+MCP should stay thin and inherit semantic behavior from the API.
+
+### `query-docs`
+
+Extend the MCP tool schema to support:
+
+- `searchMode?: 'auto' | 'keyword' | 'semantic' | 'hybrid'`
+- `alpha?: number`
+
+The MCP server should forward these options directly to `/api/v1/context`.
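+
+A minimal sketch of that forwarding step, assuming the MCP tool arguments use the same names as the API query parameters. `QueryDocsArgs` and `buildContextUrl` are hypothetical names, and the base URL is supplied by the caller.
+
+```typescript
+interface QueryDocsArgs {
+  libraryId: string;
+  query: string;
+  tokens?: number;
+  searchMode?: 'auto' | 'keyword' | 'semantic' | 'hybrid';
+  alpha?: number;
+}
+
+// Builds the /api/v1/context URL. Optional fields are only added when present,
+// so requests from older clients look exactly as they did before.
+function buildContextUrl(baseUrl: string, args: QueryDocsArgs): string {
+  const url = new URL('/api/v1/context', baseUrl);
+  url.searchParams.set('libraryId', args.libraryId);
+  url.searchParams.set('query', args.query);
+  if (args.tokens !== undefined) {
+    url.searchParams.set('tokens', String(args.tokens));
+  }
+  if (args.searchMode !== undefined) {
+    url.searchParams.set('searchMode', args.searchMode);
+  }
+  if (args.alpha !== undefined) {
+    url.searchParams.set('alpha', String(args.alpha));
+  }
+  return url.toString();
+}
+```
+
+Leaving validation of `searchMode` and `alpha` to the API keeps the MCP layer thin and avoids two copies of the same rules.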
+
+### Explicitly Out of Scope
+
+- semantic reranking for `resolve-library-id`
+- automatic library detection from the query text
+
+### Acceptance Criteria
+
+- [ ] MCP `query-docs` supports the same retrieval mode controls as the API
+- [ ] MCP stdio and HTTP transports both preserve the new options
+- [ ] MCP remains backward compatible when the new fields are omitted
+
+---
+
+## Settings and Profile Management
+
+The existing settings page must evolve from a single provider switcher into profile management for the supported provider kinds.
+
+### Required UX Changes
+
+- show the default local profile as the initial active profile
+- allow enabling/disabling embeddings globally
+- allow creating additional custom profiles for supported provider adapters
+- allow selecting exactly one default profile
+- show provider health and profile test results
+- warn when changing the default profile requires re-embedding to preserve semantic quality
+
+### Acceptance Criteria
+
+- [ ] `/settings` supports profile-based embedding configuration
+- [ ] users can create an `openai-compatible` custom profile with arbitrary base URL and model
+- [ ] the local default profile is visible and editable
+- [ ] switching the default profile triggers a re-embedding workflow or explicit warning state
+
+---
+
+## Indexing and Re-Embedding
+
+Indexing must embed snippets against the default active profile, and profile changes must be operationally explicit.
+
+### Required Behavior
+
+- new indexing jobs use the current default profile
+- re-indexing stores embeddings under that profile ID
+- changing the default profile does not silently reuse embeddings from another profile
+- if a profile is changed in a way that invalidates stored embeddings, affected repositories must be marked as needing re-embedding or re-indexing
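+
+One way to make the "no silent reuse" rule concrete is to treat an embedding row as reusable only when its profile ID, model, and dimensions all match the current default profile. The sketch below assumes that rule. `EmbeddingRowMeta`, `ActiveProfile`, `isReusable`, and `snippetsNeedingReembedding` are hypothetical names.
+
+```typescript
+interface EmbeddingRowMeta {
+  snippetId: string;
+  profileId: string;
+  model: string;
+  dimensions: number;
+}
+
+interface ActiveProfile {
+  id: string;
+  model: string;
+  dimensions: number;
+}
+
+// A row is reusable only if it was produced by this exact profile configuration.
+function isReusable(row: EmbeddingRowMeta, profile: ActiveProfile): boolean {
+  return (
+    row.profileId === profile.id &&
+    row.model === profile.model &&
+    row.dimensions === profile.dimensions
+  );
+}
+
+// Returns the snippets that have no reusable embedding under the active profile.
+function snippetsNeedingReembedding(
+  snippetIds: string[],
+  rows: EmbeddingRowMeta[],
+  profile: ActiveProfile
+): string[] {
+  const fresh = new Set(rows.filter((row) => isReusable(row, profile)).map((row) => row.snippetId));
+  return snippetIds.filter((id) => !fresh.has(id));
+}
+```
+
+A repository could be flagged as needing re-embedding whenever this list is non-empty for its snippets. Editing a profile's model in place is caught by the `model` and `dimensions` comparison even though the profile ID stays the same.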
+
+### Acceptance Criteria
+
+- [ ] indexing records which profile produced each embedding row
+- [ ] re-embedding can be triggered after default-profile changes
+- [ ] no cross-profile embedding reuse occurs
+
+---
+
+## Test Coverage
+
+- [ ] migration tests for `embedding_profiles` and `snippet_embeddings`
+- [ ] unit tests for provider registry resolution
+- [ ] unit tests for version-scoped vector search
+- [ ] unit tests for hybrid retrieval with explicit `searchMode`
+- [ ] API tests covering default local profile behavior on fresh setup
+- [ ] MCP tests covering `query-docs` semantic and hybrid forwarding
+
+---
+
+## Files to Modify
+
+- `package.json` — install `@xenova/transformers` as a runtime dependency
+- `src/lib/server/db/schema.ts`
+- `src/lib/server/db/migrations/*`
+- `src/lib/server/embeddings/provider.ts`
+- `src/lib/server/embeddings/local.provider.ts`
+- `src/lib/server/embeddings/openai.provider.ts`
+- `src/lib/server/embeddings/factory.ts` or replacement registry module
+- `src/lib/server/embeddings/embedding.service.ts`
+- `src/lib/server/search/vector.search.ts`
+- `src/lib/server/search/hybrid.search.service.ts`
+- `src/routes/api/v1/context/+server.ts`
+- `src/routes/api/v1/settings/embedding/+server.ts`
+- `src/routes/api/v1/settings/embedding/test/+server.ts`
+- `src/routes/settings/+page.svelte`
+- `src/mcp/client.ts`
+- `src/mcp/tools/query-docs.ts`
+- `src/mcp/index.ts`
+
+---
+
+## Notes
+
+This ticket intentionally leaves `libs/search` as keyword-only. The caller is expected to identify the target library and, when needed, pass a version-qualified library ID such as `/owner/repo/v1.2.3` before requesting semantic retrieval.
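+
+For illustration, a version-qualified ID of that form could be split as sketched below. This assumes the first two path segments identify the repository and anything after them is the version. `parseLibraryId` and `ParsedLibraryId` are hypothetical names, and the real resolver may apply additional rules.
+
+```typescript
+interface ParsedLibraryId {
+  repositoryId: string;
+  version: string | null;
+}
+
+// Splits "/owner/repo" or "/owner/repo/<version>" into its parts.
+function parseLibraryId(libraryId: string): ParsedLibraryId {
+  const parts = libraryId.split('/').filter((part) => part.length > 0);
+  if (libraryId[0] !== '/' || parts.length < 2) {
+    throw new Error(`Invalid library ID: ${libraryId}`);
+  }
+  const [owner, repo, ...rest] = parts;
+  return {
+    repositoryId: `/${owner}/${repo}`,
+    version: rest.length > 0 ? rest.join('/') : null
+  };
+}
+```
+
+Under these assumptions a `null` version means the query targets default-branch snippets. Any other value must be resolved to a `versionId` before either the FTS or the vector phase runs.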