Giancarmine Salucci 5a3c27224d chore(FEEDBACK-0001): linting

2026-03-27 02:23:01 +01:00

TRUEREF-0020 — Embedding Profiles, Default Local Embeddings, and Version-Scoped Semantic Retrieval

Priority: P1
Status: Pending
Depends On: TRUEREF-0007, TRUEREF-0008, TRUEREF-0009, TRUEREF-0010, TRUEREF-0011, TRUEREF-0012, TRUEREF-0014, TRUEREF-0018
Blocks: TRUEREF-0019

Overview

TrueRef already has the main ingredients for embeddings and hybrid search, but the current design is still centered on a single hard-coded provider configuration and does not guarantee version-safe semantic retrieval at query time. This feature formalizes the full provider-registry approach and makes semantic retrieval production-ready for both the REST API and MCP surfaces.

The scope is intentionally narrow:

Introduce first-class embedding profiles so custom AI providers can be registered without hard-coding provider names throughout the API, UI, and runtime.
Enable embeddings by default using the local @xenova/transformers model so a fresh install provides semantic retrieval out of the box.
Make semantic and hybrid retrieval version-scoped, so a query for a specific library and version only searches snippets indexed for that exact version.
Extend the API and MCP query-docs path to use the active embedding profile at query time.

Out of scope:

semantic repository discovery or reranking for libs/search
inferring the repository from the query text
adding multi-tenant provider isolation

Consumers are expected to pass an exact library or repository identifier and the needed version when they want version-specific retrieval.

Problem Statement

Current semantic search support has four structural gaps:

Query-time semantic retrieval is not reliably wired to the configured provider.
The embedding configuration shape is fixed to openai | local | none, which does not scale to custom provider adapters.
Stored embeddings are keyed too narrowly to support multiple profiles or safe provider migration.
The vector search path does not enforce version scoping as strongly as the keyword search path.

That leaves TrueRef in a state where embeddings may be generated at indexing time, but retrieval behavior, provider flexibility, and version guarantees are still weaker than required.

Goals

Make semantic retrieval work by default on a fresh install.
Keep the default self-hosted path fully local.
Support custom AI providers through a provider registry plus profile system.
Keep the API as the source of truth for retrieval behavior.
Keep MCP as a thin compatibility layer over the API.
Guarantee version-scoped hybrid retrieval when a versioned library ID is provided.

Non-Goals

semantic repository search
automatic repo selection from free-text intent
remote provider secrets management beyond the current settings persistence model
support for non-embedding rerankers in this ticket

Default Local Embeddings

Embeddings should be enabled by default with the local model path instead of shipping in FTS-only mode.

Default Runtime Behavior

Install @xenova/transformers as a normal runtime dependency rather than treating it as optional for the default setup.
Seed the default embedding profile to the local provider.
Default model: Xenova/all-MiniLM-L6-v2
Default dimensions: 384
New repositories index snippets with embeddings automatically unless the user explicitly disables embeddings.
Query-time retrieval uses hybrid mode automatically when the active profile is healthy.
If the local model cannot be loaded, the system should surface a clear startup or settings error instead of silently pretending semantic search is enabled.
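
The failure-handling rule above could be enforced with a small probe that runs at startup or from the settings page. This is a minimal sketch under stated assumptions: `probeEmbedder`, `EmbedFn`, and `ProfileHealth` are hypothetical names, and the model loader is injected so the example does not depend on how @xenova/transformers is actually wired into TrueRef.

```typescript
// Hypothetical types: none of these names come from the TrueRef codebase.
type EmbedFn = (text: string) => Promise<number[]>;

type ProfileHealth =
	| { ok: true; dimensions: number }
	| { ok: false; reason: string };

// Loads the model, embeds a short probe string, and checks the vector size.
// Any failure is returned as an explicit reason instead of being swallowed.
async function probeEmbedder(
	load: () => Promise<EmbedFn>,
	expectedDimensions: number
): Promise<ProfileHealth> {
	try {
		const embed = await load();
		const vector = await embed('health check');
		if (vector.length !== expectedDimensions) {
			return {
				ok: false,
				reason: `expected ${expectedDimensions} dimensions, got ${vector.length}`
			};
		}
		return { ok: true, dimensions: vector.length };
	} catch (error) {
		const message = error instanceof Error ? error.message : String(error);
		return { ok: false, reason: `embedding probe failed: ${message}` };
	}
}
```

A caller could refuse to report semantic search as enabled whenever the result has `ok: false`, and show `reason` in the settings UI.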

Acceptance Criteria

@xenova/transformers is installed by default for production/runtime use
Fresh installations default to an active local embedding profile
No manual provider configuration is required to get semantic search on a clean setup
The settings UI shows local embeddings as the default active profile
Disabling embeddings remains possible from settings

Embedding Profile Registry

Replace the single enum-style config with a registry-oriented model.

Core Concepts

Provider Adapter

A provider adapter is code registered in the server runtime that knows how to validate config and generate embeddings for one provider kind.

Examples:

local-transformers
openai-compatible
future custom adapters added in code without redesigning the API contract

Embedding Profile

An embedding profile is persisted configuration selecting one provider adapter plus its runtime settings.

interface EmbeddingProfile {
	id: string;
	providerKind: string;
	title: string;
	enabled: boolean;
	isDefault: boolean;
	config: Record<string, unknown>;
	model: string;
	dimensions: number;
	createdAt: number;
	updatedAt: number;
}

Registry Responsibilities

create provider instance from profile
validate profile config
expose provider metadata to the settings API and UI
allow future custom providers without widening TypeScript unions across the app
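
The responsibilities listed above might look roughly like the following registry. It is a sketch, not the planned implementation: `ProviderAdapter`, `ProviderRegistry`, `ProfileLike`, and their method names are assumptions introduced for illustration. The point it shows is that `providerKind` stays a plain string key, so adding an adapter does not widen any union type.

```typescript
// Hypothetical sketch: names and shapes are illustrative, not TrueRef's actual API.
interface EmbeddingProvider {
	embed(texts: string[]): Promise<number[][]>;
}

interface ProfileLike {
	id: string;
	providerKind: string;
	model: string;
	dimensions: number;
	config: Record<string, unknown>;
}

interface ProviderAdapter {
	kind: string;
	title: string;
	// Returns human-readable problems; an empty list means the config is valid.
	validate(config: Record<string, unknown>): string[];
	create(profile: ProfileLike): EmbeddingProvider;
}

class ProviderRegistry {
	private adapters = new Map<string, ProviderAdapter>();

	register(adapter: ProviderAdapter): void {
		this.adapters.set(adapter.kind, adapter);
	}

	// Provider metadata for the settings API and UI.
	list(): Array<{ kind: string; title: string }> {
		return Array.from(this.adapters.values()).map(({ kind, title }) => ({ kind, title }));
	}

	// Validates the profile config, then builds a provider instance from it.
	createProvider(profile: ProfileLike): EmbeddingProvider {
		const adapter = this.adapters.get(profile.providerKind);
		if (!adapter) throw new Error(`Unknown provider kind: ${profile.providerKind}`);
		const problems = adapter.validate(profile.config);
		if (problems.length > 0) {
			throw new Error(`Invalid config for profile ${profile.id}: ${problems.join('; ')}`);
		}
		return adapter.create(profile);
	}
}
```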

Acceptance Criteria

Provider selection is no longer hard-coded to openai | local | none
Providers are instantiated through a registry keyed by providerKind
Profiles are stored as first-class records rather than a single settings blob
One profile can be marked as the default active profile for indexing and retrieval
Settings endpoints return profile data and provider metadata cleanly

Data Model Changes

The current snippet_embeddings shape is insufficient for multiple profiles because it allows only one embedding row per snippet.

New Tables / Changes

`embedding_profiles`

import { integer, sqliteTable, text } from 'drizzle-orm/sqlite-core';

export const embeddingProfiles = sqliteTable('embedding_profiles', {
  id: text('id').primaryKey(),
  providerKind: text('provider_kind').notNull(),
  title: text('title').notNull(),
  enabled: integer('enabled', { mode: 'boolean' }).notNull().default(true),
  isDefault: integer('is_default', { mode: 'boolean' }).notNull().default(false),
  model: text('model').notNull(),
  dimensions: integer('dimensions').notNull(),
  config: text('config', { mode: 'json' }).notNull(),
  createdAt: integer('created_at').notNull(),
  updatedAt: integer('updated_at').notNull(),
});

`snippet_embeddings`

Add profile_id and replace the single-row-per-snippet constraint with a composite key or unique index on (snippet_id, profile_id).

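import { blob, integer, primaryKey, sqliteTable, text } from 'drizzle-orm/sqlite-core';

export const snippetEmbeddings = sqliteTable(
  'snippet_embeddings',
  {
    snippetId: text('snippet_id').notNull(),
    profileId: text('profile_id').notNull(),
    model: text('model').notNull(),
    dimensions: integer('dimensions').notNull(),
    embedding: blob('embedding').notNull(),
    createdAt: integer('created_at').notNull(),
  },
  // Composite key: one embedding row per (snippet, profile) pair.
  (table) => ({ pk: primaryKey({ columns: [table.snippetId, table.profileId] }) })
);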
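
The ticket does not say how vectors are serialized into the `embedding` blob column. One plausible encoding, shown here purely as an assumption, is raw little-endian float32 bytes, with the row's `dimensions` value used to validate the payload on read. `encodeEmbedding` and `decodeEmbedding` are hypothetical helper names.

```typescript
// Assumption: vectors are stored as raw little-endian float32 bytes.
function encodeEmbedding(vector: number[]): Uint8Array {
	const bytes = new Uint8Array(vector.length * 4);
	const view = new DataView(bytes.buffer);
	vector.forEach((value, i) => view.setFloat32(i * 4, value, true));
	return bytes;
}

// Rejects blobs whose size does not match the dimensions recorded on the row.
function decodeEmbedding(bytes: Uint8Array, dimensions: number): number[] {
	if (bytes.byteLength !== dimensions * 4) {
		throw new Error(`expected ${dimensions * 4} bytes, got ${bytes.byteLength}`);
	}
	const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
	const vector: number[] = [];
	for (let i = 0; i < dimensions; i++) {
		vector.push(view.getFloat32(i * 4, true));
	}
	return vector;
}
```

Checking the byte length against `dimensions` gives an early failure if a row written under one profile is ever read with another profile's settings.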

Migration Requirements

migration adds embedding_profiles
migration updates snippet_embeddings for profile scoping
migration seeds a default local profile using Xenova/all-MiniLM-L6-v2
migration safely maps existing single-provider configs into one default profile when upgrading
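
The last two migration steps could be expressed as one pure mapping function. The sketch below assumes a legacy config shape that is not shown in this ticket (`LegacyEmbeddingConfig` and its fields are guesses based on the `openai | local | none` description), and the profile IDs and titles are placeholders. It also assumes that an explicit `none` should carry over as a disabled local profile rather than being re-enabled on upgrade.

```typescript
// Hypothetical legacy shape, inferred from the "openai | local | none" description.
interface LegacyEmbeddingConfig {
	provider: 'openai' | 'local' | 'none';
	model?: string;
	dimensions?: number;
	baseUrl?: string;
	apiKey?: string;
}

interface SeedProfile {
	id: string;
	providerKind: string;
	title: string;
	enabled: boolean;
	isDefault: boolean;
	model: string;
	dimensions: number;
	config: Record<string, unknown>;
}

// Maps an existing single-provider config (or none at all) to one default profile.
function legacyConfigToProfile(legacy?: LegacyEmbeddingConfig): SeedProfile {
	if (legacy?.provider === 'openai' && legacy.model && legacy.dimensions) {
		return {
			id: 'migrated-openai',
			providerKind: 'openai-compatible',
			title: 'OpenAI-compatible (migrated)',
			enabled: true,
			isDefault: true,
			model: legacy.model,
			dimensions: legacy.dimensions,
			config: { baseUrl: legacy.baseUrl, apiKey: legacy.apiKey }
		};
	}
	// Everything else seeds the local default; an explicit "none" stays disabled.
	return {
		id: 'default-local',
		providerKind: 'local-transformers',
		title: 'Local (default)',
		enabled: legacy?.provider !== 'none',
		isDefault: true,
		model: 'Xenova/all-MiniLM-L6-v2',
		dimensions: 384,
		config: {}
	};
}
```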

Query-Time Semantic Retrieval

The API must resolve the active embedding profile at request time instead of baking provider selection into startup-only flows.

API Behavior

GET /api/v1/context

keeps libraryId, query, tokens, and type
adds optional searchMode=auto|keyword|semantic|hybrid
adds optional alpha for hybrid blending
uses the default active embedding profile when searchMode is auto, semantic, or hybrid
falls back to keyword mode only when embeddings are disabled or the caller explicitly requests keyword mode
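
The mode rules above reduce to a small decision function. This is a sketch with hypothetical helper names (`parseSearchMode`, `resolveMode`); it assumes `auto` resolves to hybrid when embeddings are available, which follows the default runtime behavior described earlier, and that unknown `searchMode` values are rejected rather than ignored.

```typescript
type SearchMode = 'auto' | 'keyword' | 'semantic' | 'hybrid';
type EffectiveMode = Exclude<SearchMode, 'auto'>;

const SEARCH_MODES: readonly string[] = ['auto', 'keyword', 'semantic', 'hybrid'];

// Reads the optional searchMode query parameter, defaulting to "auto".
function parseSearchMode(params: URLSearchParams): SearchMode {
	const raw = params.get('searchMode') ?? 'auto';
	if (SEARCH_MODES.indexOf(raw) === -1) {
		throw new Error(`Unsupported searchMode: ${raw}`);
	}
	return raw as SearchMode;
}

// Keyword mode is used only on explicit request or when embeddings are disabled.
function resolveMode(requested: SearchMode, embeddingsEnabled: boolean): EffectiveMode {
	if (requested === 'keyword' || !embeddingsEnabled) return 'keyword';
	return requested === 'auto' ? 'hybrid' : requested;
}
```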

Version-Scoped Retrieval Rules

when libraryId includes a version, both FTS and vector retrieval must filter to the resolved versionId
re-fetching snippets after ranking must also preserve versionId
default-branch snippets must not bleed into versioned queries
one version's embeddings must not be compared against another version's snippets for the same repository
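
To make the scoping rules concrete, here is an in-memory sketch of a vector search that filters by version and profile before any similarity is computed. In the real system the filter would presumably be a SQL predicate; the row shape, the use of `null` for default-branch snippets, and the function names are all assumptions for illustration.

```typescript
interface StoredEmbedding {
	snippetId: string;
	versionId: string | null; // assumption: null marks default-branch snippets
	profileId: string;
	vector: number[];
}

function cosine(a: number[], b: number[]): number {
	let dot = 0;
	let normA = 0;
	let normB = 0;
	for (let i = 0; i < a.length; i++) {
		dot += a[i] * b[i];
		normA += a[i] * a[i];
		normB += b[i] * b[i];
	}
	return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Candidates outside the requested version or profile are never scored.
function vectorSearch(
	rows: StoredEmbedding[],
	query: number[],
	scope: { versionId: string | null; profileId: string },
	limit: number
): Array<{ snippetId: string; score: number }> {
	return rows
		.filter((row) => row.versionId === scope.versionId && row.profileId === scope.profileId)
		.map((row) => ({ snippetId: row.snippetId, score: cosine(query, row.vector) }))
		.sort((a, b) => b.score - a.score)
		.slice(0, limit);
}
```

Filtering first, rather than ranking across all rows and filtering afterwards, keeps a high-scoring snippet from another version from pushing in-scope results out of the top `limit`.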

Acceptance Criteria

/api/v1/context loads the active embedding profile at request time
hybrid retrieval works without restarting the server after profile changes
searchMode is supported for context queries
versioned libraryId queries enforce version filters in both FTS and vector phases
JSON responses can include retrieval metadata such as mode, profile ID, model, and alpha
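
The ticket leaves the meaning of `alpha` open. The sketch below assumes one common convention, a linear blend where `alpha` weights the semantic score and `1 - alpha` weights the keyword score, with both score sets already normalized to the range 0 to 1. TrueRef's hybrid search may use a different fusion method. The `retrieval` envelope illustrates the metadata fields named above; its exact shape is also an assumption.

```typescript
interface RetrievalMeta {
	mode: 'keyword' | 'semantic' | 'hybrid';
	profileId: string | null;
	model: string | null;
	alpha: number | null;
}

interface HybridResult {
	results: Array<{ snippetId: string; score: number }>;
	retrieval: RetrievalMeta;
}

// Assumed convention: score = alpha * semantic + (1 - alpha) * keyword.
function blendScores(
	keyword: Map<string, number>,
	semantic: Map<string, number>,
	alpha: number,
	profile: { id: string; model: string }
): HybridResult {
	if (alpha < 0 || alpha > 1) throw new Error('alpha must be between 0 and 1');
	const ids = new Set<string>();
	keyword.forEach((_score, id) => ids.add(id));
	semantic.forEach((_score, id) => ids.add(id));

	const results: Array<{ snippetId: string; score: number }> = [];
	ids.forEach((id) => {
		// A snippet missing from one ranking contributes 0 for that side.
		const score = alpha * (semantic.get(id) ?? 0) + (1 - alpha) * (keyword.get(id) ?? 0);
		results.push({ snippetId: id, score });
	});
	results.sort((a, b) => b.score - a.score);

	return {
		results,
		retrieval: { mode: 'hybrid', profileId: profile.id, model: profile.model, alpha }
	};
}
```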

MCP Surface

MCP should stay thin and inherit semantic behavior from the API.

`query-docs`

Extend the MCP tool schema to support:

searchMode?: 'auto' | 'keyword' | 'semantic' | 'hybrid'
alpha?: number

The MCP server should forward these options directly to /api/v1/context.
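
Forwarding could be as simple as copying the optional fields onto the query string only when they are present, which keeps older callers unaffected. This is a sketch: `QueryDocsArgs` and `buildContextUrl` are hypothetical names, and the argument list is an assumption based on the API parameters listed earlier.

```typescript
// Hypothetical argument shape for the query-docs tool.
interface QueryDocsArgs {
	libraryId: string;
	query: string;
	tokens?: number;
	searchMode?: 'auto' | 'keyword' | 'semantic' | 'hybrid';
	alpha?: number;
}

// Builds the /api/v1/context URL, omitting any option the caller did not send.
function buildContextUrl(baseUrl: string, args: QueryDocsArgs): string {
	const url = new URL('/api/v1/context', baseUrl);
	url.searchParams.set('libraryId', args.libraryId);
	url.searchParams.set('query', args.query);
	if (args.tokens !== undefined) url.searchParams.set('tokens', String(args.tokens));
	if (args.searchMode !== undefined) url.searchParams.set('searchMode', args.searchMode);
	if (args.alpha !== undefined) url.searchParams.set('alpha', String(args.alpha));
	return url.toString();
}
```

Because omitted fields never reach the API, the server-side defaults decide the behavior for existing MCP clients.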

Explicitly Out of Scope

semantic reranking for resolve-library-id
automatic library detection from the query text

Acceptance Criteria

MCP query-docs supports the same retrieval mode controls as the API
MCP stdio and HTTP transports both preserve the new options
MCP remains backward compatible when the new fields are omitted

Settings and Profile Management

The existing settings page must evolve from a single provider switcher into profile management for the supported provider kinds.

Required UX Changes

show the default local profile as the initial active profile
allow enabling/disabling embeddings globally
allow creating additional custom profiles for supported provider adapters
allow selecting exactly one default profile
show provider health and profile test results
warn when changing the default profile requires re-embedding to preserve semantic quality
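
The "exactly one default profile" rule is easiest to keep if every change goes through a single function that rewrites the flag on all profiles at once. A minimal sketch follows, with the hypothetical name `setDefaultProfile` and the added assumption that a disabled profile cannot become the default.

```typescript
interface ProfileFlags {
	id: string;
	enabled: boolean;
	isDefault: boolean;
}

// Returns a new list in which only the chosen profile has isDefault set.
function setDefaultProfile<T extends ProfileFlags>(profiles: T[], id: string): T[] {
	const target = profiles.find((profile) => profile.id === id);
	if (!target) throw new Error(`Unknown profile: ${id}`);
	// Assumption: a disabled profile is not eligible to be the default.
	if (!target.enabled) throw new Error(`Profile ${id} is disabled`);
	return profiles.map((profile) => ({ ...profile, isDefault: profile.id === id }));
}
```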

Acceptance Criteria

/settings supports profile-based embedding configuration
users can create an openai-compatible custom profile with arbitrary base URL and model
the local default profile is visible and editable
switching the default profile triggers a re-embedding workflow or explicit warning state

Indexing and Re-Embedding

Indexing must embed snippets against the default active profile, and profile changes must be operationally explicit.

Required Behavior

new indexing jobs use the current default profile
re-indexing stores embeddings under that profile ID
changing the default profile does not silently reuse embeddings from another profile
if a profile is changed in a way that invalidates stored embeddings, affected repositories must be marked as needing re-embedding or re-indexing
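
Which edits invalidate stored embeddings is not spelled out here. A reasonable reading, used as an assumption in the sketch below, is that any change to provider kind, model, or dimensions does, while cosmetic edits such as the title do not. `invalidatesEmbeddings`, `markStaleRepositories`, and the `RepoState` shape are hypothetical.

```typescript
interface ProfileFingerprint {
	providerKind: string;
	model: string;
	dimensions: number;
}

// True when vectors produced before the edit are no longer comparable.
function invalidatesEmbeddings(before: ProfileFingerprint, after: ProfileFingerprint): boolean {
	return (
		before.providerKind !== after.providerKind ||
		before.model !== after.model ||
		before.dimensions !== after.dimensions
	);
}

// Hypothetical per-repository bookkeeping.
interface RepoState {
	id: string;
	embeddedWithProfileId: string | null;
	needsReembedding: boolean;
}

// Flags every repository whose embeddings came from the edited profile.
function markStaleRepositories(repos: RepoState[], profileId: string): RepoState[] {
	return repos.map((repo) =>
		repo.embeddedWithProfileId === profileId ? { ...repo, needsReembedding: true } : repo
	);
}
```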

Acceptance Criteria

indexing records which profile produced each embedding row
re-embedding can be triggered after default-profile changes
no cross-profile embedding reuse occurs

Test Coverage

migration tests for embedding_profiles and snippet_embeddings
unit tests for provider registry resolution
unit tests for version-scoped vector search
unit tests for hybrid retrieval with explicit searchMode
API tests covering default local profile behavior on fresh setup
MCP tests covering query-docs semantic and hybrid forwarding

Files to Modify

package.json — install @xenova/transformers as a runtime dependency
src/lib/server/db/schema.ts
src/lib/server/db/migrations/*
src/lib/server/embeddings/provider.ts
src/lib/server/embeddings/local.provider.ts
src/lib/server/embeddings/openai.provider.ts
src/lib/server/embeddings/factory.ts or replacement registry module
src/lib/server/embeddings/embedding.service.ts
src/lib/server/search/vector.search.ts
src/lib/server/search/hybrid.search.service.ts
src/routes/api/v1/context/+server.ts
src/routes/api/v1/settings/embedding/+server.ts
src/routes/api/v1/settings/embedding/test/+server.ts
src/routes/settings/+page.svelte
src/mcp/client.ts
src/mcp/tools/query-docs.ts
src/mcp/index.ts

Notes

This ticket intentionally leaves libs/search as keyword-only. The caller is expected to identify the target library and, when needed, pass a version-qualified library ID such as /owner/repo/v1.2.3 before requesting semantic retrieval.
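
As an illustration of the identifier format mentioned above, a version-qualified ID could be split as follows. The only format the ticket shows is `/owner/repo/v1.2.3`, so this sketch assumes exactly two or three slash-separated segments and no slashes inside the version; `parseLibraryId` is a hypothetical helper.

```typescript
interface ParsedLibraryId {
	repositoryId: string;
	version: string | null; // null means no version was requested
}

// Splits "/owner/repo" or "/owner/repo/version" into its parts.
function parseLibraryId(libraryId: string): ParsedLibraryId {
	const parts = libraryId.split('/').filter((part) => part.length > 0);
	if (libraryId.charAt(0) !== '/' || parts.length < 2 || parts.length > 3) {
		throw new Error(`Malformed library ID: ${libraryId}`);
	}
	const [owner, repo, version] = parts;
	return { repositoryId: `/${owner}/${repo}`, version: version ?? null };
}
```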
