TRUEREF-0008 — Hybrid Semantic Search Engine

Priority: P1 Status: Pending Depends On: TRUEREF-0006, TRUEREF-0007 Blocks: TRUEREF-0010 (enhances it)

Overview

Combine FTS5 BM25 keyword search with vector similarity search (cosine similarity over embeddings) using Reciprocal Rank Fusion (RRF) to produce a hybrid ranking that outperforms either approach alone. When embeddings are not available, the system transparently falls back to FTS5-only mode.

Acceptance Criteria

HybridSearchService that coordinates FTS5 and vector search
Cosine similarity search over stored embeddings
Reciprocal Rank Fusion for combining ranked lists
Graceful degradation to FTS5-only when embeddings unavailable
Query embedding generated at search time via the configured provider
Results deduplicated by snippet ID (same snippet may appear in both result sets)
Configurable alpha parameter: weight between FTS5 (0.0) and vector (1.0)
Performance: < 300ms for searches over 100k snippets
Unit tests with mock embedding provider

Architecture

query text
    │
    ├──── FTS5 Search ──────────────┐
    │      (BM25 ranking)           │
    │                               │
    └──── Vector Search ────────────┤
           (cosine similarity)      │
           (embed query first)      │
                                    │
                               RRF Fusion
                                    │
                               Final ranked list

Vector Search Implementation

SQLite does not natively support vector operations, so cosine similarity is computed in JavaScript after loading candidate embeddings:

function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

async vectorSearch(
  queryEmbedding: Float32Array,
  repositoryId: string,
  limit: number = 50
): Promise<Array<{ snippetId: string; score: number }>> {
  // Load all embeddings for the repository (filtered)
  const rows = this.db.prepare(`
    SELECT se.snippet_id, se.embedding
    FROM snippet_embeddings se
    JOIN snippets s ON s.id = se.snippet_id
    WHERE s.repository_id = ?
  `).all(repositoryId) as { snippet_id: string; embedding: Buffer }[];

  // Compute cosine similarity for each
  const scored = rows.map(row => {
    const embedding = new Float32Array(
      row.embedding.buffer,
      row.embedding.byteOffset,
      row.embedding.byteLength / 4
    );
    return {
      snippetId: row.snippet_id,
      score: cosineSimilarity(queryEmbedding, embedding),
    };
  });

  // Sort descending by score, return top-k
  return scored
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}

Performance note: For repositories with > 50k snippets, pre-filtering by FTS5 candidates before computing cosine similarity is recommended. This is a future optimization — for v1, in-memory computation is acceptable.

Reciprocal Rank Fusion

function reciprocalRankFusion(
	...rankings: Array<Array<{ id: string; score: number }>>
): Array<{ id: string; rrfScore: number }> {
	const K = 60; // RRF constant (standard value)
	const scores = new Map<string, number>();

	for (const ranking of rankings) {
		ranking.forEach(({ id }, rank) => {
			const current = scores.get(id) ?? 0;
			scores.set(id, current + 1 / (K + rank + 1));
		});
	}

	return Array.from(scores.entries())
		.map(([id, rrfScore]) => ({ id, rrfScore }))
		.sort((a, b) => b.rrfScore - a.rrfScore);
}

Hybrid Search Service

export interface HybridSearchOptions {
	repositoryId: string;
	versionId?: string;
	type?: 'code' | 'info';
	limit?: number;
	alpha?: number; // 0.0 = FTS5 only, 1.0 = vector only, 0.5 = balanced
}

export class HybridSearchService {
	constructor(
		private db: BetterSQLite3.Database,
		private searchService: SearchService,
		private embeddingProvider: EmbeddingProvider | null
	) {}

	async search(query: string, options: HybridSearchOptions): Promise<SnippetSearchResult[]> {
		const limit = options.limit ?? 20;
		const alpha = options.alpha ?? 0.5;

		// Always run FTS5 search
		const ftsResults = this.searchService.searchSnippets(query, {
			repositoryId: options.repositoryId,
			versionId: options.versionId,
			type: options.type,
			limit: limit * 3 // get more candidates for fusion
		});

		// If no embedding provider or alpha = 0, return FTS5 results directly
		if (!this.embeddingProvider || alpha === 0) {
			return ftsResults.slice(0, limit);
		}

		// Embed the query and run vector search
		const [queryEmbedding] = await this.embeddingProvider.embed([query]);
		const vectorResults = await this.vectorSearch(
			queryEmbedding.values,
			options.repositoryId,
			limit * 3
		);

		// Normalize result lists for RRF
		const ftsRanked = ftsResults.map((r, i) => ({
			id: r.snippet.id,
			score: i
		}));
		const vecRanked = vectorResults.map((r, i) => ({
			id: r.snippetId,
			score: i
		}));

		// Apply RRF
		const fused = reciprocalRankFusion(ftsRanked, vecRanked);

		// Fetch full snippet data for top results
		const topIds = fused.slice(0, limit).map((r) => r.id);
		return this.fetchSnippetsByIds(topIds, options.repositoryId);
	}
}

Configuration

The hybrid search alpha value can be set per-request or globally via settings:

// Default config stored in settings table under key 'search_config'
export interface SearchConfig {
	alpha: number; // 0.5 default
	maxResults: number; // 20 default
	enableHybrid: boolean; // true if embedding provider is configured
}

Files to Create

src/lib/server/search/hybrid.search.service.ts
src/lib/server/search/vector.search.ts
src/lib/server/search/rrf.ts
src/lib/server/search/hybrid.search.service.test.ts

6.0 KiB Raw Blame History