TRUEREF-0008 — Hybrid Semantic Search Engine

Priority: P1
Status: Pending
Depends On: TRUEREF-0006, TRUEREF-0007
Blocks: TRUEREF-0010 (enhances it)

Overview

Combine FTS5 BM25 keyword search with vector similarity search (cosine similarity over embeddings) using Reciprocal Rank Fusion (RRF). The goal is a hybrid ranking that is more relevant than either approach alone. When embeddings are not available, the system transparently falls back to FTS5-only mode.

Acceptance Criteria

HybridSearchService that coordinates FTS5 and vector search
Cosine similarity search over stored embeddings
Reciprocal Rank Fusion for combining ranked lists
Graceful degradation to FTS5-only when embeddings unavailable
Query embedding generated at search time via the configured provider
Results deduplicated by snippet ID (same snippet may appear in both result sets)
Configurable alpha parameter: weight between FTS5 (0.0) and vector (1.0)
Performance: < 300ms for searches over 100k snippets
Unit tests with mock embedding provider
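
The last criterion calls for a mock embedding provider. The following is a minimal sketch of one, assuming `EmbeddingProvider` exposes an `embed(texts)` method that resolves to objects with a `values: Float32Array` field (inferred from how the service reads `queryEmbedding.values`; the real interface from TRUEREF-0007 may differ). `MockEmbeddingProvider`, `embedOne`, and the word-hashing scheme are hypothetical names and choices made for illustration only.

```typescript
// Assumed shape of the provider interface; the real one may differ.
interface EmbeddingProvider {
  embed(texts: string[]): Promise<Array<{ values: Float32Array }>>;
}

// Hypothetical test double: maps each text to a deterministic
// bag-of-words vector so tests need no network or model.
class MockEmbeddingProvider implements EmbeddingProvider {
  constructor(private readonly dimensions: number = 8) {}

  async embed(texts: string[]): Promise<Array<{ values: Float32Array }>> {
    return texts.map((text) => ({ values: this.embedOne(text) }));
  }

  // Each word increments one bucket chosen by a simple string hash.
  embedOne(text: string): Float32Array {
    const values = new Float32Array(this.dimensions);
    const words = text.toLowerCase().split(/\W+/).filter(Boolean);
    for (const word of words) {
      let hash = 0;
      for (let i = 0; i < word.length; i++) {
        hash = (hash * 31 + word.charCodeAt(i)) >>> 0;
      }
      values[hash % this.dimensions] += 1;
    }
    return values;
  }
}
```

Because the vectors depend only on the input text, the same query always produces the same ranking, which keeps assertions on result order stable. Texts that share words share buckets, so they score a higher cosine similarity than unrelated texts.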

Architecture

query text
    │
    ├──── FTS5 Search ──────────────┐
    │      (BM25 ranking)           │
    │                               │
    └──── Vector Search ────────────┤
           (cosine similarity)      │
           (embed query first)      │
                                    │
                               RRF Fusion
                                    │
                               Final ranked list

Vector Search Implementation

SQLite does not natively support vector operations, so cosine similarity is computed in JavaScript after loading candidate embeddings:

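function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  // A zero vector has no direction; return 0 rather than NaN from 0 / 0,
  // which would otherwise make the sort order undefined.
  return denominator === 0 ? 0 : dot / denominator;
}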

async vectorSearch(
  queryEmbedding: Float32Array,
  repositoryId: string,
  limit: number = 50
): Promise<Array<{ snippetId: string; score: number }>> {
  // Load every embedding for the repository. Note: unlike the FTS5 query,
  // this does not filter by versionId or type.
  const rows = this.db.prepare(`
    SELECT se.snippet_id, se.embedding
    FROM snippet_embeddings se
    JOIN snippets s ON s.id = se.snippet_id
    WHERE s.repository_id = ?
  `).all(repositoryId) as { snippet_id: string; embedding: Buffer }[];

  // Compute cosine similarity for each
  const scored = rows.map(row => {
    const embedding = new Float32Array(
      row.embedding.buffer,
      row.embedding.byteOffset,
      row.embedding.byteLength / 4
    );
    return {
      snippetId: row.snippet_id,
      score: cosineSimilarity(queryEmbedding, embedding),
    };
  });

  // Sort descending by score, return top-k
  return scored
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}
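
One detail in `vectorSearch` deserves care: a `Float32Array` view can only be created at a byte offset that is a multiple of 4, and constructing it at any other offset throws a `RangeError`. Whether the driver ever returns such a buffer is not stated here, so the sketch below is a defensive option rather than a required fix. `encodeEmbedding` and `decodeEmbedding` are hypothetical helpers, and the sketch assumes embeddings are stored as raw float32 bytes in the platform's native byte order. It accepts any `Uint8Array`, which includes Node's `Buffer`.

```typescript
// Hypothetical helpers; assumes embeddings are stored as raw float32 bytes
// in the platform's native byte order.
function encodeEmbedding(values: Float32Array): Uint8Array {
  return new Uint8Array(values.buffer, values.byteOffset, values.byteLength);
}

function decodeEmbedding(blob: Uint8Array): Float32Array {
  if (blob.byteLength % 4 !== 0) {
    throw new Error(
      `Embedding blob length ${blob.byteLength} is not a multiple of 4`
    );
  }
  if (blob.byteOffset % 4 === 0) {
    // Aligned: create a zero-copy view over the same memory.
    return new Float32Array(blob.buffer, blob.byteOffset, blob.byteLength / 4);
  }
  // Unaligned: copy the bytes into a fresh, aligned buffer first.
  const copy = new Uint8Array(blob.byteLength);
  copy.set(blob);
  return new Float32Array(copy.buffer);
}
```

The aligned path keeps the zero-copy behavior of the original code; the copy is only paid for when the offset would otherwise cause an exception.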

Performance note: For repositories with more than 50k snippets, pre-filtering by FTS5 candidates before computing cosine similarity is recommended. This is a future optimization; for v1, computing similarity in memory over all of a repository's embeddings is acceptable.
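
The pre-filtering idea might look roughly like the sketch below: only snippets already returned by FTS5 are scored, so the cost scales with the candidate count rather than the repository size. `rerankCandidates` is a hypothetical helper, and the embeddings are passed in as a `Map` to keep the example self-contained; a real implementation would presumably load them with a `WHERE snippet_id IN (...)` query.

```typescript
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  return denominator === 0 ? 0 : dot / denominator;
}

// Hypothetical helper: score only the snippets that FTS5 already returned.
function rerankCandidates(
  queryEmbedding: Float32Array,
  candidateIds: string[],
  embeddings: Map<string, Float32Array>,
  limit: number = 50
): Array<{ snippetId: string; score: number }> {
  const scored: Array<{ snippetId: string; score: number }> = [];
  for (const snippetId of candidateIds) {
    const embedding = embeddings.get(snippetId);
    if (!embedding) continue; // snippet has no stored embedding yet
    scored.push({
      snippetId,
      score: cosineSimilarity(queryEmbedding, embedding),
    });
  }
  // Sort descending by similarity and keep the top results.
  return scored.sort((a, b) => b.score - a.score).slice(0, limit);
}
```

The trade-off is recall: a snippet that is semantically close to the query but shares no keywords with it never reaches the vector stage, so this variant re-ranks keyword hits rather than performing an independent semantic search.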

Reciprocal Rank Fusion

function reciprocalRankFusion(
  ...rankings: Array<Array<{ id: string; score: number }>>
): Array<{ id: string; rrfScore: number }> {
  const K = 60; // RRF constant (standard value)
  const scores = new Map<string, number>();

  for (const ranking of rankings) {
    ranking.forEach(({ id }, rank) => {
      const current = scores.get(id) ?? 0;
      scores.set(id, current + 1 / (K + rank + 1));
    });
  }

  return Array.from(scores.entries())
    .map(([id, rrfScore]) => ({ id, rrfScore }))
    .sort((a, b) => b.rrfScore - a.rrfScore);
}
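
A small worked example may make the scoring easier to follow. Each list contributes 1 / (K + position) for every item it contains, with positions counted from 1, which is what `K + rank + 1` computes from the 0-based `rank`. The snippet below repeats the function so it runs on its own; the IDs `a` to `d` are made-up sample data.

```typescript
function reciprocalRankFusion(
  ...rankings: Array<Array<{ id: string; score: number }>>
): Array<{ id: string; rrfScore: number }> {
  const K = 60;
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach(({ id }, rank) => {
      const current = scores.get(id) ?? 0;
      scores.set(id, current + 1 / (K + rank + 1));
    });
  }
  return Array.from(scores.entries())
    .map(([id, rrfScore]) => ({ id, rrfScore }))
    .sort((a, b) => b.rrfScore - a.rrfScore);
}

// Sample data: FTS5 ranks a, b, c; vector search ranks c, a, d.
const ftsList = ["a", "b", "c"].map((id, i) => ({ id, score: i }));
const vecList = ["c", "a", "d"].map((id, i) => ({ id, score: i }));

const fusedExample = reciprocalRankFusion(ftsList, vecList);
// a: 1/61 + 1/62   (1st in FTS5, 2nd in vector)
// c: 1/63 + 1/61   (3rd in FTS5, 1st in vector)
// b: 1/62          (FTS5 only)
// d: 1/63          (vector only)
console.log(fusedExample.map((r) => r.id)); // order: a, c, b, d
```

Items found by both searches accumulate two terms, so `a` and `c` rise above `b` and `d`. This accumulation is also what deduplicates by snippet ID: each ID appears once in the output regardless of how many lists contain it.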

Hybrid Search Service

export interface HybridSearchOptions {
  repositoryId: string;
  versionId?: string;
  type?: 'code' | 'info';
  limit?: number;
  alpha?: number;          // 0.0 = FTS5 only, 1.0 = vector only, 0.5 = balanced
}

export class HybridSearchService {
  constructor(
    private db: BetterSQLite3.Database,
    private searchService: SearchService,
    private embeddingProvider: EmbeddingProvider | null,
  ) {}

  async search(
    query: string,
    options: HybridSearchOptions
  ): Promise<SnippetSearchResult[]> {
    const limit = options.limit ?? 20;
    const alpha = options.alpha ?? 0.5;

    // Always run FTS5 search
    const ftsResults = this.searchService.searchSnippets(query, {
      repositoryId: options.repositoryId,
      versionId: options.versionId,
      type: options.type,
      limit: limit * 3, // get more candidates for fusion
    });

    // If no embedding provider or alpha = 0, return FTS5 results directly
    if (!this.embeddingProvider || alpha === 0) {
      return ftsResults.slice(0, limit);
    }

    // Embed the query and run vector search
    const [queryEmbedding] = await this.embeddingProvider.embed([query]);
    const vectorResults = await this.vectorSearch(
      queryEmbedding.values,
      options.repositoryId,
      limit * 3
    );

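    // Convert both lists to the shape RRF expects. RRF reads only list order,
    // so `score` simply records each item's 0-based position.
    const ftsRanked = ftsResults.map((r, i) => ({
      id: r.snippet.id,
      score: i,
    }));
    const vecRanked = vectorResults.map((r, i) => ({
      id: r.snippetId,
      score: i,
    }));

    // Apply RRF. Both lists are weighted equally here; `alpha` currently
    // controls only the FTS5-only short-circuit above.
    const fused = reciprocalRankFusion(ftsRanked, vecRanked);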

    // Fetch full snippet data for top results
    const topIds = fused.slice(0, limit).map(r => r.id);
    return this.fetchSnippetsByIds(topIds, options.repositoryId);
  }
}
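
The service above reads `alpha` only to decide whether to skip vector search, so values such as 0.3 or 1.0 do not yet change the fused ranking. One possible way to honor the documented semantics is to scale each list's RRF contribution, as sketched below. `weightedRrf` is a hypothetical function and not part of this spec; the weighting scheme (`1 - alpha` for FTS5, `alpha` for vector) is an assumption about the intended behavior.

```typescript
// Hypothetical alpha-weighted variant of RRF; not part of the spec above.
function weightedRrf(
  ftsIds: string[],
  vectorIds: string[],
  alpha: number,
  k: number = 60
): Array<{ id: string; rrfScore: number }> {
  // Clamp alpha into [0, 1] so out-of-range settings cannot
  // produce negative weights.
  const a = Math.min(1, Math.max(0, alpha));
  const scores = new Map<string, number>();

  const accumulate = (ids: string[], weight: number): void => {
    if (weight === 0) return; // a zero-weight list contributes no candidates
    ids.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + weight / (k + rank + 1));
    });
  };

  accumulate(ftsIds, 1 - a); // alpha = 0.0 keeps only FTS5 results
  accumulate(vectorIds, a);  // alpha = 1.0 keeps only vector results

  return Array.from(scores.entries())
    .map(([id, rrfScore]) => ({ id, rrfScore }))
    .sort((x, y) => y.rrfScore - x.rrfScore);
}
```

With `alpha = 0.5` both lists carry equal weight, so the ordering matches plain RRF and only the absolute scores are halved. At the extremes the output contains results from a single list, which matches the meaning given for 0.0 and 1.0 in `HybridSearchOptions`.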

Configuration

The hybrid search alpha value can be set per-request or globally via settings:

// Default config stored in settings table under key 'search_config'
export interface SearchConfig {
  alpha: number;           // 0.5 default
  maxResults: number;      // 20 default
  enableHybrid: boolean;   // true if embedding provider is configured
}
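
Reading this config might be handled along the lines of the sketch below, which applies the documented defaults and keeps `alpha` inside the range 0.0 to 1.0. `parseSearchConfig` is a hypothetical helper, and the sketch assumes the `search_config` value is stored as a JSON string, which the spec does not state.

```typescript
interface SearchConfig {
  alpha: number;
  maxResults: number;
  enableHybrid: boolean;
}

// Hypothetical helper; assumes the 'search_config' setting holds a JSON string.
function parseSearchConfig(
  raw: string | null,
  hasProvider: boolean
): SearchConfig {
  const defaults: SearchConfig = {
    alpha: 0.5,
    maxResults: 20,
    enableHybrid: hasProvider,
  };
  if (!raw) return defaults;

  let parsed: Partial<SearchConfig>;
  try {
    parsed = JSON.parse(raw) as Partial<SearchConfig>;
  } catch {
    return defaults; // malformed JSON: fall back rather than fail the search
  }
  if (typeof parsed !== "object" || parsed === null) return defaults;

  // Clamp alpha to [0, 1]; ignore non-numeric or non-finite values.
  const alpha =
    typeof parsed.alpha === "number" && isFinite(parsed.alpha)
      ? Math.min(1, Math.max(0, parsed.alpha))
      : defaults.alpha;
  // Require a finite positive result count, rounded down to an integer.
  const maxResults =
    typeof parsed.maxResults === "number" &&
    isFinite(parsed.maxResults) &&
    parsed.maxResults >= 1
      ? Math.floor(parsed.maxResults)
      : defaults.maxResults;
  // Hybrid mode is only meaningful when a provider is configured.
  const enableHybrid = hasProvider && parsed.enableHybrid !== false;

  return { alpha, maxResults, enableHybrid };
}
```

A per-request `alpha` would then take precedence over the stored value, for example `options.alpha ?? config.alpha`.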

Files to Create

src/lib/server/search/hybrid.search.service.ts
src/lib/server/search/vector.search.ts
src/lib/server/search/rrf.ts
src/lib/server/search/hybrid.search.service.test.ts
