TRUEREF-0023 rewrite indexing pipeline - parallel reads - serialized writes
docs/features/TRUEREF-0023.md (new file, 955 lines)
# TRUEREF-0023 — libSQL Migration, Native Vector Search, Parallel Tag Indexing, and Performance Hardening

**Priority:** P1
**Status:** Draft
**Depends On:** TRUEREF-0001, TRUEREF-0022
**Blocks:** —

---

## Overview

TrueRef currently uses `better-sqlite3` for all database access. This creates three compounding performance problems:
1. **Vector search does not scale.** `VectorSearch.vectorSearch()` loads the entire `snippet_embeddings` table for a repository into Node.js memory and computes cosine similarity in a JavaScript loop. A repository with 100k snippets at 1536 OpenAI dimensions allocates ~600 MB per query and ties up the worker thread for seconds before returning results.
2. **Missing composite indexes cause table scans on every query.** The schema defines FK columns used in every search and embedding filter, but declares zero composite or covering indexes on them. Every call to `searchSnippets`, `findSnippetIdsMissingEmbeddings`, and `cloneFromAncestor` performs full or near-full table scans.
3. **The SQLite connection is under-configured.** Critical pragmas (`synchronous`, `cache_size`, `mmap_size`, `temp_store`) are absent, leaving significant I/O throughput on the table.

The solution is to replace `better-sqlite3` with `@libsql/better-sqlite3` — an embeddable, drop-in synchronous replacement that is a superset of the better-sqlite3 API and exposes libSQL's native vector index (`libsql_vector_idx`). Because the API is identical, no service-layer or ORM code changes are needed beyond import statements and the vector search implementation.

Two additional structural improvements are delivered in the same feature:

4. **Per-repo job serialization is too coarse.** `WorkerPool` prevents any two jobs sharing the same `repositoryId` from running in parallel. This means indexing 200 tags of a single library is fully sequential — one tag at a time — even though different tags write to entirely disjoint row sets. The constraint should track `(repositoryId, versionId)` pairs instead.
5. **Write-lock contention under parallel indexing.** When multiple parse workers flush parsed snippets simultaneously, they all compete for the SQLite write lock, spending most of their time in `busy_timeout` back-off. A single dedicated write worker eliminates this: parse workers become pure CPU workers (crawl → parse → send batches over `postMessage`) and the write worker is the sole DB writer.
6. **The admin UI is unusable under load.** The job queue page has no status or repository filters, no worker status panel, no skeleton loading, uses blocking `alert()` / `confirm()` dialogs, and `IndexingProgress` still polls every 2 seconds instead of consuming the existing SSE stream.
---

## Goals

1. Replace `better-sqlite3` with `@libsql/better-sqlite3` with minimal code churn — import paths only.
2. Add a libSQL vector index on `snippet_embeddings` so that KNN queries execute inside SQLite instead of in a JavaScript loop.
3. Add the six composite and covering indexes required by the hot query paths.
4. Tune the SQLite pragma configuration for I/O performance.
5. Eliminate the leading cause of OOM risk during semantic search.
6. Keep a single embedded database file — no external server, no network.
7. Allow multiple tags of the same repository to index in parallel (unrelated version rows, no write conflict).
8. Eliminate write-lock contention between parallel parse workers by introducing a single dedicated write worker.
9. Rebuild the admin jobs page with full filtering (status, repository, free-text), a live worker status panel, skeleton loading on initial fetch, per-action inline spinners, non-blocking toast notifications, and SSE-driven real-time updates throughout.
---

## Non-Goals

- Migrating to the async `@libsql/client` package (HTTP/embedded-replica mode).
- Changing the Drizzle ORM adapter (`drizzle-orm/better-sqlite3` stays unchanged).
- Changing the `drizzle.config.ts` dialect (`sqlite` is still correct for embedded libSQL).
- Adding hybrid/approximate indexing beyond the default ANN strategy provided by `libsql_vector_idx`.
- Parallelizing embedding batches across providers (separate feature).
- Horizontally scaling across processes.
- Allowing more than one job for the exact same `(repositoryId, versionId)` pair to run concurrently (still serialized — duplicate detection in `JobQueue` is unchanged).
- A full admin authentication system (out of scope).
- A mobile-responsive redesign of the entire admin section (out of scope).
---

## Problem Detail

### 1. Vector Search — Full Table Scan in JavaScript

**File:** `src/lib/server/search/vector.search.ts`

```typescript
// Current: no LIMIT, loads ALL embeddings for repo into memory
const rows = this.db.prepare<unknown[], RawEmbeddingRow>(sql).all(...params);

const scored: VectorSearchResult[] = rows.map((row) => {
  const embedding = new Float32Array(
    row.embedding.buffer,
    row.embedding.byteOffset,
    row.embedding.byteLength / 4
  );
  return { snippetId: row.snippet_id, score: cosineSimilarity(queryEmbedding, embedding) };
});

return scored.sort((a, b) => b.score - a.score).slice(0, limit);
```

For a repo with N snippets and D dimensions, this allocates `N × D × 4` bytes per query. At N=100k and D=1536, that is ~600 MB allocated synchronously. The result is sorted entirely in JS before the top-k is returned. With a native vector index, SQLite returns only the top-k rows.
### 2. Missing Composite Indexes

The `snippets`, `documents`, and `snippet_embeddings` tables are queried with multi-column WHERE predicates in every hot path, but no composite indexes exist:

| Table                | Filter columns                | Used in                                        |
| -------------------- | ----------------------------- | ---------------------------------------------- |
| `snippets`           | `(repository_id, version_id)` | All search, diff, clone                        |
| `snippets`           | `(repository_id, type)`       | Type-filtered queries                          |
| `documents`          | `(repository_id, version_id)` | Diff strategy, clone                           |
| `snippet_embeddings` | `(profile_id, snippet_id)`    | `findSnippetIdsMissingEmbeddings` LEFT JOIN    |
| `repositories`       | `(state)`                     | `searchRepositories` WHERE `state = 'indexed'` |
| `indexing_jobs`      | `(repository_id, status)`     | Job status lookups                             |

Without these indexes, SQLite performs a B-tree scan of the primary key and filters rows in memory. On a 500k-row `snippets` table this is the dominant cost of every search.
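The effect is observable with `EXPLAIN QUERY PLAN` in a `sqlite3` shell against the database file (a sketch — the exact plan wording varies slightly by SQLite version):

```sql
-- Run before and after the index migration:
EXPLAIN QUERY PLAN
SELECT id FROM snippets WHERE repository_id = ? AND version_id = ?;
-- Before: SCAN snippets                     (full B-tree scan, filtered row by row)
-- After:  SEARCH snippets USING INDEX idx_snippets_repo_version (repository_id=? AND version_id=?)
```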
### 3. Under-configured SQLite Connection

**Files:** `src/lib/server/db/client.ts` and `src/lib/server/db/index.ts`

Current pragmas:

```typescript
client.pragma('journal_mode = WAL');
client.pragma('foreign_keys = ON');
client.pragma('busy_timeout = 5000');
```

Missing:

- `synchronous = NORMAL` — halves fsync overhead vs the default FULL; safe with WAL
- `cache_size = -65536` — 64 MB page cache; the default is ~2 MB
- `temp_store = MEMORY` — temp tables and sort spills stay in RAM
- `mmap_size = 268435456` — 256 MB memory-mapped read path; avoids system-call overhead for reads
- `wal_autocheckpoint = 1000` — an explicit checkpoint threshold (the SQLite default) that bounds WAL growth

### 4. Admin UI — Current Problems

**Files:** `src/routes/admin/jobs/+page.svelte`, `src/lib/components/IndexingProgress.svelte`

| Problem                                                        | Location                              | Impact                                                       |
| -------------------------------------------------------------- | ------------------------------------- | ------------------------------------------------------------ |
| `IndexingProgress` polls every 2 s via `setInterval` + `fetch` | `IndexingProgress.svelte`             | Constant HTTP traffic; progress lags by up to 2 s            |
| No status or repository filter controls                        | `admin/jobs/+page.svelte`             | With 200 tag jobs, finding a specific one requires scrolling |
| No worker status panel                                         | — (no endpoint exists)                | Operator cannot see which workers are busy or idle           |
| `alert()` for errors, `confirm()` for cancel                   | `admin/jobs/+page.svelte`             | Blocks the entire browser tab; unusable under parallel jobs  |
| `actionInProgress` is a single string, not per-job             | `admin/jobs/+page.svelte`             | Pausing job A disables buttons on all other jobs             |
| No skeleton loading — blank + spinner on first load            | `admin/jobs/+page.svelte`             | Layout shift; no structural preview while data loads         |
| Hard-coded `limit=50` query, no pagination                     | `admin/jobs/+page.svelte:fetchJobs()` | Page truncates silently for large queues                     |
---

## Architecture

### Drop-In Replacement: `@libsql/better-sqlite3`

`@libsql/better-sqlite3` is published by Turso and implemented as a Node.js native addon wrapping the libSQL embedded engine. The exported class is API-compatible with `better-sqlite3`:

```typescript
// before
import Database from 'better-sqlite3';
const db = new Database('/path/to/file.db');
db.pragma('journal_mode = WAL');
const rows = db.prepare('SELECT ...').all(...params);

// after — identical code
import Database from '@libsql/better-sqlite3';
const db = new Database('/path/to/file.db');
db.pragma('journal_mode = WAL');
const rows = db.prepare('SELECT ...').all(...params);
```

All of the following continue to work unchanged:

- The `drizzle-orm/better-sqlite3` adapter and `migrate` helper
- `drizzle-kit` with `dialect: 'sqlite'`
- Prepared statements, transactions, WAL pragmas, foreign keys
- Worker-thread per-thread connections (`worker-entry.ts`, `embed-worker-entry.ts`)
- All `import type Database from 'better-sqlite3'` type imports (replaced in lock-step)
### Vector Index Design

libSQL provides `libsql_vector_idx()` — a virtual index type stored in a shadow table alongside the main table. Once indexed, KNN queries use the SQL `vector_top_k()` table-valued function:

```sql
-- KNN: return the rowids (`id`) of the top-k embeddings closest to the query vector
SELECT id
FROM vector_top_k('idx_snippet_embeddings_vec', vector_from_float32(?), ?)
```

`vector_from_float32(blob)` accepts the same raw little-endian Float32 bytes currently stored in the `embedding` blob column. **No data migration is needed** — the existing blob bytes can be copied as-is into a typed vector column and indexed with `libsql_vector_idx`.
The index strategy:

1. Add a `vec_embedding` column of type `F32_BLOB(dimensions)` to `snippet_embeddings`, backfilled from the existing `embedding` blob during migration.
2. Create the vector index: `CREATE INDEX idx_snippet_embeddings_vec ON snippet_embeddings(libsql_vector_idx(vec_embedding))`.
3. Rewrite `VectorSearch.vectorSearch()` to use `vector_top_k()` with a two-step join instead of the in-memory loop.
4. Update `EmbeddingService.embedSnippets()` to write `vec_embedding` on insert.

Dimensions are profile-specific. Because the index is per-column, a separate index is needed per embedding dimensionality. For v1, a single index covering the default profile's dimensions is sufficient; multi-profile KNN can be handled with a `WHERE profile_id = ?` filter applied to the `vector_top_k` candidates (over-fetching candidates to compensate).
### Updated Vector Search Query

```typescript
vectorSearch(queryEmbedding: Float32Array, options: VectorSearchOptions): VectorSearchResult[] {
  const { repositoryId, versionId, profileId = 'local-default', limit = 50 } = options;

  // Encode query vector as raw bytes (same format as stored blobs).
  // Slice by offset/length in case the Float32Array is a view into a larger buffer.
  const queryBytes = Buffer.from(queryEmbedding.buffer, queryEmbedding.byteOffset, queryEmbedding.byteLength);

  // Use libSQL vector_top_k for ANN — returns candidate rowids in distance order
  let sql = `
    SELECT se.snippet_id,
           vector_distance_cos(se.vec_embedding, vector_from_float32(?)) AS score
    FROM vector_top_k('idx_snippet_embeddings_vec', vector_from_float32(?), ?) AS knn
    JOIN snippet_embeddings se ON se.rowid = knn.id
    JOIN snippets s ON s.id = se.snippet_id
    WHERE s.repository_id = ?
      AND se.profile_id = ?
  `;
  const params: unknown[] = [queryBytes, queryBytes, limit * 4, repositoryId, profileId];

  if (versionId) {
    sql += ' AND s.version_id = ?';
    params.push(versionId);
  }

  sql += ' ORDER BY score ASC LIMIT ?';
  params.push(limit);

  return this.db
    .prepare<unknown[], { snippet_id: string; score: number }>(sql)
    .all(...params)
    .map((row) => ({ snippetId: row.snippet_id, score: 1 - row.score }));
}
```

`vector_distance_cos` returns a distance (0 = identical), so `1 - distance` gives a similarity score in [0, 1] matching the existing `VectorSearchResult.score` contract.
---

## Implementation Plan

### Phase 1 — Package Swap (no logic changes)

**Files touched:** `package.json`, all `.ts` files that import `better-sqlite3`

1. In `package.json`:
   - Remove `"better-sqlite3": "^12.6.2"` from `dependencies`
   - Add `"@libsql/better-sqlite3": "^0.4.0"` to `dependencies`
   - Remove `"@types/better-sqlite3": "^7.6.13"` from `devDependencies`
     (`@libsql/better-sqlite3` ships its own TypeScript declarations)

2. Replace all `better-sqlite3` import statements across the production and test files listed below:

| Old import                                                      | New import                                           |
| --------------------------------------------------------------- | ---------------------------------------------------- |
| `import Database from 'better-sqlite3'`                          | `import Database from '@libsql/better-sqlite3'`      |
| `import type Database from 'better-sqlite3'`                     | `import type Database from '@libsql/better-sqlite3'` |
| `import { drizzle } from 'drizzle-orm/better-sqlite3'`           | unchanged                                            |
| `import { migrate } from 'drizzle-orm/better-sqlite3/migrator'`  | unchanged                                            |
Affected production files:

- `src/lib/server/db/index.ts`
- `src/lib/server/db/client.ts`
- `src/lib/server/embeddings/embedding.service.ts`
- `src/lib/server/pipeline/indexing.pipeline.ts`
- `src/lib/server/pipeline/job-queue.ts`
- `src/lib/server/pipeline/startup.ts`
- `src/lib/server/pipeline/worker-entry.ts`
- `src/lib/server/pipeline/embed-worker-entry.ts`
- `src/lib/server/pipeline/differential-strategy.ts`
- `src/lib/server/search/vector.search.ts`
- `src/lib/server/search/hybrid.search.service.ts`
- `src/lib/server/search/search.service.ts`
- `src/lib/server/services/repository.service.ts`
- `src/lib/server/services/version.service.ts`
- `src/lib/server/services/embedding-settings.service.ts`

Affected test files (same mechanical replacement):

- `src/routes/api/v1/api-contract.integration.test.ts`
- `src/routes/api/v1/sse-and-settings.integration.test.ts`
- `src/routes/settings/page.server.test.ts`
- `src/lib/server/db/schema.test.ts`
- `src/lib/server/embeddings/embedding.service.test.ts`
- `src/lib/server/pipeline/indexing.pipeline.test.ts`
- `src/lib/server/pipeline/differential-strategy.test.ts`
- `src/lib/server/search/search.service.test.ts`
- `src/lib/server/search/hybrid.search.service.test.ts`
- `src/lib/server/services/repository.service.test.ts`
- `src/lib/server/services/version.service.test.ts`
- `src/routes/api/v1/settings/embedding/server.test.ts`
- `src/routes/api/v1/libs/[id]/index/server.test.ts`
- `src/routes/api/v1/libs/[id]/versions/discover/server.test.ts`
3. Run all tests — they should pass with zero logic changes: `npm test`

### Phase 2 — Pragma Hardening

**Files touched:** `src/lib/server/db/client.ts`, `src/lib/server/db/index.ts`

Add the following pragmas to both connection factories (raw client and `initializeDatabase()`):

```typescript
client.pragma('synchronous = NORMAL');
client.pragma('cache_size = -65536'); // 64 MB
client.pragma('temp_store = MEMORY');
client.pragma('mmap_size = 268435456'); // 256 MB
client.pragma('wal_autocheckpoint = 1000');
```

Worker threads (`worker-entry.ts`, `embed-worker-entry.ts`) open their own connections — apply the same pragmas there.
### Phase 3 — Composite Indexes (Drizzle migration)

**Files touched:** `src/lib/server/db/schema.ts`, new migration SQL file

Add indexes in `schema.ts` using Drizzle's `index()` helper:

```typescript
// snippets table
export const snippets = sqliteTable(
  'snippets',
  {
    /* unchanged */
  },
  (t) => [
    index('idx_snippets_repo_version').on(t.repositoryId, t.versionId),
    index('idx_snippets_repo_type').on(t.repositoryId, t.type)
  ]
);

// documents table
export const documents = sqliteTable(
  'documents',
  {
    /* unchanged */
  },
  (t) => [index('idx_documents_repo_version').on(t.repositoryId, t.versionId)]
);

// snippet_embeddings table
export const snippetEmbeddings = sqliteTable(
  'snippet_embeddings',
  {
    /* unchanged */
  },
  (table) => [
    primaryKey({ columns: [table.snippetId, table.profileId] }), // unchanged
    index('idx_embeddings_profile').on(table.profileId, table.snippetId)
  ]
);

// repositories table
export const repositories = sqliteTable(
  'repositories',
  {
    /* unchanged */
  },
  (t) => [index('idx_repositories_state').on(t.state)]
);

// indexing_jobs table
export const indexingJobs = sqliteTable(
  'indexing_jobs',
  {
    /* unchanged */
  },
  (t) => [index('idx_jobs_repo_status').on(t.repositoryId, t.status)]
);
```

Generate and apply the migration: `npm run db:generate && npm run db:migrate`
### Phase 4 — Vector Column and Index (Drizzle migration)

**Files touched:** `src/lib/server/db/schema.ts`, new migration SQL, `src/lib/server/search/vector.search.ts`, `src/lib/server/embeddings/embedding.service.ts`

#### 4a. Schema: add `vec_embedding` column

Add `vec_embedding` to `snippet_embeddings`. Drizzle has no built-in `F32_BLOB` column type helper, so define one with `customType`:

```typescript
import { customType } from 'drizzle-orm/sqlite-core';

const f32Blob = (name: string, dimensions: number) =>
  customType<{ data: Buffer }>({
    dataType() {
      return `F32_BLOB(${dimensions})`;
    }
  })(name);

export const snippetEmbeddings = sqliteTable(
  'snippet_embeddings',
  {
    snippetId: text('snippet_id')
      .notNull()
      .references(() => snippets.id, { onDelete: 'cascade' }),
    profileId: text('profile_id')
      .notNull()
      .references(() => embeddingProfiles.id, { onDelete: 'cascade' }),
    model: text('model').notNull(),
    dimensions: integer('dimensions').notNull(),
    embedding: blob('embedding').notNull(), // existing blob — kept for backward compat
    vecEmbedding: f32Blob('vec_embedding', 1536), // libSQL vector column (nullable during migration fill)
    createdAt: integer('created_at').notNull()
  },
  (table) => [
    primaryKey({ columns: [table.snippetId, table.profileId] }),
    index('idx_embeddings_profile').on(table.profileId, table.snippetId)
  ]
);
```

Because dimensionality is fixed per model, `F32_BLOB(1536)` covers OpenAI `text-embedding-3-small`; `text-embedding-3-large` defaults to 3072 dimensions and would need its own column and index. A follow-up can parameterize this per profile.
#### 4b. Migration SQL: populate `vec_embedding` from the existing `embedding` blob and create the vector index

The vector index cannot be expressed in Drizzle's schema DSL, so it must be applied via the FTS-style custom SQL file (`src/lib/server/db/fts.sql` or an equivalent `vectors.sql`):

```sql
-- Backfill vec_embedding from existing raw blob data
UPDATE snippet_embeddings
SET vec_embedding = vector_from_float32(embedding)
WHERE vec_embedding IS NULL AND embedding IS NOT NULL;

-- Create the ANN vector index (libSQL extension syntax)
CREATE INDEX IF NOT EXISTS idx_snippet_embeddings_vec
ON snippet_embeddings(
  libsql_vector_idx(vec_embedding, 'metric=cosine', 'compress_neighbors=float8', 'max_neighbors=20')
);
```

Add a call to this SQL in `initializeDatabase()` alongside the existing `fts.sql` execution:

```typescript
const vectorSql = readFileSync(join(__dirname, 'vectors.sql'), 'utf-8');
client.exec(vectorSql);
```
#### 4c. Update `EmbeddingService.embedSnippets()`

When inserting a new embedding, write both the blob and the vec column:

```typescript
const insert = this.db.prepare<[string, string, string, number, Buffer, Buffer]>(`
  INSERT OR REPLACE INTO snippet_embeddings
    (snippet_id, profile_id, model, dimensions, embedding, vec_embedding, created_at)
  VALUES (?, ?, ?, ?, ?, vector_from_float32(?), unixepoch())
`);

// inside the transaction:
insert.run(
  snippet.id,
  this.profileId,
  embedding.model,
  embedding.dimensions,
  embeddingBuffer,
  embeddingBuffer // same bytes — vector_from_float32() interprets them
);
```
#### 4d. Rewrite `VectorSearch.vectorSearch()`

Replace the full-scan JS loop with `vector_top_k()`:

```typescript
vectorSearch(queryEmbedding: Float32Array, options: VectorSearchOptions): VectorSearchResult[] {
  const { repositoryId, versionId, profileId = 'local-default', limit = 50 } = options;

  // Slice by offset/length in case the Float32Array is a view into a larger buffer
  const queryBytes = Buffer.from(queryEmbedding.buffer, queryEmbedding.byteOffset, queryEmbedding.byteLength);
  const candidatePool = limit * 4; // over-fetch so post-filters still leave `limit` rows

  let sql = `
    SELECT se.snippet_id,
           vector_distance_cos(se.vec_embedding, vector_from_float32(?)) AS distance
    FROM vector_top_k('idx_snippet_embeddings_vec', vector_from_float32(?), ?) AS knn
    JOIN snippet_embeddings se ON se.rowid = knn.id
    JOIN snippets s ON s.id = se.snippet_id
    WHERE s.repository_id = ?
      AND se.profile_id = ?
  `;
  const params: unknown[] = [queryBytes, queryBytes, candidatePool, repositoryId, profileId];

  if (versionId) {
    sql += ' AND s.version_id = ?';
    params.push(versionId);
  }

  sql += ' ORDER BY distance ASC LIMIT ?';
  params.push(limit);

  return this.db
    .prepare<unknown[], { snippet_id: string; distance: number }>(sql)
    .all(...params)
    .map((row) => ({ snippetId: row.snippet_id, score: 1 - row.distance }));
}
```

The `score` contract is preserved (1 = identical, 0 = orthogonal). The `cosineSimilarity` helper is no longer called at runtime but can be kept for unit tests.
### Phase 5 — Per-Job Serialization Key Fix

**Files touched:** `src/lib/server/pipeline/worker-pool.ts`

The current serialization guard uses a bare `repositoryId`:

```typescript
// current
private runningRepoIds = new Set<string>();
// blocks any job whose repositoryId is already in the set
const jobIdx = this.jobQueue.findIndex((j) => !this.runningRepoIds.has(j.repositoryId));
```

Different tags of the same repository write to completely disjoint rows (`version_id`-partitioned documents, snippets, and embeddings). The only genuine conflict is two jobs for the same `(repositoryId, versionId)` pair, which `JobQueue.enqueue()` already prevents via the `status IN ('queued', 'running')` deduplication check.

Change the guard to key on the compound pair:

```typescript
// replace the repo-keyed Set with one keyed on the compound pair
private runningJobKeys = new Set<string>();

private jobKey(repositoryId: string, versionId?: string | null): string {
  return `${repositoryId}|${versionId ?? ''}`;
}
```

Update all four sites that read/write `runningRepoIds`:

| Location                             | Old                                                   | New                                                                                      |
| ------------------------------------ | ----------------------------------------------------- | ---------------------------------------------------------------------------------------- |
| `dispatch()` find                    | `!this.runningRepoIds.has(j.repositoryId)`            | `!this.runningJobKeys.has(this.jobKey(j.repositoryId, j.versionId))`                     |
| `dispatch()` add                     | `this.runningRepoIds.add(job.repositoryId)`           | `this.runningJobKeys.add(this.jobKey(job.repositoryId, job.versionId))`                  |
| `onWorkerMessage` done/failed delete | `this.runningRepoIds.delete(runningJob.repositoryId)` | `this.runningJobKeys.delete(this.jobKey(runningJob.repositoryId, runningJob.versionId))` |
| `onWorkerExit` delete                | same                                                  | same                                                                                     |

The `QueuedJob` and `RunningJob` interfaces already carry `versionId` — no type changes needed.

The only case that remains serialized is `versionId = null` (a default-branch re-index): two such jobs for the same repository map to the same stable key (`"repositoryId|"`), so they are correctly deduplicated.
---

### Phase 6 — Dedicated Write Worker (Single-Writer Pattern)

**Files touched:** `src/lib/server/pipeline/worker-types.ts`, `src/lib/server/pipeline/write-worker-entry.ts` (new), `src/lib/server/pipeline/worker-entry.ts`, `src/lib/server/pipeline/worker-pool.ts`

#### Motivation

With Phase 5 in place, N tags of the same library can index in parallel. Each parse worker currently opens its own DB connection and holds the write lock while storing parsed snippets. Under N concurrent writers, each worker spends the majority of its wall-clock time waiting in `busy_timeout` back-off. The fix is the single-writer pattern: one dedicated write worker owns the only writable DB connection; parse workers become stateless CPU workers that send write batches over `postMessage`.

```
Parse Worker 1 ──┐   WriteRequest (docs[], snippets[])    ┌── WriteAck
Parse Worker 2 ──┼──────────────────────────────────────► Write Worker (sole DB writer)
Parse Worker N ──┘                                        └── single `@libsql/better-sqlite3` connection
```

#### New message types (`worker-types.ts`)
```typescript
export interface WriteRequest {
  type: 'write';
  jobId: string;
  documents: SerializedDocument[];
  snippets: SerializedSnippet[];
}

export interface WriteAck {
  type: 'write_ack';
  jobId: string;
  documentCount: number;
  snippetCount: number;
}

export interface WriteError {
  type: 'write_error';
  jobId: string;
  error: string;
}

// SerializedDocument / SerializedSnippet mirror the DB column shapes
// (plain objects, safe to transfer via structured clone)
```
#### Write worker (`write-worker-entry.ts`)

The write worker:

- Opens its own `Database` connection (WAL mode, all pragmas from Phase 2)
- Listens for `WriteRequest` messages
- Wraps each batch in a single transaction
- Posts `WriteAck` or `WriteError` back to the parent, which forwards the ack to the originating parse worker by `jobId`

```typescript
import Database from '@libsql/better-sqlite3';
import { workerData, parentPort } from 'node:worker_threads';
import type { WorkerInitData, WriteRequest, WriteAck, WriteError } from './worker-types.js';

const db = new Database((workerData as WorkerInitData).dbPath);
db.pragma('journal_mode = WAL');
db.pragma('synchronous = NORMAL');
db.pragma('cache_size = -65536');
db.pragma('foreign_keys = ON');

const insertDoc = db.prepare(`INSERT OR REPLACE INTO documents (...) VALUES (...)`);
const insertSnippet = db.prepare(`INSERT OR REPLACE INTO snippets (...) VALUES (...)`);

const writeBatch = db.transaction((req: WriteRequest) => {
  for (const doc of req.documents) insertDoc.run(doc);
  for (const snip of req.snippets) insertSnippet.run(snip);
});

parentPort!.on('message', (req: WriteRequest) => {
  try {
    writeBatch(req);
    const ack: WriteAck = {
      type: 'write_ack',
      jobId: req.jobId,
      documentCount: req.documents.length,
      snippetCount: req.snippets.length
    };
    parentPort!.postMessage(ack);
  } catch (err) {
    const fail: WriteError = { type: 'write_error', jobId: req.jobId, error: String(err) };
    parentPort!.postMessage(fail);
  }
});
```
#### Parse worker changes (`worker-entry.ts`)

Parse workers lose their DB connection. `IndexingPipeline` receives a `sendWrite` callback instead of a `db` instance. After parsing each file batch, the worker calls `sendWrite({ type: 'write', jobId, documents, snippets })` and awaits the `WriteAck` before continuing. This preserves back-pressure: a slow write worker naturally throttles the parse workers without additional semaphores.

#### WorkerPool changes

- Spawn one write worker at startup (always, regardless of embedding config)
- Route incoming `write_ack` / `write_error` messages to the correct waiting parse worker via a `Map<jobId, resolve>` promise registry
- The write worker is separate from the embed worker — embed writes (`snippet_embeddings`) can either go through the write worker via a new `EmbedWriteRequest` message type, or remain in the embed worker, since embedding runs after parsing completes (no lock contention with active parse jobs)
#### Conflict analysis with Phase 5

Phases 5 and 6 compose cleanly:

- Phase 5 allows multiple `(repo, versionId)` jobs to run concurrently
- Phase 6 ensures all those concurrent jobs share a single write path — contention is eliminated by design
- The write worker is stateless with respect to job identity; it simply executes batches in arrival order (Node.js `postMessage` delivery is FIFO)
- The embed worker remains separate (it runs after parse completes, so it never overlaps with active parse writes for the same job)
---

### Phase 7 — Admin UI Overhaul

**Files touched:**

- `src/routes/admin/jobs/+page.svelte` — rebuilt
- `src/routes/api/v1/workers/+server.ts` — new endpoint
- `src/lib/components/admin/JobStatusBadge.svelte` — extend with spinner variant
- `src/lib/components/admin/JobSkeleton.svelte` — new
- `src/lib/components/admin/WorkerStatusPanel.svelte` — new
- `src/lib/components/admin/Toast.svelte` — new
- `src/lib/components/IndexingProgress.svelte` — switch to SSE

#### 7a. New API endpoint: `GET /api/v1/workers`

The `WorkerPool` singleton tracks running jobs in `runningJobs: Map<Worker, RunningJob>` and idle workers in `idleWorkers: Worker[]`. Expose this state as a lightweight REST snapshot:

```typescript
// GET /api/v1/workers
// Response shape:
interface WorkersResponse {
  concurrency: number;     // configured max workers
  active: number;          // workers with a running job
  idle: number;            // workers waiting for work
  workers: WorkerStatus[]; // one entry per spawned parse worker
}

interface WorkerStatus {
  index: number;             // worker slot (0-based)
  state: 'idle' | 'running'; // current state
  jobId: string | null;      // null when idle
  repositoryId: string | null;
  versionId: string | null;
}
```

The route handler calls `getPool().getStatus()` — add a `getStatus(): WorkersResponse` method to `WorkerPool` that reads `runningJobs` and `idleWorkers` without any DB call. This is read-only and runs on the main thread.
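
A hedged sketch of how `getStatus()` could derive the snapshot from the pool's existing fields. The free function and the generic worker-handle parameter are illustrative stand-ins for the real `WorkerPool` method and `worker_threads` handles:

```typescript
interface RunningJob { jobId: string; repositoryId: string; versionId: string; }

interface WorkerStatus {
  index: number;
  state: 'idle' | 'running';
  jobId: string | null;
  repositoryId: string | null;
  versionId: string | null;
}

// Pure snapshot over the pool's bookkeeping: no DB access, main-thread only.
function getStatus<W>(workers: W[], runningJobs: Map<W, RunningJob>, concurrency: number) {
  const statuses: WorkerStatus[] = workers.map((w, index) => {
    const job = runningJobs.get(w);
    return job
      ? { index, state: 'running', jobId: job.jobId, repositoryId: job.repositoryId, versionId: job.versionId }
      : { index, state: 'idle', jobId: null, repositoryId: null, versionId: null };
  });
  return {
    concurrency,
    active: statuses.filter((s) => s.state === 'running').length,
    idle: statuses.filter((s) => s.state === 'idle').length,
    workers: statuses
  };
}
```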

The SSE stream at `/api/v1/jobs/stream` should emit a new `worker-status` event type whenever a worker transitions idle ↔ running (on `dispatch()` and job completion). This allows the worker panel to update in real time without polling the REST endpoint.

#### 7b. `GET /api/v1/jobs` — add `repositoryId` free-text and multi-status filter

The existing endpoint already accepts `repositoryId` (exact match) and `status` (single value). Extend:

- `repositoryId` to also support prefix match (e.g. `?repositoryId=/facebook` returns all `/facebook/*` repos)
- `status` to accept comma-separated values: `?status=queued,running`
- `page` and `pageSize` query params (default pageSize=50, max 200) in addition to `limit` for backwards compatibility

Return `{ jobs, total, page, pageSize }` with `total` always reflecting the unfiltered-by-page count.
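
One plausible way to parse the extended parameters — the helper name and exact clamping behavior are assumptions; server-side, the prefix would be matched with `LIKE 'prefix%'`:

```typescript
const VALID_STATUSES = new Set(['queued', 'running', 'paused', 'cancelled', 'done', 'failed']);

function parseJobsQuery(params: URLSearchParams) {
  // Comma-separated statuses; unknown values are silently dropped.
  const statuses = (params.get('status') ?? '')
    .split(',')
    .map((s) => s.trim())
    .filter((s) => VALID_STATUSES.has(s));
  // pageSize defaults to 50, is capped at 200, and falls back to the legacy ?limit.
  const pageSize = Math.min(Number(params.get('pageSize') ?? params.get('limit') ?? 50) || 50, 200);
  const page = Math.max(Number(params.get('page') ?? 1) || 1, 1);
  return {
    repositoryPrefix: params.get('repositoryId') ?? null, // matched with LIKE 'prefix%'
    statuses,                                             // empty array = no status filter
    page,
    pageSize,
    offset: (page - 1) * pageSize
  };
}
```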

#### 7c. New component: `JobSkeleton.svelte`

A set of skeleton rows matching the job table structure. Shown during the initial fetch before any data arrives. Uses Tailwind `animate-pulse`:

```svelte
<!-- renders N skeleton rows -->
<script lang="ts">
  let { rows = 5 }: { rows?: number } = $props();
</script>

{#each Array(rows) as _, i (i)}
  <tr>
    <td class="px-6 py-4">
      <div class="h-4 w-48 animate-pulse rounded bg-gray-200"></div>
      <div class="mt-1 h-3 w-24 animate-pulse rounded bg-gray-100"></div>
    </td>
    <td class="px-6 py-4">
      <div class="h-5 w-16 animate-pulse rounded-full bg-gray-200"></div>
    </td>
    <td class="px-6 py-4">
      <div class="h-4 w-20 animate-pulse rounded bg-gray-200"></div>
    </td>
    <td class="px-6 py-4">
      <div class="h-2 w-32 animate-pulse rounded-full bg-gray-200"></div>
    </td>
    <td class="px-6 py-4">
      <div class="h-4 w-28 animate-pulse rounded bg-gray-200"></div>
    </td>
    <td class="px-6 py-4 text-right">
      <div class="ml-auto h-7 w-20 animate-pulse rounded bg-gray-200"></div>
    </td>
  </tr>
{/each}
```

#### 7d. New component: `Toast.svelte`

Replaces all `alert()` / `console.log()` calls in the jobs page. Renders a fixed-position stack in the bottom-right corner. Each toast auto-dismisses after 4 seconds and can be manually closed:

```svelte
<!-- Usage: bind a toasts array and call push({ message, type }) -->
<script lang="ts">
  export interface ToastItem {
    id: string;
    message: string;
    type: 'success' | 'error' | 'info';
  }

  let { toasts = $bindable([]) }: { toasts: ToastItem[] } = $props();

  function dismiss(id: string) {
    toasts = toasts.filter((t) => t.id !== id);
  }
</script>

<div class="fixed right-4 bottom-4 z-50 flex flex-col gap-2">
  {#each toasts as toast (toast.id)}
    <!-- color by type, close button, auto-dismiss via onMount timer -->
  {/each}
</div>
```

The jobs page replaces `showToast()` with pushing onto the bound `toasts` array. The `confirm()` for cancel is replaced with an inline confirmation state per job (`pendingCancelId`) that shows "Confirm cancel?" / "Yes" / "No" buttons inside the row.

#### 7e. New component: `WorkerStatusPanel.svelte`

A compact panel displayed above the job table showing the worker pool health. Subscribes to the `worker-status` SSE events and falls back to polling `GET /api/v1/workers` every 5 s on SSE error:

```
┌───────────────────────────────────────────────────────┐
│ Workers   [2 / 4 active]   ████░░░░ 50%               │
│ Worker 0  ● running   /facebook/react / v18.3.0       │
│ Worker 1  ● running   /facebook/react / v17.0.2       │
│ Worker 2  ○ idle                                      │
│ Worker 3  ○ idle                                      │
└───────────────────────────────────────────────────────┘
```

Each worker row shows: slot index, status dot (animated green pulse for running), repository ID, version tag, and a link to the job row in the table below.
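
The subscribe-with-fallback logic can be kept out of the component for testability. This sketch injects the stream, the poll scheduler, and the snapshot fetch; all names and the injection style are assumptions, not the component's real API:

```typescript
interface SseLike {
  addEventListener(type: string, h: (e: { data: string }) => void): void;
  onerror: (() => void) | null;
  close(): void;
}

// Prefer worker-status SSE events; on stream error, start polling the REST
// snapshot (the real component would pass setInterval-based poll at 5000 ms).
function subscribeWorkerStatus(
  es: SseLike,
  poll: (cb: () => void, ms: number) => () => void,
  fetchSnapshot: () => Promise<unknown>,
  onUpdate: (s: unknown) => void
): () => void {
  let stopPoll: (() => void) | null = null;
  es.addEventListener('worker-status', (e) => onUpdate(JSON.parse(e.data)));
  es.onerror = () => {
    if (stopPoll) return; // already in fallback mode
    stopPoll = poll(() => { void fetchSnapshot().then(onUpdate); }, 5000);
  };
  return () => { es.close(); stopPoll?.(); };
}
```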

#### 7f. Filter bar on the jobs page

Add a filter strip between the page header and the table:

```
[ Repository: _______________ ] [ Status: ▾ all ] [ 🔍 Apply ] [ ↺ Reset ]
```

- **Repository field**: free-text input, matches `repositoryId` prefix (e.g. `/facebook` shows all `/facebook/*`)
- **Status dropdown**: multi-select checkboxes for `queued`, `running`, `paused`, `cancelled`, `done`, `failed`; default = all
- Filters are applied client-side against the loaded `jobs` array for instant feedback, and also re-fetched from the API on Apply to get the correct total count
- Filter state is mirrored to URL search params (`?repo=...&status=...`) so the view is bookmarkable and survives refresh
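
The URL mirroring reduces to a small round-trip between filter state and search params; the helper names are assumptions for illustration:

```typescript
interface FilterState { repo: string; statuses: string[]; }

// Serialize filter state into ?repo=…&status=… (omitting empty filters).
function filtersToParams(f: FilterState): string {
  const p = new URLSearchParams();
  if (f.repo) p.set('repo', f.repo);
  if (f.statuses.length) p.set('status', f.statuses.join(','));
  return p.toString();
}

// Restore filter state on page load / refresh from location.search.
function paramsToFilters(search: string): FilterState {
  const p = new URLSearchParams(search);
  return {
    repo: p.get('repo') ?? '',
    statuses: (p.get('status') ?? '').split(',').filter(Boolean)
  };
}
```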

#### 7g. Per-job action spinner and disabled state

Replace the single `actionInProgress: string | null` with a per-job map. Use `SvelteMap` from `svelte/reactivity` rather than a plain `Map` inside `$state` — `$state` does not proxy `Map`, so `set()` / `delete()` on a plain map would not trigger re-renders:

```typescript
import { SvelteMap } from 'svelte/reactivity';

const actionInProgress = new SvelteMap<string, 'pausing' | 'resuming' | 'cancelling'>();
```

Each action button shows an inline spinner (small `animate-spin` circle) and is disabled only for that row. Other rows remain fully interactive during the action. On completion the entry is deleted from the map.
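
A sketch of the per-row action wrapper this implies — the function name is an assumption; the point is that only the acted-on row is ever marked busy, and the entry is cleared even on failure:

```typescript
type Action = 'pausing' | 'resuming' | 'cancelling';

// Marks one job busy while its request is in flight; other rows are untouched.
async function runJobAction(
  inProgress: Map<string, Action>,
  jobId: string,
  action: Action,
  request: () => Promise<void>
) {
  if (inProgress.has(jobId)) return; // one action per row at a time
  inProgress.set(jobId, action);
  try {
    await request();
  } finally {
    inProgress.delete(jobId); // clear even if the request threw
  }
}
```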

#### 7h. `IndexingProgress.svelte` — switch from polling to SSE

The component currently uses `setInterval + fetch` at 2 s. Replace with the per-job SSE stream already available at `/api/v1/jobs/{id}/stream`:

```typescript
// replace the $effect body
$effect(() => {
  job = null;
  const es = new EventSource(`/api/v1/jobs/${jobId}/stream`);

  es.addEventListener('job-progress', (event) => {
    const data = JSON.parse(event.data);
    job = { ...job, ...data };
  });

  es.addEventListener('job-done', () => {
    void fetch(`/api/v1/jobs/${jobId}`)
      .then((r) => r.json())
      .then((d) => {
        job = d.job;
        oncomplete?.();
      });
    es.close();
  });

  es.addEventListener('job-failed', (event) => {
    const data = JSON.parse(event.data);
    job = { ...job, status: 'failed', error: data.error };
    oncomplete?.();
    es.close();
  });

  es.onerror = () => {
    // on SSE failure fall back to a single fetch to get current state
    es.close();
    void fetch(`/api/v1/jobs/${jobId}`)
      .then((r) => r.json())
      .then((d) => {
        job = d.job;
      });
  };

  return () => es.close();
});
```

This cuts network traffic during active indexing from one request every 2 s to a single long-lived SSE connection — updates arrive as server-push events.

#### 7i. Pagination on the jobs page

Replace the hard-coded `?limit=50` fetch with paginated requests:

```typescript
let currentPage = $state(1);
const PAGE_SIZE = 50;

async function fetchJobs() {
  const params = new URLSearchParams({
    page: String(currentPage),
    pageSize: String(PAGE_SIZE),
    ...(filterRepo ? { repositoryId: filterRepo } : {}),
    ...(filterStatuses.length ? { status: filterStatuses.join(',') } : {})
  });
  const data = await fetch(`/api/v1/jobs?${params}`).then((r) => r.json());
  jobs = data.jobs;
  total = data.total;
}
```

Render a simple `« Prev   Page N of M   Next »` control below the table, hidden when `total <= PAGE_SIZE`.
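
The paging math behind that control is small enough to isolate; this helper is a sketch (the name and return shape are assumptions):

```typescript
// Derives the « Prev / Next » control state from total count and current page.
function pageInfo(total: number, page: number, pageSize: number) {
  const totalPages = Math.max(1, Math.ceil(total / pageSize));
  return {
    totalPages,
    show: total > pageSize, // hidden when a single page suffices
    hasPrev: page > 1,
    hasNext: page < totalPages
  };
}
```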

---

## Acceptance Criteria

- [ ] `npm install` with `@libsql/better-sqlite3` succeeds; `better-sqlite3` is absent from `node_modules`
- [ ] All existing unit and integration tests pass after the Phase 1 import swap
- [ ] `npm run db:migrate` applies the composite index migration cleanly against an existing database
- [ ] `npm run db:migrate` applies the vector column migration cleanly; `SELECT vec_embedding FROM snippet_embeddings LIMIT 1` returns a non-NULL value for any previously-embedded snippet
- [ ] `GET /api/v1/context?libraryId=...&query=...` with a semantic-mode or hybrid-mode request returns results in ≤ 200 ms on a repository with 50k+ snippets (vs the previous multi-second response)
- [ ] Memory profiling during a `/context` request shows no allocation spike proportional to repository size
- [ ] `EXPLAIN QUERY PLAN` on the `snippets` search query shows `SEARCH snippets USING INDEX idx_snippets_repo_version` instead of `SCAN snippets`
- [ ] Worker threads (`worker-entry.ts`, `embed-worker-entry.ts`) start and complete an indexing job successfully after the package swap
- [ ] `drizzle-kit studio` connects and browses the migrated database
- [ ] Re-indexing a repository after the migration correctly populates `vec_embedding` on all new snippets
- [ ] `cosineSimilarity` unit tests still pass (the function is kept)
- [ ] Starting two indexing jobs for different tags of the same repository simultaneously results in both jobs reaching `running` state concurrently (not one waiting for the other)
- [ ] Starting two indexing jobs for the **same** `(repositoryId, versionId)` pair returns the existing job (deduplication unchanged)
- [ ] With 4 parse workers and 4 concurrent tag jobs, zero `SQLITE_BUSY` errors appear in logs
- [ ] The write worker is present during active indexing (the `worker_threads` inspector shows `write-worker-entry`)
- [ ] A `WriteError` from the write worker marks the originating job as `failed` with the error message propagated to the SSE stream
- [ ] `GET /api/v1/workers` returns a `WorkersResponse` JSON object with correct `active`, `idle`, and `workers[]` fields while jobs are in flight
- [ ] The `worker-status` SSE event is emitted by `/api/v1/jobs/stream` whenever a worker transitions state
- [ ] The admin jobs page shows skeleton rows (not a blank screen) during the initial `fetchJobs()` call
- [ ] No `alert()` or `confirm()` calls remain in `admin/jobs/+page.svelte` after this change; all notifications go through `Toast.svelte`
- [ ] Pausing job A while job B is also in progress does not disable job B's action buttons
- [ ] The status filter multi-select correctly restricts the visible job list; the URL updates to reflect the filter state
- [ ] The repository prefix filter `?repositoryId=/facebook` returns all jobs whose `repositoryId` starts with `/facebook`
- [ ] Paginating past page 1 fetches the next batch from the API, not from the client-side array
- [ ] `IndexingProgress.svelte` has no `setInterval` call; it uses `EventSource` for progress updates
- [ ] The `WorkerStatusPanel` shows the correct number of running workers live during a multi-tag indexing run
- [ ] Refreshing the jobs page with `?repo=/facebook/react&status=running` pre-populates the filters and fetches with those params

---

## Migration Safety

### Backward Compatibility

The `embedding` blob column is kept. The `vec_embedding` column is nullable during the backfill window and is populated in two ways:

1. The `UPDATE` in `vectors.sql` fills all existing rows on startup
2. New embeddings populate it at insert time

If `vec_embedding IS NULL` for a row (e.g., a row inserted before the migration runs), the vector index silently omits that row from results. The fallback in `HybridSearchService` to FTS-only mode still applies when no embeddings exist, so degraded-but-correct behavior is preserved.

### Rollback

Rollback before Phase 4 (vector column): remove `@libsql/better-sqlite3`, restore `better-sqlite3`, restore imports. No schema changes have been made.

Rollback after Phase 4: the schema now has a `vec_embedding` column. Drop the column with a migration reversal and restore imports. The `embedding` blob is intact throughout — no data loss.

### SQLite File Compatibility

libSQL embedded mode reads and writes standard SQLite 3 files. The WAL file, page size, and encoding are unchanged. An existing production database opened with `@libsql/better-sqlite3` is fully readable and writable. The vector index is stored in a shadow table `idx_snippet_embeddings_vec_shadow`, which better-sqlite3 would ignore if rolled back (it is a regular table with a special name).

---

## Dependencies

| Package                  | Action                        | Reason                                          |
| ------------------------ | ----------------------------- | ----------------------------------------------- |
| `better-sqlite3`         | Remove from `dependencies`    | Replaced                                        |
| `@types/better-sqlite3`  | Remove from `devDependencies` | `@libsql/better-sqlite3` ships its own types    |
| `@libsql/better-sqlite3` | Add to `dependencies`         | Drop-in libSQL node addon                       |
| `drizzle-orm`            | No change                     | `better-sqlite3` adapter works unchanged        |
| `drizzle-kit`            | No change                     | `dialect: 'sqlite'` correct for embedded libSQL |

No new runtime dependencies beyond the package replacement.

---

## Testing Strategy

### Unit Tests

- `src/lib/server/search/vector.search.ts`: add a test asserting KNN results are correct for a seeded 3-vector table; verify memory is not proportional to table size (mock `db.prepare` to assert no unbounded `.all()` is called)
- `src/lib/server/embeddings/embedding.service.ts`: existing tests cover insert round-trips; verify the `vec_embedding` column is non-NULL after `embedSnippets()`

### Integration Tests

- `api-contract.integration.test.ts`: existing tests already use `new Database(':memory:')` — these continue to work with `@libsql/better-sqlite3` because the in-memory path is identical
- Add one test to `api-contract.integration.test.ts`: seed a repository + multiple embeddings, call `/api/v1/context` in semantic mode, assert non-empty results and a response time < 500 ms on the in-memory DB

### UI Tests

- `src/routes/admin/jobs/+page.svelte`: add Vitest browser tests (Playwright) verifying:
  - Skeleton rows appear before the first fetch resolves (mock `fetch` to delay 200 ms)
  - Status filter restricts displayed rows; URL param updates
  - Pausing job A leaves job B's buttons enabled
  - Toast appears and auto-dismisses on successful pause
  - Cancel confirm flow shows inline confirmation, not `window.confirm`
- `src/lib/components/IndexingProgress.svelte`: unit test that no `setInterval` is created; verify `EventSource` is opened with the correct URL

### Performance Regression Gate

Add a benchmark script `scripts/bench-vector-search.mjs` that:

1. Creates an in-memory libSQL database
2. Seeds 10000 snippet embeddings (random Float32Array, 1536 dims)
3. Runs 100 `vectorSearch()` calls
4. Asserts p99 < 50 ms

This gates the CI check on Phase 4 correctness and speed.
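
The p99 assertion in step 4 reduces to a nearest-rank percentile over the 100 recorded latencies. A TypeScript sketch of that helper (the DB setup and `vectorSearch()` timing loop are omitted; the function name is an assumption):

```typescript
// Nearest-rank percentile: for p99 over 100 samples this picks the 99th
// smallest latency, which the bench script compares against the 50 ms budget.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.min(rank, sorted.length) - 1];
}
```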