TRUEREF-0023 rewrite indexing pipeline - parallel reads - serialized writes
@@ -335,3 +335,47 @@ Add subsequent research below this section.

- Risks / follow-ups:
  - Iteration 2 task decomposition must treat the current dirty code files from iterations 0 and 1 as the validation baseline; otherwise the executor will keep rediscovering pre-existing worktree drift instead of new task deltas.
  - The sqlite-vec bootstrap helper and the relational cleanup should be planned as one acceptance unit before any downstream vec0, worker-status, or admin-page tasks, because that is the smallest unit that removes the known broken intermediate state.

### 2026-04-01T00:00:00.000Z — TRUEREF-0023 iteration 3 navbar follow-up planning research

- Task: Plan the accepted follow-up request to add an admin route to the main navbar.
- Files inspected:
  - `prompts/TRUEREF-0023/progress.yaml`
  - `prompts/TRUEREF-0023/iteration_2/review_report.yaml`
  - `prompts/TRUEREF-0023/prompt.yaml`
  - `package.json`
  - `src/routes/+layout.svelte`
  - `src/routes/admin/jobs/+page.svelte`
- Findings:
  - The accepted iteration-2 workspace is green: `review_report.yaml` records a passing build, passing tests, and no workspace diagnostics, so this request is a narrow additive follow-up rather than a rework of the sqlite-vec/admin jobs implementation.
  - The main navbar is defined entirely in `src/routes/+layout.svelte` and already uses base-aware SvelteKit navigation via `resolve as resolveRoute` from `$app/paths` for the existing `Repositories`, `Search`, and `Settings` links.
  - The existing admin surface already lives at `src/routes/admin/jobs/+page.svelte`, which sets the page title to `Job Queue - TrueRef Admin`; a navbar entry can therefore target `/admin/jobs` directly without introducing new routes, loaders, or components.
  - Repository findings from the earlier lint planning work already confirm that the codebase avoids root-relative internal navigation in SvelteKit pages and components, so the new navbar link should follow the existing `resolveRoute('/...')` anchor pattern.
  - No dedicated test file currently covers the shared navbar. The appropriate validation for this follow-up remains repository-level `npm run build` and `npm test` after the single layout edit.
- Risks / follow-ups:
  - The follow-up navigation request should stay isolated to the shared layout so it does not reopen the accepted sqlite-vec implementation surface.
  - Build and test validation remain the appropriate regression checks because no dedicated navbar test currently exists.

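A minimal sketch of the planned link, following the existing `resolveRoute` anchor pattern already present in `src/routes/+layout.svelte` (the layout already imports `resolve as resolveRoute` from `$app/paths` per the findings above). The label and position are assumptions:

```svelte
<!-- Hypothetical navbar entry; label and placement are illustrative -->
<a href={resolveRoute('/admin/jobs')}>Admin</a>
```
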
### 2026-04-01T12:05:23.000Z — TRUEREF-0023 iteration 5 tabs filter and bulk reprocess planning research

- Task: Plan the follow-up repo-detail UI change to filter version rows in the tabs/tags view and add a bulk action that reprocesses all errored tags without adding a new backend endpoint.
- Files inspected:
  - `prompts/TRUEREF-0023/progress.yaml`
  - `prompts/TRUEREF-0023/prompt.yaml`
  - `prompts/TRUEREF-0023/iteration_2/plan.md`
  - `prompts/TRUEREF-0023/iteration_2/tasks.yaml`
  - `src/routes/repos/[id]/+page.svelte`
  - `src/routes/api/v1/libs/[id]/versions/[tag]/index/+server.ts`
  - `src/routes/api/v1/api-contract.integration.test.ts`
  - `package.json`
- Findings:
  - The relevant UI surface is entirely in `src/routes/repos/[id]/+page.svelte`; the page already loads `versions`, renders per-version state badges, and exposes per-tag `Index` and `Remove` buttons.
  - Version states are concretely `pending`, `indexing`, `indexed`, and `error`, and the page already centralizes their labels and color classes in `stateLabels` and `stateColors`.
  - Existing per-tag reprocessing is implemented by `handleIndexVersion(tag)`, which POSTs to `/api/v1/libs/:id/versions/:tag/index`; the corresponding backend route exists and returns a queued job DTO with status `202`.
  - No bulk reprocess endpoint exists, so the lowest-risk implementation is a UI-only bulk action that iterates the existing per-tag route.
  - The page already contains a bounded batching pattern in `handleRegisterSelected()` with `BATCH_SIZE = 5`, which provides a concrete local precedent for bulk tag operations without inventing a new concurrency model.
  - There is no existing page-component or browser test targeting `src/routes/repos/[id]/+page.svelte`; nearby automated coverage is API-contract focused, so this iteration should rely on `npm run build` and `npm test` regression validation unless a developer discovers an existing Svelte page harness during implementation.
  - Context7 lookup for Svelte and SvelteKit could not be completed in this environment because the configured API key is invalid; planning therefore relied on the installed versions from `package.json` (`svelte` `^5.51.0`, `@sveltejs/kit` `^2.50.2`) and the live page patterns already present in the repository.
- Risks / follow-ups:
  - Bulk reprocessing must avoid queuing duplicate jobs for tags already shown as `indexing` or already tracked in `activeVersionJobs`.
  - Filter state should be implemented as local UI state only and must not disturb the existing `onMount(loadVersions)` fetch path or the SSE job-progress flow.

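The bounded-batch precedent noted in the findings can be sketched as a generic helper. `BATCH_SIZE` mirrors the page's existing constant; `runInBatches`, `reprocessErroredTags`, and their signatures are hypothetical names for illustration, not the page's actual code:

```typescript
const BATCH_SIZE = 5; // mirrors the existing handleRegisterSelected() precedent

// Run an async action over items in bounded batches so that at most
// BATCH_SIZE requests are in flight at once.
async function runInBatches<T>(items: T[], action: (item: T) => Promise<void>): Promise<void> {
  for (let i = 0; i < items.length; i += BATCH_SIZE) {
    await Promise.all(items.slice(i, i + BATCH_SIZE).map(action));
  }
}

// Hypothetical bulk action: reprocess every errored tag through the existing
// per-tag index call, returning the tags that were queued.
async function reprocessErroredTags(
  versions: { tag: string; state: string }[],
  indexVersion: (tag: string) => Promise<void>
): Promise<string[]> {
  const errored = versions.filter((v) => v.state === 'error').map((v) => v.tag);
  await runInBatches(errored, indexVersion);
  return errored;
}
```

In the real page the action would be the existing `handleIndexVersion(tag)`, with tags already shown as `indexing` filtered out first, per the risk note above.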
@@ -47,8 +47,8 @@ Executed in `IndexingPipeline.run()` before the crawl, when the job has a `versi

containing shell metacharacters).

3. **Path partitioning**: The changed-file list is split into `changedPaths` (added + modified + renamed-destination) and `deletedPaths`. `unchangedPaths` is derived as `ancestorFilePaths − changedPaths − deletedPaths`.

4. **Guard**: Returns `null` when no indexed ancestor exists, when the ancestor has no indexed documents, or when all files changed (nothing to clone).

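The partitioning rule can be sketched with sets. `ChangedFile` exists in `crawler/types.ts` per the API table, but its exact shape here is an assumption, as is the helper name:

```typescript
interface ChangedFile {
  path: string;
  status: 'added' | 'modified' | 'renamed' | 'deleted';
  previousPath?: string; // set for renames (assumed field)
}

// unchangedPaths = ancestorFilePaths − changedPaths − deletedPaths
function partitionPaths(ancestorFilePaths: string[], changes: ChangedFile[]) {
  // changedPaths: added + modified + renamed-destination
  const changedPaths = new Set(
    changes.filter((c) => c.status !== 'deleted').map((c) => c.path)
  );
  const deletedPaths = new Set(
    changes.filter((c) => c.status === 'deleted').map((c) => c.path)
  );
  const unchangedPaths = ancestorFilePaths.filter(
    (p) => !changedPaths.has(p) && !deletedPaths.has(p)
  );
  return { changedPaths, deletedPaths, unchangedPaths };
}
```
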
@@ -74,18 +74,18 @@ matching files are returned. This minimises GitHub API requests and local I/O.

## API Surface Changes

| Symbol                                 | Location                            | Change                                        |
| -------------------------------------- | ----------------------------------- | --------------------------------------------- |
| `buildDifferentialPlan`                | `pipeline/differential-strategy.ts` | **New** — async function                      |
| `DifferentialPlan`                     | `pipeline/differential-strategy.ts` | **New** — interface                           |
| `findBestAncestorVersion`              | `utils/tag-order.ts`                | **New** — pure function                       |
| `fetchGitHubChangedFiles`              | `crawler/github-compare.ts`         | **New** — async function                      |
| `getChangedFilesBetweenRefs`           | `utils/git.ts`                      | **New** — sync function (uses `execFileSync`) |
| `ChangedFile`                          | `crawler/types.ts`                  | **New** — interface                           |
| `CrawlOptions.allowedPaths`            | `crawler/types.ts`                  | **New** — optional field                      |
| `IndexingPipeline.crawl()`             | `pipeline/indexing.pipeline.ts`     | **Modified** — added `allowedPaths` param     |
| `IndexingPipeline.cloneFromAncestor()` | `pipeline/indexing.pipeline.ts`     | **New** — private method                      |
| `IndexingPipeline.run()`               | `pipeline/indexing.pipeline.ts`     | **Modified** — Stage 0 added                  |

---

@@ -88,6 +88,7 @@ The UI currently polls `GET /api/v1/jobs?repositoryId=...` every 2 seconds. This

#### Worker Thread lifecycle

Each worker is a long-lived `node:worker_threads` `Worker` instance that:

1. Opens its own `better-sqlite3` connection to the same database file.
2. Listens for `{ type: 'run', jobId }` messages from the main thread.
3. Runs `IndexingPipeline.run(job)`, emitting `postMessage` progress events at each stage boundary and every N files.

@@ -100,18 +101,18 @@ Manages a pool of `concurrency` workers.

```typescript
interface WorkerPoolOptions {
  concurrency: number; // default: Math.max(1, os.cpus().length - 1), capped at 4
  workerScript: string; // absolute path to the compiled worker entry
}

class WorkerPool {
  private workers: Worker[];
  private idle: Worker[];

  enqueue(jobId: string): void;
  private dispatch(worker: Worker, jobId: string): void;
  private onWorkerMessage(msg: WorkerMessage): void;
  private onWorkerExit(worker: Worker, code: number): void;
}
```

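The default-concurrency rule in the comment above can be written out directly; the function name is illustrative:

```typescript
import os from 'node:os';

// Default worker concurrency: leave one CPU for the main thread, cap at 4.
function defaultConcurrency(cpuCount: number = os.cpus().length): number {
  return Math.min(4, Math.max(1, cpuCount - 1));
}
```
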
@@ -120,12 +121,14 @@ Workers are kept alive across jobs. If a worker crashes (non-zero exit), the poo

#### Parallelism and write contention

With WAL mode enabled (already the case), SQLite supports:

- **One concurrent writer** (the transaction lock)
- **Many concurrent readers**

The `replaceSnippets` transactions for different repositories never contend — they write different rows. The `cloneFromAncestor` operation writes to the same tables but different `version_id` values, so WAL checkpoint logic keeps them non-overlapping at the page level.

Two jobs on the **same repository** (e.g. `/my-lib/v1.0.0` and `/my-lib/v2.0.0`) can run in parallel because:

- Differential indexing (TRUEREF-0021) ensures `v2.0.0` reads from `v1.0.0`'s already-committed rows.
- The write transactions for each version touch disjoint `version_id` partitions.

@@ -134,6 +137,7 @@ If write contention still occurs under parallel load, `busy_timeout = 5000` (alr

#### Concurrency limit per repository

To prevent a user from queuing 500 tags and overwhelming the worker pool, the pool enforces:

- **Max 1 running job per repository** for the default branch (re-index).
- **Max `concurrency` total running jobs** across all repositories.
- Version jobs for the same repository are serialised within the pool (the queue picks the oldest queued version job for a given repo only when no other version job for that repo is running).

@@ -148,15 +152,15 @@ Replace the opaque integer progress with a structured stage model:

```typescript
type IndexingStage =
  | 'queued'
  | 'differential' // computing ancestor diff
  | 'crawling' // fetching files from GitHub or local FS
  | 'cloning' // cloning unchanged files from ancestor (differential only)
  | 'parsing' // parsing files into snippets
  | 'storing' // writing documents + snippets to DB
  | 'embedding' // generating vector embeddings
  | 'done'
  | 'failed';
```

### Extended Job Schema
@@ -172,22 +176,24 @@ The `progress` column (0–100) is retained for backward compatibility and overa

```typescript
interface ProgressMessage {
  type: 'progress';
  jobId: string;
  stage: IndexingStage;
  stageDetail?: string; // human-readable detail for the current stage
  progress: number; // 0–100 overall
  processedFiles: number;
  totalFiles: number;
}
```

Workers emit this message:

- On every stage transition (crawl start, parse start, store start, embed start).
- Every `PROGRESS_EMIT_EVERY = 10` files during the parse loop.
- On job completion or failure.

The main thread receives these messages and does two things:

1. Writes the update to `indexing_jobs` in SQLite (batched — one write per message, not per file).
2. Pushes the payload to any open SSE channels for that jobId.

@@ -198,6 +204,7 @@ The main thread receives these messages and does two things:

### `GET /api/v1/jobs/:id/stream`

Opens an SSE connection for a specific job. The server:

1. Sends the current job state as the first event immediately (no initial lag).
2. Pushes `ProgressMessage` events as the worker emits them.
3. Sends a final `event: done` or `event: failed` event, then closes the connection.

@@ -216,7 +223,7 @@ id: 1
event: progress
data: {"stage":"crawling","progress":0,"processedFiles":0,"totalFiles":0}

id: 2
event: progress
data: {"stage":"parsing","progress":12,"processedFiles":240,"totalFiles":2000}

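The wire format above can be produced by a small serializer. The helper name and the event-id counter are illustrative, not the documented server code:

```typescript
// Serialize one SSE frame: "id: N", "event: <name>", JSON data, blank line.
function sseFrame(id: number, event: string, data: object): string {
  return `id: ${id}\nevent: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}
```

The trailing blank line is what terminates an SSE event, so every frame must end with `\n\n`.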
@@ -281,7 +288,7 @@ Expose via the settings table (key `indexing.concurrency`):

```typescript
interface IndexingSettings {
  concurrency: number; // 1–max(cpus-1, 1); default 2
}
```

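Because the value comes from a settings row, it should be clamped into the documented `1–max(cpus-1, 1)` range on read. A sketch, with an assumed helper name and the default of 2 from the interface comment:

```typescript
import os from 'node:os';

// Clamp a stored `indexing.concurrency` setting into the documented range,
// falling back to the default of 2 for missing or malformed values.
function resolveConcurrency(stored: unknown, cpuCount: number = os.cpus().length): number {
  const max = Math.max(1, cpuCount - 1);
  const n = typeof stored === 'number' && Number.isFinite(stored) ? Math.trunc(stored) : 2;
  return Math.min(max, Math.max(1, n));
}
```
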
@@ -362,13 +369,13 @@ The embedding stage must **not** run inside the same Worker Thread as the crawl/

### Why a dedicated embedding worker

| Concern            | Per-parse-worker model                                 | Dedicated embedding worker                                                       |
| ------------------ | ------------------------------------------------------ | -------------------------------------------------------------------------------- |
| Memory             | N × ~100 MB (model weights + WASM heap) per worker     | 1 × ~100 MB regardless of concurrency                                            |
| Model warm-up      | Paid once per worker spawn; cold starts slow           | Paid once at server startup                                                      |
| Batch size         | Each worker batches only its own job's snippets        | All in-flight jobs queue to one worker → larger batches → higher WASM throughput |
| Provider migration | Must update every worker                               | Update one file                                                                  |
| API rate limiting  | N parallel streams to the same API → N×rate-limit hits | One serial stream, naturally throttled                                           |

With `Xenova/all-MiniLM-L6-v2`, the WASM model and weight files occupy ~90–120 MB of heap. Running three parse workers with embedded model loading costs ~300–360 MB of resident memory that can never be freed while the server is alive. A dedicated worker keeps that cost fixed at one instance.

@@ -415,6 +422,7 @@ Instead, the existing `findSnippetIdsMissingEmbeddings` query is the handshake:

5. Main thread routes this to the SSE broadcaster → UI updates the embedding progress slice.

This means:

- The embedding worker reads snippet text from the DB itself (no IPC serialisation of content).
- The model is loaded once, stays warm, and processes batches from all repositories in FIFO order.
- Parse workers are never blocked waiting for embeddings — they complete their job stages and exit immediately.

@@ -424,15 +432,15 @@ This means:

```typescript
// Main → Embedding worker
type EmbedRequest =
  | { type: 'embed'; jobId: string; repositoryId: string; versionId: string | null }
  | { type: 'shutdown' };

// Embedding worker → Main
type EmbedResponse =
  | { type: 'embed-progress'; jobId: string; done: number; total: number }
  | { type: 'embed-done'; jobId: string }
  | { type: 'embed-failed'; jobId: string; error: string }
  | { type: 'ready' }; // emitted once after model warm-up completes
```

The `ready` message allows the server startup sequence to defer routing any embed requests until the model is loaded, preventing a race on first run.

955 docs/features/TRUEREF-0023.md Normal file
@@ -0,0 +1,955 @@

# TRUEREF-0023 — libSQL Migration, Native Vector Search, Parallel Tag Indexing, and Performance Hardening

**Priority:** P1
**Status:** Draft
**Depends On:** TRUEREF-0001, TRUEREF-0022
**Blocks:** —

---

## Overview

TrueRef currently uses `better-sqlite3` for all database access. This creates three compounding performance problems:

1. **Vector search does not scale.** `VectorSearch.vectorSearch()` loads the entire `snippet_embeddings` table for a repository into Node.js memory and computes cosine similarity in a JavaScript loop. A repository with 100k snippets at 1536 OpenAI dimensions allocates ~600 MB per query and ties up the worker thread for seconds before returning results.
2. **Missing composite indexes cause table scans on every query.** The schema defines FK columns used in every search and embedding filter, but declares zero composite or covering indexes on them. Every call to `searchSnippets`, `findSnippetIdsMissingEmbeddings`, and `cloneFromAncestor` performs full or near-full table scans.
3. **The SQLite connection is under-configured.** Critical pragmas (`synchronous`, `cache_size`, `mmap_size`, `temp_store`) are absent, leaving significant I/O throughput on the table.

The solution is to replace `better-sqlite3` with `@libsql/better-sqlite3` — an embeddable, drop-in synchronous replacement that is a superset of the better-sqlite3 API and exposes libSQL's native vector index (`libsql_vector_idx`). Because the API is identical, no service-layer or ORM code changes are needed beyond import statements and the vector search implementation.

Two additional structural improvements are delivered in the same feature:

4. **Per-repo job serialization is too coarse.** `WorkerPool` prevents any two jobs sharing the same `repositoryId` from running in parallel. This means indexing 200 tags of a single library is fully sequential — one tag at a time — even though different tags write to entirely disjoint row sets. The constraint should track `(repositoryId, versionId)` pairs instead.
5. **Write-lock contention under parallel indexing.** When multiple parse workers flush parsed snippets simultaneously, they all compete for the SQLite write lock, spending most of their time in `busy_timeout` back-off. A single dedicated write worker eliminates this: parse workers become pure CPU workers (crawl → parse → send batches over `postMessage`) and the write worker is the sole DB writer.
6. **The admin UI is unusable under load.** The job queue page has no status or repository filters, no worker status panel, no skeleton loading, uses blocking `alert()` / `confirm()` dialogs, and `IndexingProgress` still polls every 2 seconds instead of consuming the existing SSE stream.

---

## Goals

1. Replace `better-sqlite3` with `@libsql/better-sqlite3` with minimal code churn — import paths only.
2. Add a libSQL vector index on `snippet_embeddings` so that KNN queries execute inside SQLite instead of in a JavaScript loop.
3. Add the six composite and covering indexes required by the hot query paths.
4. Tune the SQLite pragma configuration for I/O performance.
5. Eliminate the leading cause of OOM risk during semantic search.
6. Keep a single embedded database file — no external server, no network.
7. Allow multiple tags of the same repository to index in parallel (unrelated version rows, no write conflict).
8. Eliminate write-lock contention between parallel parse workers by introducing a single dedicated write worker.
9. Rebuild the admin jobs page with full filtering (status, repository, free-text), a live worker status panel, skeleton loading on initial fetch, per-action inline spinners, non-blocking toast notifications, and SSE-driven real-time updates throughout.

---

## Non-Goals

- Migrating to the async `@libsql/client` package (HTTP/embedded-replica mode).
- Changing the Drizzle ORM adapter (`drizzle-orm/better-sqlite3` stays unchanged).
- Changing `drizzle.config.ts` dialect (`sqlite` is still correct for embedded libSQL).
- Adding hybrid/approximate indexing beyond the default HNSW strategy provided by `libsql_vector_idx`.
- Parallelizing embedding batches across providers (separate feature).
- Horizontally scaling across processes.
- Allowing more than one job for the exact same `(repositoryId, versionId)` pair to run concurrently (still serialized — duplicate detection in `JobQueue` is unchanged).
- A full admin authentication system (out of scope).
- Mobile-responsive redesign of the entire admin section (out of scope).

---

## Problem Detail

### 1. Vector Search — Full Table Scan in JavaScript

**File:** `src/lib/server/search/vector.search.ts`

```typescript
// Current: no LIMIT, loads ALL embeddings for repo into memory
const rows = this.db.prepare<unknown[], RawEmbeddingRow>(sql).all(...params);

const scored: VectorSearchResult[] = rows.map((row) => {
  const embedding = new Float32Array(
    row.embedding.buffer,
    row.embedding.byteOffset,
    row.embedding.byteLength / 4
  );
  return { snippetId: row.snippet_id, score: cosineSimilarity(queryEmbedding, embedding) };
});

return scored.sort((a, b) => b.score - a.score).slice(0, limit);
```

For a repo with N snippets and D dimensions, this allocates `N × D × 4` bytes per query. At N=100k and D=1536, that is ~600 MB allocated synchronously. The result is sorted entirely in JS before the top-k is returned. With a native vector index, SQLite returns only the top-k rows.
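The allocation figure can be checked directly; the function name is illustrative:

```typescript
// Per-query allocation of the current in-memory scan: N × D × 4 bytes
// (one 32-bit float per dimension).
function scanBytes(snippets: number, dims: number): number {
  return snippets * dims * 4;
}

const mb = scanBytes(100_000, 1536) / (1024 * 1024); // ≈ 586 MiB, the "~600 MB" above
```
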
### 2. Missing Composite Indexes
|
||||
|
||||
The `snippets`, `documents`, and `snippet_embeddings` tables are queried with multi-column WHERE predicates in every hot path, but no composite indexes exist:
|
||||
|
||||
| Table | Filter columns | Used in |
|
||||
| -------------------- | ----------------------------- | ---------------------------------------------- |
|
||||
| `snippets` | `(repository_id, version_id)` | All search, diff, clone |
|
||||
| `snippets` | `(repository_id, type)` | Type-filtered queries |
|
||||
| `documents` | `(repository_id, version_id)` | Diff strategy, clone |
|
||||
| `snippet_embeddings` | `(profile_id, snippet_id)` | `findSnippetIdsMissingEmbeddings` LEFT JOIN |
|
||||
| `repositories` | `(state)` | `searchRepositories` WHERE `state = 'indexed'` |
|
||||
| `indexing_jobs` | `(repository_id, status)` | Job status lookups |
|
||||
|
||||
Without these indexes, SQLite performs a B-tree scan of the primary key and filters rows in memory. On a 500k-row `snippets` table this is the dominant cost of every search.
|
||||
|
||||
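The table above maps directly to standard `CREATE INDEX` statements. A migration might look like the following; the index names are assumptions, not committed schema:

```sql
CREATE INDEX IF NOT EXISTS idx_snippets_repo_version ON snippets (repository_id, version_id);
CREATE INDEX IF NOT EXISTS idx_snippets_repo_type ON snippets (repository_id, type);
CREATE INDEX IF NOT EXISTS idx_documents_repo_version ON documents (repository_id, version_id);
CREATE INDEX IF NOT EXISTS idx_embeddings_profile_snippet ON snippet_embeddings (profile_id, snippet_id);
CREATE INDEX IF NOT EXISTS idx_repositories_state ON repositories (state);
CREATE INDEX IF NOT EXISTS idx_jobs_repo_status ON indexing_jobs (repository_id, status);
```
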

### 3. Under-configured SQLite Connection

**File:** `src/lib/server/db/client.ts` and `src/lib/server/db/index.ts`

Current pragmas:

```typescript
client.pragma('journal_mode = WAL');
client.pragma('foreign_keys = ON');
client.pragma('busy_timeout = 5000');
```

Missing:

- `synchronous = NORMAL` — halves fsync overhead vs the default FULL; safe with WAL
- `cache_size = -65536` — 64 MB page cache; the default is 2 MB
- `temp_store = MEMORY` — temp tables and sort spills stay in RAM
- `mmap_size = 268435456` — 256 MB memory-mapped read path; bypasses system-call overhead for reads
- `wal_autocheckpoint = 1000` — more frequent checkpoints prevent WAL growth

### 4. Admin UI — Current Problems

**File:** `src/routes/admin/jobs/+page.svelte`, `src/lib/components/IndexingProgress.svelte`

| Problem                                                        | Location                                  | Impact                                                       |
| -------------------------------------------------------------- | ----------------------------------------- | ------------------------------------------------------------ |
| `IndexingProgress` polls every 2 s via `setInterval` + `fetch` | `IndexingProgress.svelte`                 | Constant HTTP traffic; progress lags by up to 2 s            |
| No status or repository filter controls                        | `admin/jobs/+page.svelte`                 | With 200 tag jobs, finding a specific one requires scrolling |
| No worker status panel                                         | — (no endpoint exists)                    | Operator cannot see which workers are busy or idle           |
| `alert()` for errors, `confirm()` for cancel                   | `admin/jobs/+page.svelte` — `showToast()` | Blocks the entire browser tab; unusable under parallel jobs  |
| `actionInProgress` is a single string, not per-job             | `admin/jobs/+page.svelte`                 | Pausing job A disables buttons on all other jobs             |
| No skeleton loading — blank + spinner on first load            | `admin/jobs/+page.svelte`                 | Layout shift; no structural preview while data loads         |
| Hard-coded `limit=50` query, no pagination                     | `admin/jobs/+page.svelte:fetchJobs()`     | Page truncates silently for large queues                     |

---

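The missing pragmas listed in the SQLite connection section could be applied at connection setup alongside the existing three. A sketch, with an assumed helper name and a minimal structural type for the client:

```typescript
// Apply the missing performance pragmas from the list above.
function tunePragmas(client: { pragma(stmt: string): unknown }): void {
  client.pragma('synchronous = NORMAL'); // safe with WAL
  client.pragma('cache_size = -65536'); // 64 MB page cache
  client.pragma('temp_store = MEMORY');
  client.pragma('mmap_size = 268435456'); // 256 MB mmap read path
  client.pragma('wal_autocheckpoint = 1000');
}
```
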
## Architecture

### Drop-In Replacement: `@libsql/better-sqlite3`

`@libsql/better-sqlite3` is published by Turso and implemented as a Node.js native addon wrapping the libSQL embedded engine. The exported class is API-compatible with `better-sqlite3`:

```typescript
// before
import Database from 'better-sqlite3';
const db = new Database('/path/to/file.db');
db.pragma('journal_mode = WAL');
const rows = db.prepare('SELECT ...').all(...params);

// after — identical code
import Database from '@libsql/better-sqlite3';
const db = new Database('/path/to/file.db');
db.pragma('journal_mode = WAL');
const rows = db.prepare('SELECT ...').all(...params);
```

All of the following continue to work unchanged:

- The `drizzle-orm/better-sqlite3` adapter and `migrate` helper
- `drizzle-kit` with `dialect: 'sqlite'`
- Prepared statements, transactions, WAL pragmas, foreign keys
- Worker-thread per-thread connections (`worker-entry.ts`, `embed-worker-entry.ts`)
- All `import type Database from 'better-sqlite3'` type imports (replaced in lock-step)

### Vector Index Design

libSQL provides `libsql_vector_idx()` — a virtual index type stored in a shadow table alongside the main table. Once indexed, KNN queries use a SQL `vector_top_k()` function:

```sql
-- KNN: return top-k snippet IDs closest to the query vector
SELECT snippet_id
FROM vector_top_k('idx_snippet_embeddings_vec', vector_from_float32(?), ?)
```

`vector_from_float32(blob)` accepts the same raw little-endian Float32 bytes currently stored in the `embedding` blob column. **No data migration is needed** — the existing blob column can be re-indexed with `libsql_vector_idx` pointing at the bytes-stored column.

The index strategy:

1. Add a generated `vec_embedding` column of type `F32_BLOB(dimensions)` to `snippet_embeddings`, populated from the existing `embedding` blob via a migration trigger.
2. Create the vector index: `CREATE INDEX idx_snippet_embeddings_vec ON snippet_embeddings (libsql_vector_idx(vec_embedding))`.
3. Rewrite `VectorSearch.vectorSearch()` to use `vector_top_k()` with a two-step join instead of the in-memory loop.
4. Update `EmbeddingService.embedSnippets()` to write `vec_embedding` on insert.

Dimensions are profile-specific. Because the index is per-column, a separate index is needed per embedding dimensionality. For v1, a single index covering the default profile's dimensions is sufficient; multi-profile KNN can be handled with a `WHERE profile_id = ?` pre-filter on the `vector_top_k` results.

### Updated Vector Search Query
```typescript
vectorSearch(queryEmbedding: Float32Array, options: VectorSearchOptions): VectorSearchResult[] {
  const { repositoryId, versionId, profileId = 'local-default', limit = 50 } = options;

  // Encode query vector as raw bytes (same format as stored blobs)
  const queryBytes = Buffer.from(queryEmbedding.buffer);

  // Use libSQL vector_top_k for ANN — returns ordered (rowid, distance) pairs
  let sql = `
    SELECT se.snippet_id,
           vector_distance_cos(se.vec_embedding, vector_from_float32(?)) AS score
    FROM vector_top_k('idx_snippet_embeddings_vec', vector_from_float32(?), ?) AS knn
    JOIN snippet_embeddings se ON se.rowid = knn.id
    JOIN snippets s ON s.id = se.snippet_id
    WHERE s.repository_id = ?
      AND se.profile_id = ?
  `;
  const params: unknown[] = [queryBytes, queryBytes, limit * 4, repositoryId, profileId];

  if (versionId) {
    sql += ' AND s.version_id = ?';
    params.push(versionId);
  }

  sql += ' ORDER BY score ASC LIMIT ?';
  params.push(limit);

  return this.db
    .prepare<unknown[], { snippet_id: string; score: number }>(sql)
    .all(...params)
    .map((row) => ({ snippetId: row.snippet_id, score: 1 - row.score }));
}
```

`vector_distance_cos` returns distance (0 = identical), so `1 - distance` gives a similarity score in [0, 1] matching the existing `VectorSearchResult.score` contract.
|
||||

---

## Implementation Plan

### Phase 1 — Package Swap (no logic changes)

**Files touched:** `package.json`, all `.ts` files that import `better-sqlite3`

1. In `package.json`:
   - Remove `"better-sqlite3": "^12.6.2"` from `dependencies`
   - Add `"@libsql/better-sqlite3": "^0.4.0"` to `dependencies`
   - Remove `"@types/better-sqlite3": "^7.6.13"` from `devDependencies` (`@libsql/better-sqlite3` ships its own TypeScript declarations)

2. Replace all import statements (35 occurrences across the 29 files listed below):

| Old import                                                      | New import                                           |
| --------------------------------------------------------------- | ---------------------------------------------------- |
| `import Database from 'better-sqlite3'`                         | `import Database from '@libsql/better-sqlite3'`      |
| `import type Database from 'better-sqlite3'`                    | `import type Database from '@libsql/better-sqlite3'` |
| `import { drizzle } from 'drizzle-orm/better-sqlite3'`          | unchanged                                            |
| `import { migrate } from 'drizzle-orm/better-sqlite3/migrator'` | unchanged                                            |

Affected production files:

- `src/lib/server/db/index.ts`
- `src/lib/server/db/client.ts`
- `src/lib/server/embeddings/embedding.service.ts`
- `src/lib/server/pipeline/indexing.pipeline.ts`
- `src/lib/server/pipeline/job-queue.ts`
- `src/lib/server/pipeline/startup.ts`
- `src/lib/server/pipeline/worker-entry.ts`
- `src/lib/server/pipeline/embed-worker-entry.ts`
- `src/lib/server/pipeline/differential-strategy.ts`
- `src/lib/server/search/vector.search.ts`
- `src/lib/server/search/hybrid.search.service.ts`
- `src/lib/server/search/search.service.ts`
- `src/lib/server/services/repository.service.ts`
- `src/lib/server/services/version.service.ts`
- `src/lib/server/services/embedding-settings.service.ts`

Affected test files (same mechanical replacement):

- `src/routes/api/v1/api-contract.integration.test.ts`
- `src/routes/api/v1/sse-and-settings.integration.test.ts`
- `src/routes/settings/page.server.test.ts`
- `src/lib/server/db/schema.test.ts`
- `src/lib/server/embeddings/embedding.service.test.ts`
- `src/lib/server/pipeline/indexing.pipeline.test.ts`
- `src/lib/server/pipeline/differential-strategy.test.ts`
- `src/lib/server/search/search.service.test.ts`
- `src/lib/server/search/hybrid.search.service.test.ts`
- `src/lib/server/services/repository.service.test.ts`
- `src/lib/server/services/version.service.test.ts`
- `src/routes/api/v1/settings/embedding/server.test.ts`
- `src/routes/api/v1/libs/[id]/index/server.test.ts`
- `src/routes/api/v1/libs/[id]/versions/discover/server.test.ts`

3. Run all tests — they should pass with zero logic changes: `npm test`

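Since the swap is purely mechanical, it can be scripted; a sketch using GNU grep/sed (the `-i.bak` backups exist only to review the diff before committing). The pattern is anchored on `from '`, so the `drizzle-orm/better-sqlite3` imports are left untouched:

```shell
# Sketch: mechanical import swap across src/ (GNU sed; review, then delete .bak files)
grep -rl "from 'better-sqlite3'" src --include='*.ts' \
  | xargs -r sed -i.bak "s|from 'better-sqlite3'|from '@libsql/better-sqlite3'|g"
```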
### Phase 2 — Pragma Hardening

**Files touched:** `src/lib/server/db/client.ts`, `src/lib/server/db/index.ts`

Add the following pragmas to both connection factories (raw client and `initializeDatabase()`):

```typescript
client.pragma('synchronous = NORMAL');
client.pragma('cache_size = -65536'); // 64 MB (negative values are KiB)
client.pragma('temp_store = MEMORY');
client.pragma('mmap_size = 268435456'); // 256 MB
client.pragma('wal_autocheckpoint = 1000');
```

Worker threads (`worker-entry.ts`, `embed-worker-entry.ts`) open their own connections — apply the same pragmas there.

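These pragmas end up repeated in four places (two factories plus both worker entries), so a small shared helper keeps them from drifting. A sketch; `applyTuningPragmas` and `PragmaClient` are names invented here, modeled on better-sqlite3's `pragma()` method:

```typescript
// Sketch (assumed names): one list of tuning pragmas shared by every
// connection factory instead of copy-pasted pragma() calls.
interface PragmaClient {
  pragma(directive: string): unknown; // shape of better-sqlite3's pragma()
}

const TUNING_PRAGMAS = [
  'synchronous = NORMAL',
  'cache_size = -65536', // 64 MB page cache (negative values are KiB)
  'temp_store = MEMORY',
  'mmap_size = 268435456', // 256 MB
  'wal_autocheckpoint = 1000'
] as const;

function applyTuningPragmas(client: PragmaClient): void {
  for (const directive of TUNING_PRAGMAS) client.pragma(directive);
}
```

Each factory would call `applyTuningPragmas(client)` right after setting `journal_mode = WAL`.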
### Phase 3 — Composite Indexes (Drizzle migration)

**Files touched:** `src/lib/server/db/schema.ts`, new migration SQL file

Add indexes in `schema.ts` using Drizzle's `index()` helper:

```typescript
// snippets table
export const snippets = sqliteTable(
  'snippets',
  {
    /* unchanged */
  },
  (t) => [
    index('idx_snippets_repo_version').on(t.repositoryId, t.versionId),
    index('idx_snippets_repo_type').on(t.repositoryId, t.type)
  ]
);

// documents table
export const documents = sqliteTable(
  'documents',
  {
    /* unchanged */
  },
  (t) => [index('idx_documents_repo_version').on(t.repositoryId, t.versionId)]
);

// snippet_embeddings table
export const snippetEmbeddings = sqliteTable(
  'snippet_embeddings',
  {
    /* unchanged */
  },
  (table) => [
    primaryKey({ columns: [table.snippetId, table.profileId] }), // unchanged
    index('idx_embeddings_profile').on(table.profileId, table.snippetId)
  ]
);

// repositories table
export const repositories = sqliteTable(
  'repositories',
  {
    /* unchanged */
  },
  (t) => [index('idx_repositories_state').on(t.state)]
);

// indexing_jobs table
export const indexingJobs = sqliteTable(
  'indexing_jobs',
  {
    /* unchanged */
  },
  (t) => [index('idx_jobs_repo_status').on(t.repositoryId, t.status)]
);
```

Generate and apply migration: `npm run db:generate && npm run db:migrate`

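The acceptance criteria verify index usage with `EXPLAIN QUERY PLAN`; a tiny assertion helper makes that check mechanical in tests. A sketch; `planUsesIndex` and the `PlanRow` shape are invented here, mirroring SQLite's `detail` output column:

```typescript
// Sketch (hypothetical helper): check EXPLAIN QUERY PLAN rows for a named
// index. SQLite reports index usage in the detail column as
// "SEARCH <table> USING INDEX <name> (...)", while a full scan is "SCAN <table>".
interface PlanRow {
  detail: string;
}

function planUsesIndex(rows: PlanRow[], table: string, indexName: string): boolean {
  return rows.some(
    (r) => r.detail.startsWith(`SEARCH ${table}`) && r.detail.includes(`USING INDEX ${indexName}`)
  );
}
```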
### Phase 4 — Vector Column and Index (Drizzle migration)

**Files touched:** `src/lib/server/db/schema.ts`, new migration SQL, `src/lib/server/search/vector.search.ts`, `src/lib/server/embeddings/embedding.service.ts`

#### 4a. Schema: add `vec_embedding` column

Add `vec_embedding` to `snippet_embeddings`. Drizzle has no `F32_BLOB` column type helper; define one with `customType`:

```typescript
import { customType } from 'drizzle-orm/sqlite-core';

const f32Blob = (name: string, dimensions: number) =>
  customType<{ data: Buffer }>({
    dataType() {
      return `F32_BLOB(${dimensions})`;
    }
  })(name);

export const snippetEmbeddings = sqliteTable(
  'snippet_embeddings',
  {
    snippetId: text('snippet_id')
      .notNull()
      .references(() => snippets.id, { onDelete: 'cascade' }),
    profileId: text('profile_id')
      .notNull()
      .references(() => embeddingProfiles.id, { onDelete: 'cascade' }),
    model: text('model').notNull(),
    dimensions: integer('dimensions').notNull(),
    embedding: blob('embedding').notNull(), // existing blob — kept for backward compat
    vecEmbedding: f32Blob('vec_embedding', 1536), // libSQL vector column (nullable during migration fill)
    createdAt: integer('created_at').notNull()
  },
  (table) => [
    primaryKey({ columns: [table.snippetId, table.profileId] }),
    index('idx_embeddings_profile').on(table.profileId, table.snippetId)
  ]
);
```

Because dimensionality is fixed per model, `F32_BLOB(1536)` covers OpenAI `text-embedding-3-small`; `text-embedding-3-large` emits 3072-dimensional vectors by default and would need its own column and index. A follow-up can parameterize the dimensionality per profile.

#### 4b. Migration SQL: populate `vec_embedding` from the existing `embedding` blob and create the vector index

The vector index cannot be expressed in Drizzle-generated DDL — it must be applied in the FTS-style custom SQL file (`src/lib/server/db/fts.sql` or an equivalent `vectors.sql`):

```sql
-- Backfill vec_embedding from the existing raw blob data.
-- The stored blobs are already raw little-endian float32 arrays, which is
-- exactly the F32_BLOB representation, so a direct copy suffices.
UPDATE snippet_embeddings
SET vec_embedding = embedding
WHERE vec_embedding IS NULL AND embedding IS NOT NULL;

-- Create the ANN vector index. libSQL declares it over the
-- libsql_vector_idx() expression; settings are extra string arguments.
CREATE INDEX IF NOT EXISTS idx_snippet_embeddings_vec
ON snippet_embeddings (
  libsql_vector_idx(vec_embedding, 'metric=cosine', 'compress_neighbors=float8', 'max_neighbors=20')
);
```

Add a call to this SQL in `initializeDatabase()` alongside the existing `fts.sql` execution:

```typescript
const vectorSql = readFileSync(join(__dirname, 'vectors.sql'), 'utf-8');
client.exec(vectorSql);
```

#### 4c. Update `EmbeddingService.embedSnippets()`

When inserting a new embedding, write both the blob and the vec column:

```typescript
const insert = this.db.prepare<[string, string, string, number, Buffer, Buffer]>(`
  INSERT OR REPLACE INTO snippet_embeddings
    (snippet_id, profile_id, model, dimensions, embedding, vec_embedding, created_at)
  VALUES (?, ?, ?, ?, ?, ?, unixepoch())
`);

// inside the transaction:
insert.run(
  snippet.id,
  this.profileId,
  embedding.model,
  embedding.dimensions,
  embeddingBuffer,
  embeddingBuffer // same bytes — a raw float32 blob is a valid F32_BLOB value
);
```

#### 4d. Rewrite `VectorSearch.vectorSearch()`

Replace the full-scan JS loop with `vector_top_k()`:

```typescript
vectorSearch(queryEmbedding: Float32Array, options: VectorSearchOptions): VectorSearchResult[] {
  const { repositoryId, versionId, profileId = 'local-default', limit = 50 } = options;

  // Raw float32 bytes bind directly as an F32_BLOB vector
  const queryBytes = Buffer.from(queryEmbedding.buffer);
  const candidatePool = limit * 4; // over-fetch for post-filter

  let sql = `
    SELECT se.snippet_id,
           vector_distance_cos(se.vec_embedding, ?) AS distance
    FROM vector_top_k('idx_snippet_embeddings_vec', ?, ?) AS knn
    JOIN snippet_embeddings se ON se.rowid = knn.id
    JOIN snippets s ON s.id = se.snippet_id
    WHERE s.repository_id = ?
      AND se.profile_id = ?
  `;
  const params: unknown[] = [queryBytes, queryBytes, candidatePool, repositoryId, profileId];

  if (versionId) {
    sql += ' AND s.version_id = ?';
    params.push(versionId);
  }

  sql += ' ORDER BY distance ASC LIMIT ?';
  params.push(limit);

  return this.db
    .prepare<unknown[], { snippet_id: string; distance: number }>(sql)
    .all(...params)
    .map((row) => ({ snippetId: row.snippet_id, score: 1 - row.distance }));
}
```

The `score` contract is preserved (1 = identical, 0 = orthogonal). The `cosineSimilarity` helper function is no longer called at runtime but can be kept for unit tests.

### Phase 5 — Per-Job Serialization Key Fix

**Files touched:** `src/lib/server/pipeline/worker-pool.ts`

The current serialization guard uses a bare `repositoryId`:

```typescript
// current
private runningRepoIds = new Set<string>();
// blocks any job whose repositoryId is already in the set
const jobIdx = this.jobQueue.findIndex((j) => !this.runningRepoIds.has(j.repositoryId));
```

Different tags of the same repository write to completely disjoint rows (`version_id`-partitioned documents, snippets, and embeddings). The only genuine conflict is two jobs for the same `(repositoryId, versionId)` pair, which `JobQueue.enqueue()` already prevents via the `status IN ('queued', 'running')` deduplication check.

Change the guard to key on the compound pair:

```typescript
// still a Set<string>, but keyed on the compound (repositoryId, versionId) pair
private runningJobKeys = new Set<string>();

private jobKey(repositoryId: string, versionId?: string | null): string {
  return `${repositoryId}|${versionId ?? ''}`;
}
```

Update all four sites that read/write `runningRepoIds`:

| Location                             | Old                                                   | New                                                                                      |
| ------------------------------------ | ----------------------------------------------------- | ---------------------------------------------------------------------------------------- |
| `dispatch()` find                    | `!this.runningRepoIds.has(j.repositoryId)`            | `!this.runningJobKeys.has(this.jobKey(j.repositoryId, j.versionId))`                     |
| `dispatch()` add                     | `this.runningRepoIds.add(job.repositoryId)`           | `this.runningJobKeys.add(this.jobKey(job.repositoryId, job.versionId))`                  |
| `onWorkerMessage` done/failed delete | `this.runningRepoIds.delete(runningJob.repositoryId)` | `this.runningJobKeys.delete(this.jobKey(runningJob.repositoryId, runningJob.versionId))` |
| `onWorkerExit` delete                | same                                                  | same                                                                                     |

The `QueuedJob` and `RunningJob` interfaces already carry `versionId` — no type changes needed.

The only case that remains serialized is two `versionId = null` jobs (default-branch re-index) for the same repository; both map to the stable key `"repositoryId|"` and are correctly deduplicated.

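A quick sanity check of the compound-key behavior (the same `jobKey` logic as above, reproduced standalone for illustration):

```typescript
// Standalone copy of the compound key for illustration
function jobKey(repositoryId: string, versionId?: string | null): string {
  return `${repositoryId}|${versionId ?? ''}`;
}

// Two tags of one repository produce distinct keys (may run concurrently);
// two default-branch jobs (versionId null/undefined) collide (stay serialized).
```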
---

### Phase 6 — Dedicated Write Worker (Single-Writer Pattern)

**Files touched:** `src/lib/server/pipeline/worker-types.ts`, `src/lib/server/pipeline/write-worker-entry.ts` (new), `src/lib/server/pipeline/worker-entry.ts`, `src/lib/server/pipeline/worker-pool.ts`

#### Motivation

With Phase 5 in place, N tags of the same library can index in parallel. Each parse worker currently opens its own DB connection and holds the write lock while storing parsed snippets. Under N concurrent writers, each worker spends the majority of its wall-clock time waiting in `busy_timeout` back-off. The fix is the single-writer pattern: one dedicated write worker owns the only writable DB connection; parse workers become stateless CPU workers that send write batches over `postMessage`.

```
Parse Worker 1 ──┐  WriteRequest (docs[], snippets[])    ┌── WriteAck
Parse Worker 2 ──┼──────────────────────────────────────► Write Worker (sole DB writer)
Parse Worker N ──┘                                       └── single better-sqlite3 connection
```

#### New message types (`worker-types.ts`)

```typescript
export interface WriteRequest {
  type: 'write';
  jobId: string;
  documents: SerializedDocument[];
  snippets: SerializedSnippet[];
}

export interface WriteAck {
  type: 'write_ack';
  jobId: string;
  documentCount: number;
  snippetCount: number;
}

export interface WriteError {
  type: 'write_error';
  jobId: string;
  error: string;
}

// SerializedDocument / SerializedSnippet mirror the DB column shapes
// (plain objects, safe to transfer via structured clone)
```

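On the parent side the pool will need to narrow these replies; a sketch of a discriminated-union guard (the inline types mirror `WriteAck` / `WriteError` above so the snippet is self-contained):

```typescript
// Self-contained mirrors of the reply types above
type WriterReply =
  | { type: 'write_ack'; jobId: string; documentCount: number; snippetCount: number }
  | { type: 'write_error'; jobId: string; error: string };

// Narrow a reply from the write worker before routing it
function isWriteError(msg: WriterReply): msg is Extract<WriterReply, { type: 'write_error' }> {
  return msg.type === 'write_error';
}
```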
#### Write worker (`write-worker-entry.ts`)

The write worker:

- Opens its own `Database` connection (WAL mode, all pragmas from Phase 2)
- Listens for `WriteRequest` messages
- Wraps each batch in a single transaction
- Posts `WriteAck` or `WriteError` back to the parent, which forwards the ack to the originating parse worker by `jobId`

```typescript
import Database from '@libsql/better-sqlite3';
import { workerData, parentPort } from 'node:worker_threads';
// WorkerInitData (carrying dbPath) lives in worker-types alongside the new message types
import type { WriteRequest, WriteAck, WriteError, WorkerInitData } from './worker-types.js';

const db = new Database((workerData as WorkerInitData).dbPath);
db.pragma('journal_mode = WAL');
db.pragma('synchronous = NORMAL');
db.pragma('cache_size = -65536');
db.pragma('foreign_keys = ON');

const insertDoc = db.prepare(`INSERT OR REPLACE INTO documents (...) VALUES (...)`);
const insertSnippet = db.prepare(`INSERT OR REPLACE INTO snippets (...) VALUES (...)`);

const writeBatch = db.transaction((req: WriteRequest) => {
  for (const doc of req.documents) insertDoc.run(doc);
  for (const snip of req.snippets) insertSnippet.run(snip);
});

parentPort!.on('message', (req: WriteRequest) => {
  try {
    writeBatch(req);
    const ack: WriteAck = {
      type: 'write_ack',
      jobId: req.jobId,
      documentCount: req.documents.length,
      snippetCount: req.snippets.length
    };
    parentPort!.postMessage(ack);
  } catch (err) {
    const fail: WriteError = { type: 'write_error', jobId: req.jobId, error: String(err) };
    parentPort!.postMessage(fail);
  }
});
```

#### Parse worker changes (`worker-entry.ts`)

Parse workers lose their DB connection. `IndexingPipeline` receives a `sendWrite` callback instead of a `db` instance. After parsing each file batch, the worker calls `sendWrite({ type: 'write', jobId, documents, snippets })` and awaits the `WriteAck` before continuing. This keeps back-pressure: a slow write worker naturally throttles the parse workers without additional semaphores.

#### WorkerPool changes

- Spawn one write worker at startup (always, regardless of embedding config)
- Route incoming `write_ack` / `write_error` messages to the correct waiting parse worker via a `Map<jobId, resolve>` promise registry
- The write worker is separate from the embed worker — embed writes (`snippet_embeddings`) can still go through the write worker by adding an `EmbedWriteRequest` message type, or remain in the embed worker, since embedding runs after parsing completes (no lock contention with active parse jobs)

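The `Map<jobId, resolve>` registry could be sketched like this (assumed names; the reply shapes mirror `WriteAck` / `WriteError` from Phase 6):

```typescript
// Sketch: pending-write registry that turns write_ack / write_error replies
// into promise settlement for the awaiting parse job.
type WriterReply =
  | { type: 'write_ack'; jobId: string }
  | { type: 'write_error'; jobId: string; error: string };

class PendingWrites {
  private pending = new Map<string, { resolve: () => void; reject: (e: Error) => void }>();

  // Called by the parse-job side right after posting a WriteRequest
  wait(jobId: string): Promise<void> {
    return new Promise((resolve, reject) => this.pending.set(jobId, { resolve, reject }));
  }

  // Called from the write worker's 'message' handler
  settle(msg: WriterReply): void {
    const entry = this.pending.get(msg.jobId);
    if (!entry) return; // e.g. the job was already cancelled
    this.pending.delete(msg.jobId);
    if (msg.type === 'write_ack') entry.resolve();
    else entry.reject(new Error(msg.error));
  }
}
```

One limitation worth noting: the registry assumes at most one outstanding write per `jobId`, which holds because each parse worker awaits its ack before sending the next batch.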
#### Conflict analysis with Phase 5

Phases 5 and 6 compose cleanly:

- Phase 5 allows multiple `(repo, versionId)` jobs to run concurrently
- Phase 6 ensures all those concurrent jobs share a single write path — contention is eliminated by design
- The write worker is stateless with respect to job identity; it simply executes batches in arrival order (Node.js `postMessage` delivery is FIFO-ordered)
- The embed worker remains a separate worker thread (it runs after parse completes, so it never overlaps with active parse writes for the same job)

---

### Phase 7 — Admin UI Overhaul

**Files touched:**

- `src/routes/admin/jobs/+page.svelte` — rebuilt
- `src/routes/api/v1/workers/+server.ts` — new endpoint
- `src/lib/components/admin/JobStatusBadge.svelte` — extend with spinner variant
- `src/lib/components/admin/JobSkeleton.svelte` — new
- `src/lib/components/admin/WorkerStatusPanel.svelte` — new
- `src/lib/components/admin/Toast.svelte` — new
- `src/lib/components/IndexingProgress.svelte` — switch to SSE

#### 7a. New API endpoint: `GET /api/v1/workers`

The `WorkerPool` singleton tracks running jobs in `runningJobs: Map<Worker, RunningJob>` and idle workers in `idleWorkers: Worker[]`. Expose this state as a lightweight REST snapshot:

```typescript
// GET /api/v1/workers
// Response shape:
interface WorkersResponse {
  concurrency: number; // configured max workers
  active: number; // workers with a running job
  idle: number; // workers waiting for work
  workers: WorkerStatus[]; // one entry per spawned parse worker
}

interface WorkerStatus {
  index: number; // worker slot (0-based)
  state: 'idle' | 'running'; // current state
  jobId: string | null; // null when idle
  repositoryId: string | null;
  versionId: string | null;
}
```

The route handler calls `getPool().getStatus()` — add a `getStatus(): WorkersResponse` method to `WorkerPool` that reads `runningJobs` and `idleWorkers` without any DB call. This is read-only and runs on the main thread.

The SSE stream at `/api/v1/jobs/stream` should emit a new `worker-status` event type whenever a worker transitions idle ↔ running (on `dispatch()` and job completion). This lets the worker panel update in real time without polling the REST endpoint.

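`getStatus()` is essentially a pure projection of the pool's internals; a sketch with the pool state passed in explicitly (names and shapes follow this plan, not existing code):

```typescript
interface RunningJob {
  jobId: string;
  repositoryId: string;
  versionId: string | null;
}

// Sketch: derive a WorkersResponse-style snapshot from running jobs + idle count.
// (Real slot indexes would come from stable worker identity; synthesized here.)
function snapshotWorkers(concurrency: number, running: RunningJob[], idleCount: number) {
  const workers = [
    ...running.map((j, index) => ({ index, state: 'running' as const, ...j })),
    ...Array.from({ length: idleCount }, (_, k) => ({
      index: running.length + k,
      state: 'idle' as const,
      jobId: null,
      repositoryId: null,
      versionId: null
    }))
  ];
  return { concurrency, active: running.length, idle: idleCount, workers };
}
```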
#### 7b. `GET /api/v1/jobs` — add `repositoryId` free-text and multi-status filter

The existing endpoint already accepts `repositoryId` (exact match) and `status` (single value). Extend:

- `repositoryId` to also support prefix match (e.g. `?repositoryId=/facebook` returns all `/facebook/*` repos)
- `status` to accept comma-separated values: `?status=queued,running`
- `page` and `pageSize` query params (default `pageSize=50`, max 200) in addition to `limit` for backwards compatibility

Return `{ jobs, total, page, pageSize }`, with `total` always reflecting the filtered-but-unpaginated count.

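Parsing those extended params could look like this (a sketch; `parseJobsQuery` is a name invented here, and the clamping mirrors the defaults stated above):

```typescript
const VALID_STATUSES = new Set(['queued', 'running', 'paused', 'cancelled', 'done', 'failed']);

// Sketch: normalize the extended query params with the documented defaults,
// silently dropping unknown status values
function parseJobsQuery(params: URLSearchParams) {
  const statuses = (params.get('status') ?? '')
    .split(',')
    .map((s) => s.trim())
    .filter((s) => VALID_STATUSES.has(s));
  const page = Math.max(1, Number(params.get('page')) || 1);
  const pageSize = Math.min(200, Math.max(1, Number(params.get('pageSize')) || 50));
  return { repositoryPrefix: params.get('repositoryId') ?? '', statuses, page, pageSize };
}
```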
#### 7c. New component: `JobSkeleton.svelte`

A set of skeleton rows matching the job table structure. Shown during the initial fetch before any data arrives. Uses Tailwind `animate-pulse`:

```svelte
<!-- renders N skeleton rows -->
<script lang="ts">
  let { rows = 5 }: { rows?: number } = $props();
</script>

{#each Array(rows) as _, i (i)}
  <tr>
    <td class="px-6 py-4">
      <div class="h-4 w-48 animate-pulse rounded bg-gray-200"></div>
      <div class="mt-1 h-3 w-24 animate-pulse rounded bg-gray-100"></div>
    </td>
    <td class="px-6 py-4">
      <div class="h-5 w-16 animate-pulse rounded-full bg-gray-200"></div>
    </td>
    <td class="px-6 py-4">
      <div class="h-4 w-20 animate-pulse rounded bg-gray-200"></div>
    </td>
    <td class="px-6 py-4">
      <div class="h-2 w-32 animate-pulse rounded-full bg-gray-200"></div>
    </td>
    <td class="px-6 py-4">
      <div class="h-4 w-28 animate-pulse rounded bg-gray-200"></div>
    </td>
    <td class="px-6 py-4 text-right">
      <div class="ml-auto h-7 w-20 animate-pulse rounded bg-gray-200"></div>
    </td>
  </tr>
{/each}
```

#### 7d. New component: `Toast.svelte`

Replaces all `alert()` / `console.log()` calls in the jobs page. Renders a fixed-position stack in the bottom-right corner. Each toast auto-dismisses after 4 seconds and can be manually closed:

```svelte
<!-- Usage: bind a toasts array and call push({ message, type }) -->
<script lang="ts">
  export interface ToastItem {
    id: string;
    message: string;
    type: 'success' | 'error' | 'info';
  }

  let { toasts = $bindable([]) }: { toasts: ToastItem[] } = $props();

  function dismiss(id: string) {
    toasts = toasts.filter((t) => t.id !== id);
  }
</script>

<div class="fixed right-4 bottom-4 z-50 flex flex-col gap-2">
  {#each toasts as toast (toast.id)}
    <!-- color by type, close button, auto-dismiss via onmount timer -->
  {/each}
</div>
```

The jobs page replaces `showToast()` with pushing onto the bound `toasts` array. The `confirm()` for cancel is replaced with an inline confirmation state per job (`pendingCancelId`) that shows "Confirm cancel?" / "Yes" / "No" buttons inside the row.

#### 7e. New component: `WorkerStatusPanel.svelte`

A compact panel displayed above the job table showing the worker pool health. Subscribes to the `worker-status` SSE events and falls back to polling `GET /api/v1/workers` every 5 s on SSE error:

```
┌─────────────────────────────────────────────────────────┐
│ Workers  [2 / 4 active]             ████░░░░  50%       │
│ Worker 0  ● running   /facebook/react / v18.3.0         │
│ Worker 1  ● running   /facebook/react / v17.0.2         │
│ Worker 2  ○ idle                                        │
│ Worker 3  ○ idle                                        │
└─────────────────────────────────────────────────────────┘
```

Each worker row shows: slot index, status dot (animated green pulse for running), repository ID, version tag, and a link to the job row in the table below.

#### 7f. Filter bar on the jobs page

Add a filter strip between the page header and the table:

```
[ Repository: _______________ ]  [ Status: ▾ all ]  [ 🔍 Apply ]  [ ↺ Reset ]
```

- **Repository field**: free-text input, matches `repositoryId` prefix (e.g. `/facebook` shows all `/facebook/*`)
- **Status dropdown**: multi-select checkboxes for `queued`, `running`, `paused`, `cancelled`, `done`, `failed`; default = all
- Filters are applied client-side against the loaded `jobs` array for instant feedback, and also re-fetched from the API on Apply to get the correct total count
- Filter state is mirrored to URL search params (`?repo=...&status=...`) so the view is bookmarkable and survives refresh

#### 7g. Per-job action spinner and disabled state

Replace the single `actionInProgress: string | null` with a per-job map. Note that a plain `Map` inside `$state` is not deeply reactive to `set`/`delete` in Svelte 5, so use `SvelteMap` from `svelte/reactivity`:

```typescript
import { SvelteMap } from 'svelte/reactivity';

let actionInProgress = new SvelteMap<string, 'pausing' | 'resuming' | 'cancelling'>();
```

Each action button shows an inline spinner (small `animate-spin` circle) and is disabled only for that row. Other rows remain fully interactive during the action. On completion the entry is deleted from the map.

#### 7h. `IndexingProgress.svelte` — switch from polling to SSE

The component currently uses `setInterval + fetch` at 2 s. Replace with the per-job SSE stream already available at `/api/v1/jobs/{id}/stream`:

```typescript
// replace the $effect body
$effect(() => {
  job = null;
  const es = new EventSource(`/api/v1/jobs/${jobId}/stream`);

  es.addEventListener('job-progress', (event) => {
    const data = JSON.parse((event as MessageEvent).data);
    job = { ...job, ...data };
  });

  es.addEventListener('job-done', () => {
    void fetch(`/api/v1/jobs/${jobId}`)
      .then((r) => r.json())
      .then((d) => {
        job = d.job;
        oncomplete?.();
      });
    es.close();
  });

  es.addEventListener('job-failed', (event) => {
    const data = JSON.parse((event as MessageEvent).data);
    job = { ...job, status: 'failed', error: data.error };
    oncomplete?.();
    es.close();
  });

  es.onerror = () => {
    // on SSE failure fall back to a single fetch to get current state
    es.close();
    void fetch(`/api/v1/jobs/${jobId}`)
      .then((r) => r.json())
      .then((d) => {
        job = d.job;
      });
  };

  return () => es.close();
});
```

This reduces network traffic from one request every 2 s to zero polling requests during active indexing — updates arrive as server-push events.

#### 7i. Pagination on the jobs page

Replace the hard-coded `?limit=50` fetch with paginated requests:

```typescript
let currentPage = $state(1);
const PAGE_SIZE = 50;

async function fetchJobs() {
  const params = new URLSearchParams({
    page: String(currentPage),
    pageSize: String(PAGE_SIZE),
    ...(filterRepo ? { repositoryId: filterRepo } : {}),
    ...(filterStatuses.length ? { status: filterStatuses.join(',') } : {})
  });
  const data = await fetch(`/api/v1/jobs?${params}`).then((r) => r.json());
  jobs = data.jobs;
  total = data.total;
}
```

Render a simple `« Prev  Page N of M  Next »` control below the table, hidden when `total <= PAGE_SIZE`.

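One edge case worth handling: when a filter change shrinks `total`, `currentPage` can point past the last page. A small clamp helper (invented name, same `PAGE_SIZE` semantics as above) keeps the pager valid:

```typescript
// Sketch: clamp the current page into [1, totalPages] after `total` changes
function clampPage(page: number, total: number, pageSize: number): number {
  const totalPages = Math.max(1, Math.ceil(total / pageSize));
  return Math.min(Math.max(1, page), totalPages);
}
```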
---

## Acceptance Criteria

- [ ] `npm install` with `@libsql/better-sqlite3` succeeds; `better-sqlite3` is absent from `node_modules`
- [ ] All existing unit and integration tests pass after the Phase 1 import swap
- [ ] `npm run db:migrate` applies the composite index migration cleanly against an existing database
- [ ] `npm run db:migrate` applies the vector column migration cleanly; `SELECT vec_embedding FROM snippet_embeddings LIMIT 1` returns a non-NULL value for any previously-embedded snippet
- [ ] `GET /api/v1/context?libraryId=...&query=...` with a semantic-mode or hybrid-mode request returns results in ≤ 200 ms on a repository with 50k+ snippets (vs the previous multi-second response)
- [ ] Memory profiled during a /context request shows no allocation spike proportional to repository size
- [ ] `EXPLAIN QUERY PLAN` on the `snippets` search query shows `SEARCH snippets USING INDEX idx_snippets_repo_version` instead of `SCAN snippets`
- [ ] Worker threads (`worker-entry.ts`, `embed-worker-entry.ts`) start and complete an indexing job successfully after the package swap
- [ ] `drizzle-kit studio` connects and browses the migrated database
- [ ] Re-indexing a repository after the migration correctly populates `vec_embedding` on all new snippets
- [ ] `cosineSimilarity` unit tests still pass (the function is kept)
- [ ] Starting two indexing jobs for different tags of the same repository simultaneously results in both jobs reaching `running` state concurrently (not one waiting for the other)
- [ ] Starting two indexing jobs for the **same** `(repositoryId, versionId)` pair returns the existing job (deduplication unchanged)
- [ ] With 4 parse workers and 4 concurrent tag jobs, zero `SQLITE_BUSY` errors appear in logs
- [ ] The write worker is present in the thread list during active indexing (`worker_threads` inspector shows `write-worker-entry`)
- [ ] A `WriteError` from the write worker marks the originating job as `failed` with the error message propagated to the SSE stream
- [ ] `GET /api/v1/workers` returns a `WorkersResponse` JSON object with correct `active`, `idle`, and `workers[]` fields while jobs are in-flight
- [ ] The `worker-status` SSE event is emitted by `/api/v1/jobs/stream` whenever a worker transitions state
- [ ] The admin jobs page shows skeleton rows (not a blank screen) during the initial `fetchJobs()` call
- [ ] No `alert()` or `confirm()` calls exist in `admin/jobs/+page.svelte` after this change; all notifications go through `Toast.svelte`
- [ ] Pausing job A while job B is also in progress does not disable job B's action buttons
- [ ] The status filter multi-select correctly restricts the visible job list; the URL updates to reflect the filter state
- [ ] The repository prefix filter `?repositoryId=/facebook` returns all jobs whose `repositoryId` starts with `/facebook`
- [ ] Paginating past page 1 fetches the next batch from the API, not from the client-side array
- [ ] `IndexingProgress.svelte` has no `setInterval` call; it uses `EventSource` for progress updates
- [ ] The `WorkerStatusPanel` shows the correct number of running workers live during a multi-tag indexing run
- [ ] Refreshing the jobs page with `?repo=/facebook/react&status=running` pre-populates the filters and fetches with those params

---
|
||||
|
||||
## Migration Safety
|
||||
|
||||
### Backward Compatibility
|
||||
|
||||
The `embedding` blob column is kept. The `vec_embedding` column is nullable during the backfill window and is populated in two ways:

1. The `UPDATE` in `vectors.sql` fills all existing rows on startup
2. New embeddings populate it at insert time
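Both paths rely on the same byte layout: a raw float32 blob serves the legacy `embedding` column and the vec0 column alike. A minimal serialization sketch (the helper names are illustrative, not taken from `embedding.service.ts`):

```typescript
// Serialize an embedding to the raw float32 blob layout shared by the
// legacy `embedding` column and sqlite-vec's vec0 vector columns.
function embeddingToBlob(values: number[]): Uint8Array {
  return new Uint8Array(new Float32Array(values).buffer);
}

// Decode a stored blob back into numbers, e.g. to validate a backfilled
// row. The byte length must be a multiple of 4 (one float32 per value).
function blobToEmbedding(blob: Uint8Array): number[] {
  const floats = new Float32Array(
    blob.buffer,
    blob.byteOffset,
    blob.byteLength / 4
  );
  return Array.from(floats);
}
```

Because the layouts match, the startup `UPDATE` can copy the legacy blob verbatim; no per-row re-encoding is required.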

If `vec_embedding IS NULL` for a row (e.g., a row inserted before the migration runs), the vector index silently omits that row from results. The fallback in `HybridSearchService` to FTS-only mode still applies when no embeddings exist, so degraded-but-correct behavior is preserved.

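The degraded-mode decision can be reduced to one guard. A hypothetical sketch (the mode names and function are illustrative; the actual `HybridSearchService` logic is not reproduced here):

```typescript
type SearchMode = 'hybrid' | 'fts-only';

// Fall back to FTS-only search when no row has a populated
// vec_embedding, instead of returning empty vector results.
function chooseSearchMode(populatedEmbeddingCount: number): SearchMode {
  return populatedEmbeddingCount > 0 ? 'hybrid' : 'fts-only';
}
```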
### Rollback

Rollback before Phase 4 (vector column): remove `@libsql/better-sqlite3`, restore `better-sqlite3`, and restore the original imports. No schema changes have been made at that point.

Rollback after Phase 4: the schema now has the `vec_embedding` column. Drop the column with a migration reversal and restore the imports. The `embedding` blob is intact throughout, so there is no data loss.

### SQLite File Compatibility

libSQL embedded mode reads and writes standard SQLite 3 files: the WAL file, page size, and encoding are unchanged. An existing production database opened with `@libsql/better-sqlite3` is fully readable and writable. The vector index is stored in a shadow table, `idx_snippet_embeddings_vec_shadow`, which better-sqlite3 would simply ignore if the change is rolled back (it is a regular table with a special name).

---

## Dependencies

| Package                  | Action                        | Reason                                             |
| ------------------------ | ----------------------------- | -------------------------------------------------- |
| `better-sqlite3`         | Remove from `dependencies`    | Replaced by `@libsql/better-sqlite3`               |
| `@types/better-sqlite3`  | Remove from `devDependencies` | `@libsql/better-sqlite3` ships its own types       |
| `@libsql/better-sqlite3` | Add to `dependencies`         | Drop-in libSQL node addon                          |
| `drizzle-orm`            | No change                     | `better-sqlite3` adapter works unchanged           |
| `drizzle-kit`            | No change                     | `dialect: 'sqlite'` is correct for embedded libSQL |

No new runtime dependencies beyond the package replacement.

---

## Testing Strategy

### Unit Tests

- `src/lib/server/search/vector.search.ts`: add a test asserting KNN results are correct for a seeded 3-vector table; verify memory use is not proportional to table size (mock `db.prepare` to assert no unbounded `.all()` is called)
- `src/lib/server/embeddings/embedding.service.ts`: existing tests cover insert round-trips; verify the `vec_embedding` column is non-NULL after `embedSnippets()`

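The seeded 3-vector KNN assertion can be illustrated with a brute-force stand-in. The real test would exercise `vectorSearch()` against the vec0 index; the cosine-similarity loop below only demonstrates the expected ordering (all names here are illustrative):

```typescript
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Brute-force KNN reference: rank rows by similarity to the query and
// return the top-k ids. The unit test compares vectorSearch() output
// against an ordering like this one.
function knn(
  query: number[],
  rows: { id: string; vec: number[] }[],
  k: number
): string[] {
  return rows
    .map((r) => ({ id: r.id, score: cosineSimilarity(query, r.vec) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((r) => r.id);
}
```

For seeds `a = [1, 0]`, `b = [0, 1]`, `c = [0.9, 0.1]` and query `[1, 0]`, the two nearest neighbors are `a` then `c`, which is the kind of fixed expectation the test can assert on.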
### Integration Tests

- `api-contract.integration.test.ts`: existing tests already use `new Database(':memory:')` and continue to work with `@libsql/better-sqlite3` because the in-memory path is identical
- Add one test to `api-contract.integration.test.ts`: seed a repository plus multiple embeddings, call `/api/v1/context` in semantic mode, and assert non-empty results and a response time under 500 ms on the in-memory DB

### UI Tests

- `src/routes/admin/jobs/+page.svelte`: add Vitest browser tests (Playwright) verifying:
  - Skeleton rows appear before the first fetch resolves (mock `fetch` to delay 200 ms)
  - The status filter restricts displayed rows and the URL param updates
  - Pausing job A leaves job B's buttons enabled
  - A toast appears and auto-dismisses on successful pause
  - The cancel flow shows an inline confirmation, not `window.confirm`
- `src/lib/components/IndexingProgress.svelte`: unit test that no `setInterval` is created; verify `EventSource` is opened with the correct URL

### Performance Regression Gate

Add a benchmark script `scripts/bench-vector-search.mjs` that:

1. Creates an in-memory libSQL database
2. Seeds 10,000 snippet embeddings (random `Float32Array`, 1536 dims)
3. Runs 100 `vectorSearch()` calls
4. Asserts p99 latency < 50 ms

This benchmark gates CI on Phase 4 correctness and speed.
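The p99 assertion in step 4 can use the nearest-rank percentile: with 100 samples, p99 is the 99th value after sorting ascending. A sketch of the gate (function names are illustrative; the real script's structure may differ):

```typescript
// Nearest-rank percentile: sort ascending, take index ceil(p/100 * n) - 1.
function percentile(latenciesMs: number[], p: number): number {
  const sorted = [...latenciesMs].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length) - 1;
  return sorted[Math.max(0, rank)];
}

// Throw (failing CI) when the p99 latency meets or exceeds the budget.
function assertP99Under(latenciesMs: number[], budgetMs: number): void {
  const p99 = percentile(latenciesMs, 99);
  if (p99 >= budgetMs) {
    throw new Error(`p99 ${p99}ms exceeds ${budgetMs}ms budget`);
  }
}
```

Using a throw rather than a boolean return lets the script's exit code fail the CI job directly.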