# TRUEREF-0023 — libSQL Migration, Native Vector Search, Parallel Tag Indexing, and Performance Hardening
**Priority:** P1
**Status:** Draft
**Depends On:** TRUEREF-0001, TRUEREF-0022
**Blocks:**
---
## Overview
TrueRef currently uses `better-sqlite3` for all database access. This creates three compounding performance problems:
1. **Vector search does not scale.** `VectorSearch.vectorSearch()` loads the entire `snippet_embeddings` table for a repository into Node.js memory and computes cosine similarity in a JavaScript loop. A repository with 100k snippets at 1536 OpenAI dimensions allocates ~600 MB per query and ties up the worker thread for seconds before returning results.
2. **Missing composite indexes cause table scans on every query.** The schema defines FK columns used in every search and embedding filter, but declares zero composite or covering indexes on them. Every call to `searchSnippets`, `findSnippetIdsMissingEmbeddings`, and `cloneFromAncestor` performs full or near-full table scans.
3. **SQLite connection is under-configured.** Critical pragmas (`synchronous`, `cache_size`, `mmap_size`, `temp_store`) are absent, leaving significant I/O throughput on the table.
The solution is to replace `better-sqlite3` with `@libsql/better-sqlite3` — an embeddable, drop-in synchronous replacement that is a superset of the better-sqlite3 API and exposes libSQL's native vector index (`libsql_vector_idx`). Because the API is identical, no service layer or ORM code changes are needed beyond import statements and the vector search implementation.
Three additional structural improvements are delivered in the same feature:
4. **Per-repo job serialization is too coarse.** `WorkerPool` prevents any two jobs sharing the same `repositoryId` from running in parallel. This means indexing 200 tags of a single library is fully sequential — one tag at a time — even though different tags write to entirely disjoint row sets. The constraint should track `(repositoryId, versionId)` pairs instead.
5. **Write lock contention under parallel indexing.** When multiple parse workers flush parsed snippets simultaneously they all compete for the SQLite write lock, spending most of their time in `busy_timeout` back-off. A single dedicated write worker eliminates this: parse workers become pure CPU workers (crawl → parse → send batches over `postMessage`) and the write worker is the sole DB writer.
6. **Admin UI is unusable under load.** The job queue page has no status or repository filters, no worker status panel, no skeleton loading, uses blocking `alert()` / `confirm()` dialogs, and `IndexingProgress` still polls every 2 seconds instead of consuming the existing SSE stream.
---
## Goals
1. Replace `better-sqlite3` with `@libsql/better-sqlite3` with minimal code churn — import paths only.
2. Add a libSQL vector index on `snippet_embeddings` so that KNN queries execute inside SQLite instead of in a JavaScript loop.
3. Add the six composite and covering indexes required by the hot query paths.
4. Tune the SQLite pragma configuration for I/O performance.
5. Eliminate the leading cause of OOM risk during semantic search.
6. Keep a single embedded database file — no external server, no network.
7. Allow multiple tags of the same repository to index in parallel (unrelated version rows, no write conflict).
8. Eliminate write-lock contention between parallel parse workers by introducing a single dedicated write worker.
9. Rebuild the admin jobs page with full filtering (status, repository, free-text), a live worker status panel, skeleton loading on initial fetch, per-action inline spinners, non-blocking toast notifications, and SSE-driven real-time updates throughout.
---
## Non-Goals
- Migrating to the async `@libsql/client` package (HTTP/embedded-replica mode).
- Changing the Drizzle ORM adapter (`drizzle-orm/better-sqlite3` stays unchanged).
- Changing `drizzle.config.ts` dialect (`sqlite` is still correct for embedded libSQL).
- Adding hybrid/approximate indexing beyond the default HNSW strategy provided by `libsql_vector_idx`.
- Parallelizing embedding batches across providers (separate feature).
- Horizontally scaling across processes.
- Allowing more than one job for the exact same `(repositoryId, versionId)` pair to run concurrently (still serialized — duplicate detection in `JobQueue` is unchanged).
- A full admin authentication system (out of scope).
- Mobile-responsive redesign of the entire admin section (out of scope).
---
## Problem Detail
### 1. Vector Search — Full Table Scan in JavaScript
**File:** `src/lib/server/search/vector.search.ts`
```typescript
// Current: no LIMIT, loads ALL embeddings for repo into memory
const rows = this.db.prepare<unknown[], RawEmbeddingRow>(sql).all(...params);
const scored: VectorSearchResult[] = rows.map((row) => {
  const embedding = new Float32Array(
    row.embedding.buffer,
    row.embedding.byteOffset,
    row.embedding.byteLength / 4
  );
  return { snippetId: row.snippet_id, score: cosineSimilarity(queryEmbedding, embedding) };
});
return scored.sort((a, b) => b.score - a.score).slice(0, limit);
```
For a repo with N snippets and D dimensions, this allocates `N × D × 4` bytes per query. At N=100k and D=1536, that is ~600 MB allocated synchronously. The result is sorted entirely in JS before the top-k is returned. With a native vector index, SQLite returns only the top-k rows.
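As a sanity check, the allocation figure follows directly from `N × D × 4` (a throwaway helper for illustration, not part of the codebase):

```typescript
// Estimate bytes allocated per vector-search query when every embedding
// for a repository is materialized as a Float32Array (4 bytes per element).
function perQueryAllocationBytes(snippetCount: number, dimensions: number): number {
  return snippetCount * dimensions * 4;
}

// 100k snippets at 1536 dimensions: 614,400,000 bytes ≈ 586 MiB,
// matching the ~600 MB figure above.
const bytes = perQueryAllocationBytes(100_000, 1536);
const mib = bytes / (1024 * 1024);
```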
### 2. Missing Composite Indexes
The `snippets`, `documents`, and `snippet_embeddings` tables are queried with multi-column WHERE predicates in every hot path, but no composite indexes exist:
| Table | Filter columns | Used in |
| -------------------- | ----------------------------- | ---------------------------------------------- |
| `snippets` | `(repository_id, version_id)` | All search, diff, clone |
| `snippets` | `(repository_id, type)` | Type-filtered queries |
| `documents` | `(repository_id, version_id)` | Diff strategy, clone |
| `snippet_embeddings` | `(profile_id, snippet_id)` | `findSnippetIdsMissingEmbeddings` LEFT JOIN |
| `repositories` | `(state)` | `searchRepositories` WHERE `state = 'indexed'` |
| `indexing_jobs` | `(repository_id, status)` | Job status lookups |
Without these indexes, SQLite performs a B-tree scan of the primary key and filters rows in memory. On a 500k-row `snippets` table this is the dominant cost of every search.
### 3. Under-configured SQLite Connection
**Files:** `src/lib/server/db/client.ts` and `src/lib/server/db/index.ts`
Current pragmas:
```typescript
client.pragma('journal_mode = WAL');
client.pragma('foreign_keys = ON');
client.pragma('busy_timeout = 5000');
```
Missing:
- `synchronous = NORMAL` — halves fsync overhead vs the default FULL; safe with WAL
- `cache_size = -65536` — 64 MB page cache; default is 2 MB
- `temp_store = MEMORY` — temp tables and sort spills stay in RAM
- `mmap_size = 268435456` — 256 MB memory-mapped read path; bypasses system call overhead for reads
- `wal_autocheckpoint = 1000` — more frequent checkpoints prevent WAL growth
---
### 4. Admin UI — Current Problems
**Files:** `src/routes/admin/jobs/+page.svelte`, `src/lib/components/IndexingProgress.svelte`
| Problem | Location | Impact |
| --- | --- | --- |
| `IndexingProgress` polls every 2 s via `setInterval` + `fetch` | `IndexingProgress.svelte` | Constant HTTP traffic; progress lags by up to 2 s |
| No status or repository filter controls | `admin/jobs/+page.svelte` | With 200 tag jobs, finding a specific one requires scrolling |
| No worker status panel | — (no endpoint exists) | Operator cannot see which workers are busy or idle |
| `alert()` for errors, `confirm()` for cancel | `admin/jobs/+page.svelte` (`showToast()`) | Blocks the entire browser tab; unusable under parallel jobs |
| `actionInProgress` is a single string, not per-job | `admin/jobs/+page.svelte` | Pausing job A disables buttons on all other jobs |
| No skeleton loading — blank + spinner on first load | `admin/jobs/+page.svelte` | Layout shift; no structural preview while data loads |
| Hard-coded `limit=50` query, no pagination | `admin/jobs/+page.svelte:fetchJobs()` | Page truncates silently for large queues |
---
## Architecture
### Drop-In Replacement: `@libsql/better-sqlite3`
`@libsql/better-sqlite3` is published by Turso and implemented as a Node.js native addon wrapping the libSQL embedded engine. The exported class is API-compatible with `better-sqlite3`:
```typescript
// before
import Database from 'better-sqlite3';
const db = new Database('/path/to/file.db');
db.pragma('journal_mode = WAL');
const rows = db.prepare('SELECT ...').all(...params);

// after — identical code
import Database from '@libsql/better-sqlite3';
const db = new Database('/path/to/file.db');
db.pragma('journal_mode = WAL');
const rows = db.prepare('SELECT ...').all(...params);
```
All of the following continue to work unchanged:
- `drizzle-orm/better-sqlite3` adapter and `migrate` helper
- `drizzle-kit` with `dialect: 'sqlite'`
- Prepared statements, transactions, WAL pragmas, foreign keys
- Worker thread per-thread connections (`worker-entry.ts`, `embed-worker-entry.ts`)
- All `type Database from 'better-sqlite3'` type imports (replaced in lock-step)
### Vector Index Design
libSQL provides `libsql_vector_idx()` — a virtual index type stored in a shadow table alongside the main table. Once indexed, KNN queries use a SQL `vector_top_k()` function:
```sql
-- KNN: return top-k snippet IDs closest to the query vector
SELECT snippet_id
FROM vector_top_k('idx_snippet_embeddings_vec', vector_from_float32(?), ?)
```
`vector_from_float32(blob)` accepts the same raw little-endian Float32 bytes currently stored in the `embedding` blob column. **No data migration is needed** — the existing blob column can be re-indexed with `libsql_vector_idx` pointing at the bytes-stored column.
The index strategy:
1. Add a generated `vec_embedding` column of type `F32_BLOB(dimensions)` to `snippet_embeddings`, populated from the existing `embedding` blob via a migration trigger.
2. Create the vector index: `CREATE INDEX idx_snippet_embeddings_vec ON snippet_embeddings(libsql_vector_idx(vec_embedding))` — libSQL expresses the vector index as an index over the `libsql_vector_idx()` expression.
3. Rewrite `VectorSearch.vectorSearch()` to use `vector_top_k()` with a two-step join instead of the in-memory loop.
4. Update `EmbeddingService.embedSnippets()` to write `vec_embedding` on insert.
Dimensions are profile-specific. Because the index is per-column, a separate index is needed per embedding dimensionality. For v1, a single index covering the default profile's dimensions is sufficient; multi-profile KNN can be handled with a `WHERE profile_id = ?` filter applied to the `vector_top_k` results.
### Updated Vector Search Query
```typescript
vectorSearch(queryEmbedding: Float32Array, options: VectorSearchOptions): VectorSearchResult[] {
  const { repositoryId, versionId, profileId = 'local-default', limit = 50 } = options;

  // Encode query vector as raw bytes (same format as stored blobs)
  const queryBytes = Buffer.from(queryEmbedding.buffer);

  // Use libSQL vector_top_k for ANN — returns ordered (rowid, distance) pairs
  let sql = `
    SELECT se.snippet_id,
           vector_distance_cos(se.vec_embedding, vector_from_float32(?)) AS distance
    FROM vector_top_k('idx_snippet_embeddings_vec', vector_from_float32(?), ?) AS knn
    JOIN snippet_embeddings se ON se.rowid = knn.id
    JOIN snippets s ON s.id = se.snippet_id
    WHERE s.repository_id = ?
      AND se.profile_id = ?
  `;
  const params: unknown[] = [queryBytes, queryBytes, limit * 4, repositoryId, profileId];
  if (versionId) {
    sql += ' AND s.version_id = ?';
    params.push(versionId);
  }
  sql += ' ORDER BY distance ASC LIMIT ?';
  params.push(limit);

  return this.db
    .prepare<unknown[], { snippet_id: string; distance: number }>(sql)
    .all(...params)
    .map((row) => ({ snippetId: row.snippet_id, score: 1 - row.distance }));
}
```
`vector_distance_cos` returns distance (0 = identical), so `1 - distance` gives a similarity score in [0, 1] matching the existing `VectorSearchResult.score` contract.
---
## Implementation Plan
### Phase 1 — Package Swap (no logic changes)
**Files touched:** `package.json`, all `.ts` files that import `better-sqlite3`
1. In `package.json`:
- Remove `"better-sqlite3": "^12.6.2"` from `dependencies`
- Add `"@libsql/better-sqlite3": "^0.4.0"` to `dependencies`
   - Remove `"@types/better-sqlite3": "^7.6.13"` from `devDependencies` (`@libsql/better-sqlite3` ships its own TypeScript declarations)
2. Replace all import statements across the 29 files listed below:
| Old import | New import |
| --------------------------------------------------------------- | ---------------------------------------------------- |
| `import Database from 'better-sqlite3'` | `import Database from '@libsql/better-sqlite3'` |
| `import type Database from 'better-sqlite3'` | `import type Database from '@libsql/better-sqlite3'` |
| `import { drizzle } from 'drizzle-orm/better-sqlite3'` | unchanged |
| `import { migrate } from 'drizzle-orm/better-sqlite3/migrator'` | unchanged |
Affected production files:
- `src/lib/server/db/index.ts`
- `src/lib/server/db/client.ts`
- `src/lib/server/embeddings/embedding.service.ts`
- `src/lib/server/pipeline/indexing.pipeline.ts`
- `src/lib/server/pipeline/job-queue.ts`
- `src/lib/server/pipeline/startup.ts`
- `src/lib/server/pipeline/worker-entry.ts`
- `src/lib/server/pipeline/embed-worker-entry.ts`
- `src/lib/server/pipeline/differential-strategy.ts`
- `src/lib/server/search/vector.search.ts`
- `src/lib/server/search/hybrid.search.service.ts`
- `src/lib/server/search/search.service.ts`
- `src/lib/server/services/repository.service.ts`
- `src/lib/server/services/version.service.ts`
- `src/lib/server/services/embedding-settings.service.ts`
Affected test files (same mechanical replacement):
- `src/routes/api/v1/api-contract.integration.test.ts`
- `src/routes/api/v1/sse-and-settings.integration.test.ts`
- `src/routes/settings/page.server.test.ts`
- `src/lib/server/db/schema.test.ts`
- `src/lib/server/embeddings/embedding.service.test.ts`
- `src/lib/server/pipeline/indexing.pipeline.test.ts`
- `src/lib/server/pipeline/differential-strategy.test.ts`
- `src/lib/server/search/search.service.test.ts`
- `src/lib/server/search/hybrid.search.service.test.ts`
- `src/lib/server/services/repository.service.test.ts`
- `src/lib/server/services/version.service.test.ts`
- `src/routes/api/v1/settings/embedding/server.test.ts`
- `src/routes/api/v1/libs/[id]/index/server.test.ts`
- `src/routes/api/v1/libs/[id]/versions/discover/server.test.ts`
3. Run all tests — they should pass with zero logic changes: `npm test`
### Phase 2 — Pragma Hardening
**Files touched:** `src/lib/server/db/client.ts`, `src/lib/server/db/index.ts`
Add the following pragmas to both connection factories (raw client and `initializeDatabase()`):
```typescript
client.pragma('synchronous = NORMAL');
client.pragma('cache_size = -65536'); // 64 MB
client.pragma('temp_store = MEMORY');
client.pragma('mmap_size = 268435456'); // 256 MB
client.pragma('wal_autocheckpoint = 1000');
```
Worker threads (`worker-entry.ts`, `embed-worker-entry.ts`) open their own connections — apply the same pragmas there.
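To keep the three connection sites (main thread plus both worker entries) in sync, the pragma set could live in one shared helper. A sketch, with a hypothetical helper name and a structural type so it accepts any better-sqlite3-compatible connection:

```typescript
// Minimal structural type: anything with a better-sqlite3-style pragma() method.
interface PragmaConn {
  pragma(statement: string): unknown;
}

// The full pragma set from Phase 2 plus the existing three pragmas.
const CONNECTION_PRAGMAS = [
  'journal_mode = WAL',
  'foreign_keys = ON',
  'busy_timeout = 5000',
  'synchronous = NORMAL',
  'cache_size = -65536', // 64 MB page cache
  'temp_store = MEMORY',
  'mmap_size = 268435456', // 256 MB mmap window
  'wal_autocheckpoint = 1000'
] as const;

// Apply the shared pragma set to a freshly opened connection.
function applyConnectionPragmas(db: PragmaConn): void {
  for (const p of CONNECTION_PRAGMAS) db.pragma(p);
}
```

Each connection factory would then call `applyConnectionPragmas(client)` immediately after `new Database(...)`, so a future pragma change happens in exactly one place.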
### Phase 3 — Composite Indexes (Drizzle migration)
**Files touched:** `src/lib/server/db/schema.ts`, new migration SQL file
Add indexes in `schema.ts` using Drizzle's `index()` helper:
```typescript
// snippets table
export const snippets = sqliteTable(
  'snippets',
  {
    /* unchanged */
  },
  (t) => [
    index('idx_snippets_repo_version').on(t.repositoryId, t.versionId),
    index('idx_snippets_repo_type').on(t.repositoryId, t.type)
  ]
);

// documents table
export const documents = sqliteTable(
  'documents',
  {
    /* unchanged */
  },
  (t) => [index('idx_documents_repo_version').on(t.repositoryId, t.versionId)]
);

// snippet_embeddings table
export const snippetEmbeddings = sqliteTable(
  'snippet_embeddings',
  {
    /* unchanged */
  },
  (table) => [
    primaryKey({ columns: [table.snippetId, table.profileId] }), // unchanged
    index('idx_embeddings_profile').on(table.profileId, table.snippetId)
  ]
);

// repositories table
export const repositories = sqliteTable(
  'repositories',
  {
    /* unchanged */
  },
  (t) => [index('idx_repositories_state').on(t.state)]
);

// indexing_jobs table
export const indexingJobs = sqliteTable(
  'indexing_jobs',
  {
    /* unchanged */
  },
  (t) => [index('idx_jobs_repo_status').on(t.repositoryId, t.status)]
);
```
Generate and apply migration: `npm run db:generate && npm run db:migrate`
### Phase 4 — Vector Column and Index (Drizzle migration)
**Files touched:** `src/lib/server/db/schema.ts`, new migration SQL, `src/lib/server/search/vector.search.ts`, `src/lib/server/embeddings/embedding.service.ts`
#### 4a. Schema: add `vec_embedding` column
Add `vec_embedding` to `snippet_embeddings`. Drizzle does not have a `F32_BLOB` column type helper; use a raw SQL column:
```typescript
import { customType } from 'drizzle-orm/sqlite-core';

const f32Blob = (name: string, dimensions: number) =>
  customType<{ data: Buffer }>({
    dataType() {
      return `F32_BLOB(${dimensions})`;
    }
  })(name);

export const snippetEmbeddings = sqliteTable(
  'snippet_embeddings',
  {
    snippetId: text('snippet_id')
      .notNull()
      .references(() => snippets.id, { onDelete: 'cascade' }),
    profileId: text('profile_id')
      .notNull()
      .references(() => embeddingProfiles.id, { onDelete: 'cascade' }),
    model: text('model').notNull(),
    dimensions: integer('dimensions').notNull(),
    embedding: blob('embedding').notNull(), // existing blob — kept for backward compat
    vecEmbedding: f32Blob('vec_embedding', 1536), // libSQL vector column (nullable during migration fill)
    createdAt: integer('created_at').notNull()
  },
  (table) => [
    primaryKey({ columns: [table.snippetId, table.profileId] }),
    index('idx_embeddings_profile').on(table.profileId, table.snippetId)
  ]
);
```
Because dimensionality is fixed per model, `F32_BLOB(1536)` covers OpenAI `text-embedding-3-small` (1536 dimensions); `text-embedding-3-large` defaults to 3072 dimensions and would need its own column and index. A follow-up can parameterize this per profile.
#### 4b. Migration SQL: populate `vec_embedding` from existing `embedding` blob and create the vector index
The vector index cannot be expressed in Drizzle's portable schema DDL — it must be applied in the FTS-style custom SQL file (`src/lib/server/db/fts.sql` or an equivalent `vectors.sql`):
```sql
-- Backfill vec_embedding from existing raw blob data
UPDATE snippet_embeddings
SET vec_embedding = vector_from_float32(embedding)
WHERE vec_embedding IS NULL AND embedding IS NOT NULL;
-- Create the HNSW vector index (libSQL expression-index syntax)
CREATE INDEX IF NOT EXISTS idx_snippet_embeddings_vec
ON snippet_embeddings(libsql_vector_idx(vec_embedding, 'metric=cosine', 'compress_neighbors=float8', 'max_neighbors=20'));
```
Add a call to this SQL in `initializeDatabase()` alongside the existing `fts.sql` execution:
```typescript
const vectorSql = readFileSync(join(__dirname, 'vectors.sql'), 'utf-8');
client.exec(vectorSql);
```
#### 4c. Update `EmbeddingService.embedSnippets()`
When inserting a new embedding, write both the blob and the vec column:
```typescript
const insert = this.db.prepare<[string, string, string, number, Buffer, Buffer]>(`
  INSERT OR REPLACE INTO snippet_embeddings
    (snippet_id, profile_id, model, dimensions, embedding, vec_embedding, created_at)
  VALUES (?, ?, ?, ?, ?, vector_from_float32(?), unixepoch())
`);

// inside the transaction:
insert.run(
  snippet.id,
  this.profileId,
  embedding.model,
  embedding.dimensions,
  embeddingBuffer,
  embeddingBuffer // same bytes — vector_from_float32() interprets them
);
```
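The `embeddingBuffer` passed twice is the same raw little-endian float32 bytes. A sketch of the encode/decode round trip (helper names are hypothetical; note that `Float32Array` requires the underlying buffer offset to be 4-byte aligned, which holds when the buffer wraps a `Float32Array` directly):

```typescript
// Encode a Float32Array as the raw bytes stored in the `embedding` blob
// column. Buffer.from(arrayBuffer, offset, length) shares memory; no copy.
function toEmbeddingBuffer(vec: Float32Array): Buffer {
  return Buffer.from(vec.buffer, vec.byteOffset, vec.byteLength);
}

// Decode the blob back into a Float32Array view over the same bytes.
function fromEmbeddingBuffer(buf: Buffer): Float32Array {
  return new Float32Array(buf.buffer, buf.byteOffset, buf.byteLength / 4);
}
```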
#### 4d. Rewrite `VectorSearch.vectorSearch()`
Replace the full-scan JS loop with `vector_top_k()`:
```typescript
vectorSearch(queryEmbedding: Float32Array, options: VectorSearchOptions): VectorSearchResult[] {
  const { repositoryId, versionId, profileId = 'local-default', limit = 50 } = options;
  const queryBytes = Buffer.from(queryEmbedding.buffer);
  const candidatePool = limit * 4; // over-fetch for post-filter

  let sql = `
    SELECT se.snippet_id,
           vector_distance_cos(se.vec_embedding, vector_from_float32(?)) AS distance
    FROM vector_top_k('idx_snippet_embeddings_vec', vector_from_float32(?), ?) AS knn
    JOIN snippet_embeddings se ON se.rowid = knn.id
    JOIN snippets s ON s.id = se.snippet_id
    WHERE s.repository_id = ?
      AND se.profile_id = ?
  `;
  const params: unknown[] = [queryBytes, queryBytes, candidatePool, repositoryId, profileId];
  if (versionId) {
    sql += ' AND s.version_id = ?';
    params.push(versionId);
  }
  sql += ' ORDER BY distance ASC LIMIT ?';
  params.push(limit);

  return this.db
    .prepare<unknown[], { snippet_id: string; distance: number }>(sql)
    .all(...params)
    .map((row) => ({ snippetId: row.snippet_id, score: 1 - row.distance }));
}
```
The `score` contract is preserved (1 = identical, 0 = orthogonal). The `cosineSimilarity` helper function is no longer called at runtime but can be kept for unit tests.
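A reference implementation of the kept helper might look like the following sketch (the real signature may differ):

```typescript
// Reference cosine similarity between two equal-length vectors.
// Returns a value in [-1, 1]; 1 means identical direction, 0 orthogonal.
// Useful in tests to cross-check `1 - vector_distance_cos(...)` results.
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```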
### Phase 5 — Per-Job Serialization Key Fix
**Files touched:** `src/lib/server/pipeline/worker-pool.ts`
The current serialization guard uses a bare `repositoryId`:
```typescript
// current
private runningRepoIds = new Set<string>();
// blocks any job whose repositoryId is already in the set
const jobIdx = this.jobQueue.findIndex((j) => !this.runningRepoIds.has(j.repositoryId));
```
Different tags of the same repository write to completely disjoint rows (`version_id`-partitioned documents, snippets, and embeddings). The only genuine conflict is two jobs for the same `(repositoryId, versionId)` pair, which `JobQueue.enqueue()` already prevents via the `status IN ('queued', 'running')` deduplication check.
Change the guard to key on the compound pair:
```typescript
// still a Set<string>, but now keyed on the compound (repositoryId, versionId) pair
private runningJobKeys = new Set<string>();

private jobKey(repositoryId: string, versionId?: string | null): string {
  return `${repositoryId}|${versionId ?? ''}`;
}
```
Update all four sites that read/write `runningRepoIds`:
| Location | Old | New |
| ------------------------------------ | ----------------------------------------------------- | ---------------------------------------------------------------------------------------- |
| `dispatch()` find | `!this.runningRepoIds.has(j.repositoryId)` | `!this.runningJobKeys.has(this.jobKey(j.repositoryId, j.versionId))` |
| `dispatch()` add | `this.runningRepoIds.add(job.repositoryId)` | `this.runningJobKeys.add(this.jobKey(job.repositoryId, job.versionId))` |
| `onWorkerMessage` done/failed delete | `this.runningRepoIds.delete(runningJob.repositoryId)` | `this.runningJobKeys.delete(this.jobKey(runningJob.repositoryId, runningJob.versionId))` |
| `onWorkerExit` delete | same | same |
The `QueuedJob` and `RunningJob` interfaces already carry `versionId` — no type changes needed.
The only case that remains serialized is `versionId = null` (default-branch re-index): two such jobs for the same repository map to the same stable key `"repositoryId|"`, so they correctly cannot run concurrently.
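The compound key's behavior, sketched against a plain `Set` (standalone illustration, not the `WorkerPool` code itself):

```typescript
// Compound serialization key: jobs conflict only when BOTH repository
// and version match. A null versionId maps to the stable key "repo|".
function jobKey(repositoryId: string, versionId?: string | null): string {
  return `${repositoryId}|${versionId ?? ''}`;
}

const running = new Set<string>();
running.add(jobKey('/facebook/react', 'v18.3.0'));

// A different tag of the same repository is NOT blocked…
const otherTagBlocked = running.has(jobKey('/facebook/react', 'v17.0.2'));
// …but a second job for the same (repo, version) pair is.
const samePairBlocked = running.has(jobKey('/facebook/react', 'v18.3.0'));
```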
---
### Phase 6 — Dedicated Write Worker (Single-Writer Pattern)
**Files touched:** `src/lib/server/pipeline/worker-types.ts`, `src/lib/server/pipeline/write-worker-entry.ts` (new), `src/lib/server/pipeline/worker-entry.ts`, `src/lib/server/pipeline/worker-pool.ts`
#### Motivation
With Phase 5 in place, N tags of the same library can index in parallel. Each parse worker currently opens its own DB connection and holds the write lock while storing parsed snippets. Under N concurrent writers, each worker spends the majority of its wall-clock time waiting in `busy_timeout` back-off. The fix is the single-writer pattern: one dedicated write worker owns the only writable DB connection; parse workers become stateless CPU workers that send write batches over `postMessage`.
```
Parse Worker 1 ──┐ WriteRequest (docs[], snippets[]) ┌── WriteAck
Parse Worker 2 ──┼─────────────────────────────────────► Write Worker (sole DB writer)
Parse Worker N ──┘ └── single better-sqlite3 connection
```
#### New message types (`worker-types.ts`)
```typescript
export interface WriteRequest {
  type: 'write';
  jobId: string;
  documents: SerializedDocument[];
  snippets: SerializedSnippet[];
}

export interface WriteAck {
  type: 'write_ack';
  jobId: string;
  documentCount: number;
  snippetCount: number;
}

export interface WriteError {
  type: 'write_error';
  jobId: string;
  error: string;
}

// SerializedDocument / SerializedSnippet mirror the DB column shapes
// (plain objects, safe to transfer via structured clone)
```
#### Write worker (`write-worker-entry.ts`)
The write worker:
- Opens its own `Database` connection (WAL mode, all pragmas from Phase 2)
- Listens for `WriteRequest` messages
- Wraps each batch in a single transaction
- Posts `WriteAck` or `WriteError` back to the parent, which forwards the ack to the originating parse worker by `jobId`
```typescript
import Database from '@libsql/better-sqlite3';
import { workerData, parentPort } from 'node:worker_threads';
// WorkerInitData is the existing worker init payload carrying dbPath
import type { WriteRequest, WriteAck, WriteError, WorkerInitData } from './worker-types.js';

const db = new Database((workerData as WorkerInitData).dbPath);
db.pragma('journal_mode = WAL');
db.pragma('synchronous = NORMAL');
db.pragma('cache_size = -65536');
db.pragma('foreign_keys = ON');

const insertDoc = db.prepare(`INSERT OR REPLACE INTO documents (...) VALUES (...)`);
const insertSnippet = db.prepare(`INSERT OR REPLACE INTO snippets (...) VALUES (...)`);

const writeBatch = db.transaction((req: WriteRequest) => {
  for (const doc of req.documents) insertDoc.run(doc);
  for (const snip of req.snippets) insertSnippet.run(snip);
});

parentPort!.on('message', (req: WriteRequest) => {
  try {
    writeBatch(req);
    const ack: WriteAck = {
      type: 'write_ack',
      jobId: req.jobId,
      documentCount: req.documents.length,
      snippetCount: req.snippets.length
    };
    parentPort!.postMessage(ack);
  } catch (err) {
    const fail: WriteError = { type: 'write_error', jobId: req.jobId, error: String(err) };
    parentPort!.postMessage(fail);
  }
});
```
#### Parse worker changes (`worker-entry.ts`)
Parse workers lose their DB connection. `IndexingPipeline` receives a `sendWrite` callback instead of a `db` instance. After parsing each file batch, the worker calls `sendWrite({ type: 'write', jobId, documents, snippets })` and awaits the `WriteAck` before continuing. This keeps back-pressure: a slow write worker naturally throttles the parse workers without additional semaphores.
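The await-the-ack flow implies a promise registry keyed by `jobId`. A sketch of what that routing could look like (class and method names here are illustrative, not the actual `WorkerPool` API):

```typescript
// Minimal ack shape for illustration; mirrors WriteAck / WriteError.
type Ack = { type: 'write_ack' | 'write_error'; jobId: string; error?: string };

// Routes write_ack / write_error messages back to the waiter that issued
// the matching WriteRequest. One pending write per jobId at a time.
class WriteAckRegistry {
  private pending = new Map<string, { resolve: (a: Ack) => void; reject: (e: Error) => void }>();

  // Called by sendWrite(): resolves when the write worker acks this job's batch.
  waitFor(jobId: string): Promise<Ack> {
    return new Promise((resolve, reject) => {
      this.pending.set(jobId, { resolve, reject });
    });
  }

  // Called when the write worker posts a message back to the pool.
  dispatch(msg: Ack): void {
    const waiter = this.pending.get(msg.jobId);
    if (!waiter) return; // no one waiting (e.g. job already cancelled)
    this.pending.delete(msg.jobId);
    if (msg.type === 'write_error') waiter.reject(new Error(msg.error));
    else waiter.resolve(msg);
  }
}
```

Because a parse worker awaits the ack before sending its next batch, the registry also carries the back-pressure described above with no extra semaphore.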
#### WorkerPool changes
- Spawn one write worker at startup (always, regardless of embedding config)
- Route incoming `write_ack` / `write_error` messages to the correct waiting parse worker via a `Map<jobId, resolve>` promise registry
- The write worker is separate from the embed worker — embed writes (`snippet_embeddings`) can still go through the write worker by adding an `EmbedWriteRequest` message type, or remain in the embed worker since embedding runs after parsing completes (no lock contention with active parse jobs)
#### Conflict analysis with Phase 5
Phases 5 and 6 compose cleanly:
- Phase 5 allows multiple `(repo, versionId)` jobs to run concurrently
- Phase 6 ensures all those concurrent jobs share a single write path — contention is eliminated by design
- The write worker is stateless with respect to job identity; it just executes batches in arrival order within a FIFO message queue (Node.js `postMessage` is ordered)
- The embed worker remains a separate worker thread (it runs after parse completes, so it never overlaps with active parse writes for the same job)
---
### Phase 7 — Admin UI Overhaul
**Files touched:**
- `src/routes/admin/jobs/+page.svelte` — rebuilt
- `src/routes/api/v1/workers/+server.ts` — new endpoint
- `src/lib/components/admin/JobStatusBadge.svelte` — extend with spinner variant
- `src/lib/components/admin/JobSkeleton.svelte` — new
- `src/lib/components/admin/WorkerStatusPanel.svelte` — new
- `src/lib/components/admin/Toast.svelte` — new
- `src/lib/components/IndexingProgress.svelte` — switch to SSE
#### 7a. New API endpoint: `GET /api/v1/workers`
The `WorkerPool` singleton tracks running jobs in `runningJobs: Map<Worker, RunningJob>` and idle workers in `idleWorkers: Worker[]`. Expose this state as a lightweight REST snapshot:
```typescript
// GET /api/v1/workers
// Response shape:
interface WorkersResponse {
  concurrency: number; // configured max workers
  active: number; // workers with a running job
  idle: number; // workers waiting for work
  workers: WorkerStatus[]; // one entry per spawned parse worker
}

interface WorkerStatus {
  index: number; // worker slot (0-based)
  state: 'idle' | 'running'; // current state
  jobId: string | null; // null when idle
  repositoryId: string | null;
  versionId: string | null;
}
```
The route handler calls `getPool().getStatus()` — add a `getStatus(): WorkersResponse` method to `WorkerPool` that reads `runningJobs` and `idleWorkers` without any DB call. This is read-only and runs on the main thread.
The SSE stream at `/api/v1/jobs/stream` should emit a new `worker-status` event type whenever a worker transitions idle ↔ running (on `dispatch()` and job completion). This allows the worker panel to update in real-time without polling the REST endpoint.
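A sketch of what `getStatus()` could compute, using a plain per-slot array in place of the real `runningJobs` / `idleWorkers` structures (shapes assumed; the actual method reads the pool's maps directly):

```typescript
interface RunningJob {
  jobId: string;
  repositoryId: string;
  versionId: string | null;
}

interface WorkerStatus {
  index: number;
  state: 'idle' | 'running';
  jobId: string | null;
  repositoryId: string | null;
  versionId: string | null;
}

// Build the WorkersResponse snapshot from in-memory state only — no DB call.
// `running` has one slot per spawned worker; null means the slot is idle.
function buildWorkersResponse(concurrency: number, running: (RunningJob | null)[]) {
  const workers: WorkerStatus[] = running.map((job, index): WorkerStatus =>
    job
      ? { index, state: 'running', jobId: job.jobId, repositoryId: job.repositoryId, versionId: job.versionId }
      : { index, state: 'idle', jobId: null, repositoryId: null, versionId: null }
  );
  const active = workers.filter((w) => w.state === 'running').length;
  return { concurrency, active, idle: workers.length - active, workers };
}
```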
#### 7b. `GET /api/v1/jobs` — add `repositoryId` free-text and multi-status filter
The existing endpoint already accepts `repositoryId` (exact match) and `status` (single value). Extend:
- `repositoryId` to also support prefix match (e.g. `?repositoryId=/facebook` returns all `/facebook/*` repos)
- `status` to accept comma-separated values: `?status=queued,running`
- `page` and `pageSize` query params (default pageSize=50, max 200) in addition to `limit` for backwards compat
Return `{ jobs, total, page, pageSize }` with `total` always reflecting the unfiltered-by-page count.
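The extended query parsing might look like the following sketch (helper name is hypothetical; defaults and the 200 cap come from the bullets above):

```typescript
// Known job statuses, used both for validation and the "all" default.
const JOB_STATUSES = ['queued', 'running', 'paused', 'cancelled', 'done', 'failed'] as const;
type JobStatus = (typeof JOB_STATUSES)[number];

// Parse the extended /api/v1/jobs query parameters:
// comma-separated multi-status, prefix repositoryId, page/pageSize with caps.
function parseJobsQuery(params: URLSearchParams) {
  const statusParam = params.get('status');
  const statuses = statusParam
    ? statusParam.split(',').filter((s): s is JobStatus => (JOB_STATUSES as readonly string[]).includes(s))
    : [...JOB_STATUSES]; // default: all statuses
  const page = Math.max(1, Number(params.get('page') ?? '1') || 1);
  const pageSize = Math.min(200, Math.max(1, Number(params.get('pageSize') ?? '50') || 50));
  return { repositoryId: params.get('repositoryId'), statuses, page, pageSize };
}
```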
#### 7c. New component: `JobSkeleton.svelte`
A set of skeleton rows matching the job table structure. Shown during the initial fetch before any data arrives. Uses Tailwind `animate-pulse`:
```svelte
<!-- renders N skeleton rows -->
<script lang="ts">
let { rows = 5 }: { rows?: number } = $props();
</script>
{#each Array(rows) as _, i (i)}
  <tr>
    <td class="px-6 py-4">
      <div class="h-4 w-48 animate-pulse rounded bg-gray-200"></div>
      <div class="mt-1 h-3 w-24 animate-pulse rounded bg-gray-100"></div>
    </td>
    <td class="px-6 py-4">
      <div class="h-5 w-16 animate-pulse rounded-full bg-gray-200"></div>
    </td>
    <td class="px-6 py-4">
      <div class="h-4 w-20 animate-pulse rounded bg-gray-200"></div>
    </td>
    <td class="px-6 py-4">
      <div class="h-2 w-32 animate-pulse rounded-full bg-gray-200"></div>
    </td>
    <td class="px-6 py-4">
      <div class="h-4 w-28 animate-pulse rounded bg-gray-200"></div>
    </td>
    <td class="px-6 py-4 text-right">
      <div class="ml-auto h-7 w-20 animate-pulse rounded bg-gray-200"></div>
    </td>
  </tr>
{/each}
```
#### 7d. New component: `Toast.svelte`
Replaces all `alert()` / `console.log()` calls in the jobs page. Renders a fixed-position stack in the bottom-right corner. Each toast auto-dismisses after 4 seconds and can be manually closed:
```svelte
<!-- Usage: bind a toasts array and call push({ message, type }) -->
<script lang="ts">
  export interface ToastItem {
    id: string;
    message: string;
    type: 'success' | 'error' | 'info';
  }

  let { toasts = $bindable([]) }: { toasts: ToastItem[] } = $props();

  function dismiss(id: string) {
    toasts = toasts.filter((t) => t.id !== id);
  }
</script>

<div class="fixed right-4 bottom-4 z-50 flex flex-col gap-2">
  {#each toasts as toast (toast.id)}
    <!-- color by type, close button, auto-dismiss via onmount timer -->
  {/each}
</div>
```
The jobs page replaces `showToast()` with pushing onto the bound `toasts` array. The `confirm()` for cancel is replaced with an inline confirmation state per job (`pendingCancelId`) that shows "Confirm cancel?" / "Yes" / "No" buttons inside the row.
#### 7e. New component: `WorkerStatusPanel.svelte`
A compact panel displayed above the job table showing the worker pool health. Subscribes to the `worker-status` SSE events and falls back to polling `GET /api/v1/workers` every 5 s on SSE error:
```
┌─────────────────────────────────────────────────────────┐
│ Workers [2 / 4 active] ████░░░░ 50% │
│ Worker 0 ● running /facebook/react / v18.3.0 │
│ Worker 1 ● running /facebook/react / v17.0.2 │
│ Worker 2 ○ idle │
│ Worker 3 ○ idle │
└─────────────────────────────────────────────────────────┘
```
Each worker row shows: slot index, status dot (animated green pulse for running), repository ID, version tag, and a link to the job row in the table below.
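The panel's data shape might look like the following sketch; the field names are assumptions inferred from the acceptance criteria, not the actual `WorkersResponse` contract:

```typescript
// Assumed shape for GET /api/v1/workers; exact field names are hypothetical.
interface WorkerSlot {
	slot: number;
	status: 'running' | 'idle';
	repositoryId?: string; // present only while running
	versionTag?: string;
	jobId?: string; // links to the job row in the table below
}

interface WorkersResponse {
	active: number;
	idle: number;
	workers: WorkerSlot[];
}

// Example payload matching the mock-up above.
const example: WorkersResponse = {
	active: 2,
	idle: 2,
	workers: [
		{ slot: 0, status: 'running', repositoryId: '/facebook/react', versionTag: 'v18.3.0' },
		{ slot: 1, status: 'running', repositoryId: '/facebook/react', versionTag: 'v17.0.2' },
		{ slot: 2, status: 'idle' },
		{ slot: 3, status: 'idle' }
	]
};
```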
#### 7f. Filter bar on the jobs page
Add a filter strip between the page header and the table:
```
[ Repository: _______________ ] [ Status: ▾ all ] [ 🔍 Apply ] [ ↺ Reset ]
```
- **Repository field**: free-text input, matches `repositoryId` prefix (e.g. `/facebook` shows all `/facebook/*`)
- **Status dropdown**: multi-select checkboxes for `queued`, `running`, `paused`, `cancelled`, `done`, `failed`; default = all
- Filters are applied client-side against the loaded `jobs` array for instant feedback; on Apply the list is also re-fetched from the API so the total count stays correct
- Filter state is mirrored to URL search params (`?repo=...&status=...`) so the view is bookmarkable and survives refresh
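The filter-to-URL mirroring can be sketched as a pair of pure helpers (names are illustrative; the page would call them from an `$effect` and on load):

```typescript
// Serialize filter state into ?repo=...&status=... search params.
function filtersToParams(repo: string, statuses: string[]): URLSearchParams {
	const params = new URLSearchParams();
	if (repo) params.set('repo', repo);
	if (statuses.length) params.set('status', statuses.join(','));
	return params;
}

// Restore filter state from the URL on page load / refresh.
function paramsToFilters(params: URLSearchParams): { repo: string; statuses: string[] } {
	return {
		repo: params.get('repo') ?? '',
		statuses: (params.get('status') ?? '').split(',').filter(Boolean)
	};
}
```

Round-tripping `?repo=/facebook/react&status=running` through these restores the same state, which is what makes the view bookmarkable and refresh-safe.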
#### 7g. Per-job action spinner and disabled state
Replace the single `actionInProgress: string | null` with a per-job map. Note that a plain `Map` wrapped in `$state` is not deeply reactive in Svelte 5 (mutations do not trigger updates), so use `SvelteMap` from `svelte/reactivity`, whose `set`/`delete` calls are reactive:
```typescript
import { SvelteMap } from 'svelte/reactivity';

let actionInProgress = new SvelteMap<string, 'pausing' | 'resuming' | 'cancelling'>();
```
Each action button shows an inline spinner (small `animate-spin` circle) and is disabled only for that row. Other rows remain fully interactive during the action. On completion the entry is deleted from the map.
#### 7h. `IndexingProgress.svelte` — switch from polling to SSE
The component currently uses `setInterval + fetch` at 2 s. Replace with the per-job SSE stream already available at `/api/v1/jobs/{id}/stream`:
```typescript
// replace the $effect body
$effect(() => {
job = null;
const es = new EventSource(`/api/v1/jobs/${jobId}/stream`);
es.addEventListener('job-progress', (event) => {
const data = JSON.parse(event.data);
job = { ...job, ...data };
});
es.addEventListener('job-done', () => {
void fetch(`/api/v1/jobs/${jobId}`)
.then((r) => r.json())
.then((d) => {
job = d.job;
oncomplete?.();
});
es.close();
});
es.addEventListener('job-failed', (event) => {
const data = JSON.parse(event.data);
job = { ...job, status: 'failed', error: data.error };
oncomplete?.();
es.close();
});
es.onerror = () => {
// on SSE failure fall back to a single fetch to get current state
es.close();
void fetch(`/api/v1/jobs/${jobId}`)
.then((r) => r.json())
.then((d) => {
job = d.job;
});
};
return () => es.close();
});
```
This cuts network traffic from one request every 2 s to zero polling requests during active indexing; updates arrive as server-pushed events.
#### 7i. Pagination on the jobs page
Replace the hard-coded `?limit=50` fetch with paginated requests:
```typescript
let currentPage = $state(1);
const PAGE_SIZE = 50;
async function fetchJobs() {
const params = new URLSearchParams({
page: String(currentPage),
pageSize: String(PAGE_SIZE),
...(filterRepo ? { repositoryId: filterRepo } : {}),
...(filterStatuses.length ? { status: filterStatuses.join(',') } : {})
});
const data = await fetch(`/api/v1/jobs?${params}`).then((r) => r.json());
jobs = data.jobs;
total = data.total;
}
```
Render a simple `« Prev Page N of M Next »` control below the table, hidden when `total <= PAGE_SIZE`.
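The page-count math behind that control can be sketched as (helper names are illustrative):

```typescript
// Total pages for the « Prev Page N of M Next » control; at least 1 even when empty.
function totalPages(total: number, pageSize: number): number {
	return Math.max(1, Math.ceil(total / pageSize));
}

// Keep currentPage in range after a filter change shrinks the total.
function clampPage(page: number, total: number, pageSize: number): number {
	return Math.min(Math.max(1, page), totalPages(total, pageSize));
}
```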
---
## Acceptance Criteria
- [ ] `npm install` with `@libsql/better-sqlite3` succeeds; `better-sqlite3` is absent from `node_modules`
- [ ] All existing unit and integration tests pass after Phase 1 import swap
- [ ] `npm run db:migrate` applies the composite index migration cleanly against an existing database
- [ ] `npm run db:migrate` applies the vector column migration cleanly; `sql> SELECT vec_embedding FROM snippet_embeddings LIMIT 1` returns a non-NULL value for any previously-embedded snippet
- [ ] `GET /api/v1/context?libraryId=...&query=...` with a semantic-mode or hybrid-mode request returns results in ≤ 200 ms on a repository with 50k+ snippets (vs previous multi-second response)
- [ ] Memory profiled during a /context request shows no allocation spike proportional to repository size
- [ ] `EXPLAIN QUERY PLAN` on the `snippets` search query shows `SEARCH snippets USING INDEX idx_snippets_repo_version` instead of `SCAN snippets` (SQLite reports indexed lookups as `SEARCH`, not `SCAN`)
- [ ] Worker threads (`worker-entry.ts`, `embed-worker-entry.ts`) start and complete an indexing job successfully after the package swap
- [ ] `drizzle-kit studio` connects and browses the migrated database
- [ ] Re-indexing a repository after the migration correctly populates `vec_embedding` on all new snippets
- [ ] `cosineSimilarity` unit tests still pass (function is kept)
- [ ] Starting two indexing jobs for different tags of the same repository simultaneously results in both jobs reaching `running` state concurrently (not one waiting for the other)
- [ ] Starting two indexing jobs for the **same** `(repositoryId, versionId)` pair returns the existing job (deduplication unchanged)
- [ ] With 4 parse workers and 4 concurrent tag jobs, zero `SQLITE_BUSY` errors appear in logs
- [ ] Write worker is present in the process list during active indexing (`worker_threads` inspector shows `write-worker-entry`)
- [ ] A `WriteError` from the write worker marks the originating job as `failed` with the error message propagated to the SSE stream
- [ ] `GET /api/v1/workers` returns a `WorkersResponse` JSON object with correct `active`, `idle`, and `workers[]` fields while jobs are in-flight
- [ ] The `worker-status` SSE event is emitted by `/api/v1/jobs/stream` whenever a worker transitions state
- [ ] The admin jobs page shows skeleton rows (not a blank screen) during the initial `fetchJobs()` call
- [ ] No `alert()` or `confirm()` calls exist in `admin/jobs/+page.svelte` after this change; all notifications go through `Toast.svelte`
- [ ] Pausing job A while job B is also in progress does not disable job B's action buttons
- [ ] The status filter multi-select correctly restricts the visible job list; the URL updates to reflect the filter state
- [ ] The repository prefix filter `?repositoryId=/facebook` returns all jobs whose `repositoryId` starts with `/facebook`
- [ ] Paginating past page 1 fetches the next batch from the API, not from the client-side array
- [ ] `IndexingProgress.svelte` has no `setInterval` call; it uses `EventSource` for progress updates
- [ ] The `WorkerStatusPanel` shows the correct number of running workers live during a multi-tag indexing run
- [ ] Refreshing the jobs page with `?repo=/facebook/react&status=running` pre-populates the filters and fetches with those params
---
## Migration Safety
### Backward Compatibility
The `embedding` blob column is kept. The `vec_embedding` column is nullable during the backfill window and is populated in two ways:
1. The `UPDATE` in `vectors.sql` fills all existing rows on startup
2. New embeddings populate it at insert time
If `vec_embedding IS NULL` for a row (e.g., a row inserted before the migration runs), the vector index silently omits that row from results. The fallback in `HybridSearchService` to FTS-only mode still applies when no embeddings exist, so degraded-but-correct behavior is preserved.
### Rollback
Rollback before Phase 4 (vector column): remove `@libsql/better-sqlite3`, restore `better-sqlite3`, restore imports. No schema changes have been made.
Rollback after Phase 4: schema now has `vec_embedding` column. Drop the column with a migration reversal and restore imports. The `embedding` blob is intact throughout — no data loss.
### SQLite File Compatibility
libSQL embedded mode reads and writes standard SQLite 3 files. The WAL file, page size, and encoding are unchanged. An existing production database opened with `@libsql/better-sqlite3` is fully readable and writable. The vector index is stored in a shadow table `idx_snippet_embeddings_vec_shadow` which better-sqlite3 would ignore if rolled back (it is a regular table with a special name).
---
## Dependencies
| Package | Action | Reason |
| ------------------------ | ----------------------------- | ----------------------------------------------- |
| `better-sqlite3` | Remove from `dependencies` | Replaced |
| `@types/better-sqlite3` | Remove from `devDependencies` | `@libsql/better-sqlite3` ships own types |
| `@libsql/better-sqlite3` | Add to `dependencies` | Drop-in libSQL node addon |
| `drizzle-orm` | No change | `better-sqlite3` adapter works unchanged |
| `drizzle-kit` | No change | `dialect: 'sqlite'` correct for embedded libSQL |
No new runtime dependencies beyond the package replacement.
---
## Testing Strategy
### Unit Tests
- `src/lib/server/search/vector.search.ts`: add test asserting KNN results are correct for a seeded 3-vector table; verify memory is not proportional to table size (mock `db.prepare` to assert no unbounded `.all()` is called)
- `src/lib/server/embeddings/embedding.service.ts`: existing tests cover insert round-trips; verify `vec_embedding` column is non-NULL after `embedSnippets()`
### Integration Tests
- `api-contract.integration.test.ts`: existing tests already use `new Database(':memory:')` — these continue to work with `@libsql/better-sqlite3` because the in-memory path is identical
- Add one test to `api-contract.integration.test.ts`: seed a repository + multiple embeddings, call `/api/v1/context` in semantic mode, assert non-empty results and response time < 500ms on in-memory DB
### UI Tests
- `src/routes/admin/jobs/+page.svelte`: add Vitest browser tests (Playwright) verifying:
- Skeleton rows appear before the first fetch resolves (mock `fetch` to delay 200 ms)
- Status filter restricts displayed rows; URL param updates
- Pausing job A leaves job B's buttons enabled
- Toast appears and auto-dismisses on successful pause
- Cancel confirm flow shows inline confirmation, not `window.confirm`
- `src/lib/components/IndexingProgress.svelte`: unit test that no `setInterval` is created; verify `EventSource` is opened with the correct URL
### Performance Regression Gate
Add a benchmark script `scripts/bench-vector-search.mjs` that:
1. Creates an in-memory libSQL database
2. Seeds 10000 snippet embeddings (random Float32Array, 1536 dims)
3. Runs 100 `vectorSearch()` calls
4. Asserts p99 < 50 ms
This benchmark runs as a CI gate on Phase 4, verifying both correctness and speed.
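The p99 assertion in step 4 could be computed with a nearest-rank helper like this (a sketch; the real script's implementation may differ):

```typescript
// Nearest-rank p99 over latency samples in milliseconds.
function p99(samplesMs: number[]): number {
	const sorted = [...samplesMs].sort((a, b) => a - b);
	const rank = Math.ceil(sorted.length * 0.99); // 1-based nearest rank
	return sorted[Math.min(rank, sorted.length) - 1];
}
```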