trueref/docs/features/TRUEREF-0023.md


TRUEREF-0023 — libSQL Migration, Native Vector Search, Parallel Tag Indexing, and Performance Hardening

Priority: P1 Status: Draft Depends On: TRUEREF-0001, TRUEREF-0022 Blocks:


Overview

TrueRef currently uses better-sqlite3 for all database access. This creates three compounding performance problems:

  1. Vector search does not scale. VectorSearch.vectorSearch() loads the entire snippet_embeddings table for a repository into Node.js memory and computes cosine similarity in a JavaScript loop. A repository with 100k snippets at 1536 OpenAI dimensions allocates ~600 MB per query and ties up the worker thread for seconds before returning results.
  2. Missing composite indexes cause table scans on every query. The schema defines FK columns used in every search and embedding filter, but declares zero composite or covering indexes on them. Every call to searchSnippets, findSnippetIdsMissingEmbeddings, and cloneFromAncestor performs full or near-full table scans.
  3. SQLite connection is under-configured. Critical pragmas (synchronous, cache_size, mmap_size, temp_store) are absent, leaving significant I/O throughput on the table.

The solution is to replace better-sqlite3 with @libsql/better-sqlite3 — an embeddable, drop-in synchronous replacement that is a superset of the better-sqlite3 API and exposes libSQL's native vector index (libsql_vector_idx). Because the API is identical, no service layer or ORM code changes are needed beyond import statements and the vector search implementation.

Three additional structural improvements are delivered in the same feature:

  1. Per-repo job serialization is too coarse. WorkerPool prevents any two jobs sharing the same repositoryId from running in parallel. This means indexing 200 tags of a single library is fully sequential — one tag at a time — even though different tags write to entirely disjoint row sets. The constraint should track (repositoryId, versionId) pairs instead.
  2. Write lock contention under parallel indexing. When multiple parse workers flush parsed snippets simultaneously they all compete for the SQLite write lock, spending most of their time in busy_timeout back-off. A single dedicated write worker eliminates this: parse workers become pure CPU workers (crawl → parse → send batches over postMessage) and the write worker is the sole DB writer.
  3. Admin UI is unusable under load. The job queue page has no status or repository filters, no worker status panel, no skeleton loading, uses blocking alert() / confirm() dialogs, and IndexingProgress still polls every 2 seconds instead of consuming the existing SSE stream.

Goals

  1. Replace better-sqlite3 with @libsql/better-sqlite3 with minimal code churn — import paths only.
  2. Add a libSQL vector index on snippet_embeddings so that KNN queries execute inside SQLite instead of in a JavaScript loop.
  3. Add the six composite and covering indexes required by the hot query paths.
  4. Tune the SQLite pragma configuration for I/O performance.
  5. Eliminate the leading cause of OOM risk during semantic search.
  6. Keep a single embedded database file — no external server, no network.
  7. Allow multiple tags of the same repository to index in parallel (unrelated version rows, no write conflict).
  8. Eliminate write-lock contention between parallel parse workers by introducing a single dedicated write worker.
  9. Rebuild the admin jobs page with full filtering (status, repository, free-text), a live worker status panel, skeleton loading on initial fetch, per-action inline spinners, non-blocking toast notifications, and SSE-driven real-time updates throughout.

Non-Goals

  • Migrating to the async @libsql/client package (HTTP/embedded-replica mode).
  • Changing the Drizzle ORM adapter (drizzle-orm/better-sqlite3 stays unchanged).
  • Changing drizzle.config.ts dialect (sqlite is still correct for embedded libSQL).
  • Adding hybrid/approximate indexing beyond the default HNSW strategy provided by libsql_vector_idx.
  • Parallelizing embedding batches across providers (separate feature).
  • Horizontally scaling across processes.
  • Allowing more than one job for the exact same (repositoryId, versionId) pair to run concurrently (still serialized — duplicate detection in JobQueue is unchanged).
  • A full admin authentication system (out of scope).
  • Mobile-responsive redesign of the entire admin section (out of scope).

Problem Detail

1. Vector Search — Full Table Scan in JavaScript

File: src/lib/server/search/vector.search.ts

// Current: no LIMIT, loads ALL embeddings for repo into memory
const rows = this.db.prepare<unknown[], RawEmbeddingRow>(sql).all(...params);

const scored: VectorSearchResult[] = rows.map((row) => {
	const embedding = new Float32Array(
		row.embedding.buffer,
		row.embedding.byteOffset,
		row.embedding.byteLength / 4
	);
	return { snippetId: row.snippet_id, score: cosineSimilarity(queryEmbedding, embedding) };
});

return scored.sort((a, b) => b.score - a.score).slice(0, limit);

For a repo with N snippets and D dimensions, this allocates N × D × 4 bytes per query. At N=100k and D=1536, that is ~600 MB allocated synchronously. The result is sorted entirely in JS before the top-k is returned. With a native vector index, SQLite returns only the top-k rows.
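The allocation math can be sanity-checked with a one-liner (the ~600 MB figure is simply N × D × 4 bytes for Float32 values):

```typescript
// Bytes allocated per vector-search query when every embedding for a
// repository is materialized as Float32Array data in JS.
// n = snippet count, d = embedding dimensions, 4 = bytes per float32 value.
function bytesPerQuery(n: number, d: number): number {
	return n * d * 4;
}

// 100k snippets at 1536 dims -> 614,400,000 bytes, i.e. roughly 600 MB.
const worstCase = bytesPerQuery(100_000, 1536);
console.log(worstCase); // 614400000
```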

2. Missing Composite Indexes

The snippets, documents, and snippet_embeddings tables are queried with multi-column WHERE predicates in every hot path, but no composite indexes exist:

| Table | Filter columns | Used in |
| --- | --- | --- |
| snippets | (repository_id, version_id) | All search, diff, clone |
| snippets | (repository_id, type) | Type-filtered queries |
| documents | (repository_id, version_id) | Diff strategy, clone |
| snippet_embeddings | (profile_id, snippet_id) | findSnippetIdsMissingEmbeddings LEFT JOIN |
| repositories | (state) | searchRepositories WHERE state = 'indexed' |
| indexing_jobs | (repository_id, status) | Job status lookups |
Without these indexes, SQLite performs a B-tree scan of the primary key and filters rows in memory. On a 500k-row snippets table this is the dominant cost of every search.

3. Under-configured SQLite Connection

File: src/lib/server/db/client.ts and src/lib/server/db/index.ts

Current pragmas:

client.pragma('journal_mode = WAL');
client.pragma('foreign_keys = ON');
client.pragma('busy_timeout = 5000');

Missing:

  • synchronous = NORMAL — halves fsync overhead vs the default FULL; safe with WAL
  • cache_size = -65536 — 64 MB page cache; default is 2 MB
  • temp_store = MEMORY — temp tables and sort spills stay in RAM
  • mmap_size = 268435456 — 256 MB memory-mapped read path; bypasses system call overhead for reads
  • wal_autocheckpoint = 1000 — more frequent checkpoints prevent WAL growth

4. Admin UI — Current Problems

File: src/routes/admin/jobs/+page.svelte, src/lib/components/IndexingProgress.svelte

| Problem | Location | Impact |
| --- | --- | --- |
| IndexingProgress polls every 2 s via setInterval + fetch | IndexingProgress.svelte | Constant HTTP traffic; progress lags by up to 2 s |
| No status or repository filter controls | admin/jobs/+page.svelte | With 200 tag jobs, finding a specific one requires scrolling |
| No worker status panel | — (no endpoint exists) | Operator cannot see which workers are busy or idle |
| alert() for errors, confirm() for cancel | admin/jobs/+page.svelte | Blocks the entire browser tab; unusable under parallel jobs |
| actionInProgress is a single string, not per-job | admin/jobs/+page.svelte | Pausing job A disables buttons on all other jobs |
| No skeleton loading — blank + spinner on first load | admin/jobs/+page.svelte | Layout shift; no structural preview while data loads |
| Hard-coded limit=50 query, no pagination | admin/jobs/+page.svelte:fetchJobs() | Page truncates silently for large queues |

Architecture

Drop-In Replacement: @libsql/better-sqlite3

@libsql/better-sqlite3 is published by Turso and implemented as a Node.js native addon wrapping the libSQL embedded engine. The exported class is API-compatible with better-sqlite3:

// before
import Database from 'better-sqlite3';
const db = new Database('/path/to/file.db');
db.pragma('journal_mode = WAL');
const rows = db.prepare('SELECT ...').all(...params);

// after — identical code
import Database from '@libsql/better-sqlite3';
const db = new Database('/path/to/file.db');
db.pragma('journal_mode = WAL');
const rows = db.prepare('SELECT ...').all(...params);

All of the following continue to work unchanged:

  • drizzle-orm/better-sqlite3 adapter and migrate helper
  • drizzle-kit with dialect: 'sqlite'
  • Prepared statements, transactions, WAL pragmas, foreign keys
  • Worker thread per-thread connections (worker-entry.ts, embed-worker-entry.ts)
  • All import type Database from 'better-sqlite3' type-only imports (replaced in lock-step)

Vector Index Design

libSQL provides libsql_vector_idx() — a virtual index type stored in a shadow table alongside the main table. Once indexed, KNN queries use a SQL vector_top_k() function:

-- KNN: return top-k snippet IDs closest to the query vector
SELECT snippet_id
FROM vector_top_k('idx_snippet_embeddings_vec', vector_from_float32(?), ?)

vector_from_float32(blob) accepts the same raw little-endian Float32 bytes currently stored in the embedding blob column. No data migration is needed — the existing blob column can be re-indexed with libsql_vector_idx pointing at the bytes-stored column.

The index strategy:

  1. Add a generated vec_embedding column of type F32_BLOB(dimensions) to snippet_embeddings, populated from the existing embedding blob via a migration trigger.
  2. Create the vector index: CREATE INDEX idx_snippet_embeddings_vec ON snippet_embeddings (libsql_vector_idx(vec_embedding)). Note that libSQL's DDL wraps the indexed column in the libsql_vector_idx() function rather than using a USING clause.
  3. Rewrite VectorSearch.vectorSearch() to use vector_top_k() with a two-step join instead of the in-memory loop.
  4. Update EmbeddingService.embedSnippets() to write vec_embedding on insert.

Dimensions are profile-specific. Because the index is per-column, a separate index is needed per embedding dimensionality. For v1, a single index covering the default profile's dimensions is sufficient; multi-profile KNN can be handled with a WHERE profile_id = ? pre-filter on the vector_top_k results.

Updated Vector Search Query

vectorSearch(queryEmbedding: Float32Array, options: VectorSearchOptions): VectorSearchResult[] {
    const { repositoryId, versionId, profileId = 'local-default', limit = 50 } = options;

    // Encode query vector as raw bytes (same format as stored blobs)
    const queryBytes = Buffer.from(queryEmbedding.buffer);

    // Use libSQL vector_top_k for ANN — returns ordered (rowid, distance) pairs
    let sql = `
        SELECT se.snippet_id,
               vector_distance_cos(se.vec_embedding, vector_from_float32(?)) AS distance
        FROM vector_top_k('idx_snippet_embeddings_vec', vector_from_float32(?), ?) AS knn
        JOIN snippet_embeddings se ON se.rowid = knn.id
        JOIN snippets s ON s.id = se.snippet_id
        WHERE s.repository_id = ?
          AND se.profile_id = ?
    `;
    // limit * 4 over-fetches candidates so the repo/profile filter still leaves enough rows
    const params: unknown[] = [queryBytes, queryBytes, limit * 4, repositoryId, profileId];

    if (versionId) {
        sql += ' AND s.version_id = ?';
        params.push(versionId);
    }

    sql += ' ORDER BY distance ASC LIMIT ?';
    params.push(limit);

    return this.db
        .prepare<unknown[], { snippet_id: string; distance: number }>(sql)
        .all(...params)
        .map((row) => ({ snippetId: row.snippet_id, score: 1 - row.distance }));
}

vector_distance_cos returns distance (0 = identical), so 1 - distance gives a similarity score in [0, 1] matching the existing VectorSearchResult.score contract.
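That conversion is small enough to isolate and unit-test on its own; a sketch (the helper name is illustrative, not part of the current codebase):

```typescript
// vector_distance_cos returns a cosine *distance*: 0 means identical vectors.
// The existing VectorSearchResult.score contract is a *similarity*, so the
// mapping is score = 1 - distance. (Embeddings in practice cluster in the
// non-negative half, so the result stays in [0, 1].)
function distanceToScore(distance: number): number {
	return 1 - distance;
}

console.log(distanceToScore(0)); // 1 (identical)
console.log(distanceToScore(1)); // 0 (orthogonal)
```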


Implementation Plan

Phase 1 — Package Swap (no logic changes)

Files touched: package.json, all .ts files that import better-sqlite3

  1. In package.json:

    • Remove "better-sqlite3": "^12.6.2" from dependencies
    • Add "@libsql/better-sqlite3": "^0.4.0" to dependencies
    • Remove "@types/better-sqlite3": "^7.6.13" from devDependencies
    • @libsql/better-sqlite3 ships its own TypeScript declarations
  2. Replace all import statements across the production and test files listed below:

    | Old import | New import |
    | --- | --- |
    | import Database from 'better-sqlite3' | import Database from '@libsql/better-sqlite3' |
    | import type Database from 'better-sqlite3' | import type Database from '@libsql/better-sqlite3' |
    | import { drizzle } from 'drizzle-orm/better-sqlite3' | unchanged |
    | import { migrate } from 'drizzle-orm/better-sqlite3/migrator' | unchanged |

    Affected production files:

    • src/lib/server/db/index.ts
    • src/lib/server/db/client.ts
    • src/lib/server/embeddings/embedding.service.ts
    • src/lib/server/pipeline/indexing.pipeline.ts
    • src/lib/server/pipeline/job-queue.ts
    • src/lib/server/pipeline/startup.ts
    • src/lib/server/pipeline/worker-entry.ts
    • src/lib/server/pipeline/embed-worker-entry.ts
    • src/lib/server/pipeline/differential-strategy.ts
    • src/lib/server/search/vector.search.ts
    • src/lib/server/search/hybrid.search.service.ts
    • src/lib/server/search/search.service.ts
    • src/lib/server/services/repository.service.ts
    • src/lib/server/services/version.service.ts
    • src/lib/server/services/embedding-settings.service.ts

    Affected test files (same mechanical replacement):

    • src/routes/api/v1/api-contract.integration.test.ts
    • src/routes/api/v1/sse-and-settings.integration.test.ts
    • src/routes/settings/page.server.test.ts
    • src/lib/server/db/schema.test.ts
    • src/lib/server/embeddings/embedding.service.test.ts
    • src/lib/server/pipeline/indexing.pipeline.test.ts
    • src/lib/server/pipeline/differential-strategy.test.ts
    • src/lib/server/search/search.service.test.ts
    • src/lib/server/search/hybrid.search.service.test.ts
    • src/lib/server/services/repository.service.test.ts
    • src/lib/server/services/version.service.test.ts
    • src/routes/api/v1/settings/embedding/server.test.ts
    • src/routes/api/v1/libs/[id]/index/server.test.ts
    • src/routes/api/v1/libs/[id]/versions/discover/server.test.ts
  3. Run all tests — they should pass with zero logic changes: npm test

Phase 2 — Pragma Hardening

Files touched: src/lib/server/db/client.ts, src/lib/server/db/index.ts

Add the following pragmas to both connection factories (raw client and initializeDatabase()):

client.pragma('synchronous = NORMAL');
client.pragma('cache_size = -65536'); // 64 MB
client.pragma('temp_store = MEMORY');
client.pragma('mmap_size = 268435456'); // 256 MB
client.pragma('wal_autocheckpoint = 1000');

Worker threads (worker-entry.ts, embed-worker-entry.ts) open their own connections — apply the same pragmas there.
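Since the same pragma set now applies in four places (two main-thread factories plus two worker entries), a small shared helper avoids drift. A sketch, where the function name and the minimal structural interface are assumptions, not existing code:

```typescript
// Minimal structural type so the helper works with any better-sqlite3-
// compatible connection without importing the driver here.
interface PragmaRunner {
	pragma(statement: string): unknown;
}

// Hypothetical shared helper: applies the baseline plus Phase 2 pragmas.
function applyPerformancePragmas(client: PragmaRunner): string[] {
	const pragmas = [
		'journal_mode = WAL',
		'foreign_keys = ON',
		'busy_timeout = 5000',
		'synchronous = NORMAL',
		'cache_size = -65536', // 64 MB page cache
		'temp_store = MEMORY',
		'mmap_size = 268435456', // 256 MB mmap read path
		'wal_autocheckpoint = 1000'
	];
	for (const p of pragmas) client.pragma(p);
	return pragmas;
}

// Usage with a recording stub (a real call site passes the Database instance):
const applied: string[] = [];
applyPerformancePragmas({ pragma: (s) => applied.push(s) });
console.log(applied.length); // 8
```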

Phase 3 — Composite Indexes (Drizzle migration)

Files touched: src/lib/server/db/schema.ts, new migration SQL file

Add indexes in schema.ts using Drizzle's index() helper:

// snippets table
export const snippets = sqliteTable(
	'snippets',
	{
		/* unchanged */
	},
	(t) => [
		index('idx_snippets_repo_version').on(t.repositoryId, t.versionId),
		index('idx_snippets_repo_type').on(t.repositoryId, t.type)
	]
);

// documents table
export const documents = sqliteTable(
	'documents',
	{
		/* unchanged */
	},
	(t) => [index('idx_documents_repo_version').on(t.repositoryId, t.versionId)]
);

// snippet_embeddings table
export const snippetEmbeddings = sqliteTable(
	'snippet_embeddings',
	{
		/* unchanged */
	},
	(table) => [
		primaryKey({ columns: [table.snippetId, table.profileId] }), // unchanged
		index('idx_embeddings_profile').on(table.profileId, table.snippetId)
	]
);

// repositories table
export const repositories = sqliteTable(
	'repositories',
	{
		/* unchanged */
	},
	(t) => [index('idx_repositories_state').on(t.state)]
);

// indexing_jobs table
export const indexingJobs = sqliteTable(
	'indexing_jobs',
	{
		/* unchanged */
	},
	(t) => [index('idx_jobs_repo_status').on(t.repositoryId, t.status)]
);

Generate and apply migration: npm run db:generate && npm run db:migrate

Phase 4 — Vector Column and Index (Drizzle migration)

Files touched: src/lib/server/db/schema.ts, new migration SQL, src/lib/server/search/vector.search.ts, src/lib/server/embeddings/embedding.service.ts

4a. Schema: add vec_embedding column

Add vec_embedding to snippet_embeddings. Drizzle has no F32_BLOB column type helper; define one with customType:

import { customType } from 'drizzle-orm/sqlite-core';

const f32Blob = (name: string, dimensions: number) =>
	customType<{ data: Buffer }>({
		dataType() {
			return `F32_BLOB(${dimensions})`;
		}
	})(name);

export const snippetEmbeddings = sqliteTable(
	'snippet_embeddings',
	{
		snippetId: text('snippet_id')
			.notNull()
			.references(() => snippets.id, { onDelete: 'cascade' }),
		profileId: text('profile_id')
			.notNull()
			.references(() => embeddingProfiles.id, { onDelete: 'cascade' }),
		model: text('model').notNull(),
		dimensions: integer('dimensions').notNull(),
		embedding: blob('embedding').notNull(), // existing blob — kept for backward compat
		vecEmbedding: f32Blob('vec_embedding', 1536), // libSQL vector column (nullable during migration fill)
		createdAt: integer('created_at').notNull()
	},
	(table) => [
		primaryKey({ columns: [table.snippetId, table.profileId] }),
		index('idx_embeddings_profile').on(table.profileId, table.snippetId)
	]
);

Because dimensionality is fixed per model, F32_BLOB(1536) covers OpenAI text-embedding-3-small/large. A follow-up can parameterize this per profile.

4b. Migration SQL: populate vec_embedding from existing embedding blob and create the vector index

The vector index cannot be expressed in Drizzle's schema DSL, so it must be applied in the FTS-style custom SQL file (src/lib/server/db/fts.sql or an equivalent vectors.sql):

-- Backfill vec_embedding from existing raw blob data
UPDATE snippet_embeddings
SET vec_embedding = vector_from_float32(embedding)
WHERE vec_embedding IS NULL AND embedding IS NOT NULL;

-- Create the HNSW vector index (libSQL syntax: the index expression wraps
-- the column in libsql_vector_idx, with settings passed as extra arguments)
CREATE INDEX IF NOT EXISTS idx_snippet_embeddings_vec
ON snippet_embeddings (libsql_vector_idx(vec_embedding, 'metric=cosine', 'compress_neighbors=float8', 'max_neighbors=20'));

Add a call to this SQL in initializeDatabase() alongside the existing fts.sql execution:

const vectorSql = readFileSync(join(__dirname, 'vectors.sql'), 'utf-8');
client.exec(vectorSql);

4c. Update EmbeddingService.embedSnippets()

When inserting a new embedding, write both the blob and the vec column:

const insert = this.db.prepare<[string, string, string, number, Buffer, Buffer]>(`
    INSERT OR REPLACE INTO snippet_embeddings
        (snippet_id, profile_id, model, dimensions, embedding, vec_embedding, created_at)
    VALUES (?, ?, ?, ?, ?, vector_from_float32(?), unixepoch())
`);

// inside the transaction:
insert.run(
	snippet.id,
	this.profileId,
	embedding.model,
	embedding.dimensions,
	embeddingBuffer,
	embeddingBuffer // same bytes — vector_from_float32() interprets them
);
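embeddingBuffer holds the raw little-endian Float32 bytes of the vector. The round-trip between Float32Array and Buffer, the same byte layout the existing embedding column stores, looks like this (helper names are illustrative):

```typescript
// Encode a Float32Array as the raw byte Buffer stored in the DB.
function toEmbeddingBuffer(vec: Float32Array): Buffer {
	return Buffer.from(vec.buffer, vec.byteOffset, vec.byteLength);
}

// Decode the stored bytes back into a Float32Array view (no copy),
// mirroring the existing read path in VectorSearch.
function fromEmbeddingBuffer(buf: Buffer): Float32Array {
	return new Float32Array(buf.buffer, buf.byteOffset, buf.byteLength / 4);
}

// Values chosen to be exactly representable in float32.
const original = new Float32Array([0.25, -1.5, 3]);
const roundTripped = fromEmbeddingBuffer(toEmbeddingBuffer(original));
console.log(Array.from(roundTripped)); // [ 0.25, -1.5, 3 ]
```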

4d. Rewrite VectorSearch.vectorSearch()

Replace the full-scan JS loop with vector_top_k():

vectorSearch(queryEmbedding: Float32Array, options: VectorSearchOptions): VectorSearchResult[] {
    const { repositoryId, versionId, profileId = 'local-default', limit = 50 } = options;

    const queryBytes = Buffer.from(queryEmbedding.buffer);
    const candidatePool = limit * 4; // over-fetch for post-filter

    let sql = `
        SELECT se.snippet_id,
               vector_distance_cos(se.vec_embedding, vector_from_float32(?)) AS distance
        FROM vector_top_k('idx_snippet_embeddings_vec', vector_from_float32(?), ?) AS knn
        JOIN snippet_embeddings se ON se.rowid = knn.id
        JOIN snippets s ON s.id = se.snippet_id
        WHERE s.repository_id = ?
          AND se.profile_id = ?
    `;
    const params: unknown[] = [queryBytes, queryBytes, candidatePool, repositoryId, profileId];

    if (versionId) {
        sql += ' AND s.version_id = ?';
        params.push(versionId);
    }

    sql += ' ORDER BY distance ASC LIMIT ?';
    params.push(limit);

    return this.db
        .prepare<unknown[], { snippet_id: string; distance: number }>(sql)
        .all(...params)
        .map((row) => ({ snippetId: row.snippet_id, score: 1 - row.distance }));
}

The score contract is preserved (1 = identical, 0 = orthogonal). The cosineSimilarity helper function is no longer called at runtime but can be kept for unit tests.
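For those unit tests, the reference implementation is the standard cosine formula (this sketch assumes the helper's signature; the shipped version may differ in minor details):

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|).
// 1 for identical directions, 0 for orthogonal vectors.
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
	let dot = 0;
	let normA = 0;
	let normB = 0;
	for (let i = 0; i < a.length; i++) {
		dot += a[i] * b[i];
		normA += a[i] * a[i];
		normB += b[i] * b[i];
	}
	return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const a = new Float32Array([1, 0]);
const b = new Float32Array([0, 1]);
console.log(cosineSimilarity(a, a)); // 1
console.log(cosineSimilarity(a, b)); // 0
```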

Phase 5 — Per-Job Serialization Key Fix

Files touched: src/lib/server/pipeline/worker-pool.ts

The current serialization guard uses a bare repositoryId:

// current
private runningRepoIds = new Set<string>();
// blocks any job whose repositoryId is already in the set
const jobIdx = this.jobQueue.findIndex((j) => !this.runningRepoIds.has(j.repositoryId));

Different tags of the same repository write to completely disjoint rows (version_id-partitioned documents, snippets, and embeddings). The only genuine conflict is two jobs for the same (repositoryId, versionId) pair, which JobQueue.enqueue() already prevents via the status IN ('queued', 'running') deduplication check.

Change the guard to key on the compound pair:

// still a Set<string>, but keyed on the compound (repositoryId, versionId) pair
private runningJobKeys = new Set<string>();

private jobKey(repositoryId: string, versionId?: string | null): string {
    return `${repositoryId}|${versionId ?? ''}`;
}

Update all four sites that read/write runningRepoIds:

| Location | Old | New |
| --- | --- | --- |
| dispatch() find | !this.runningRepoIds.has(j.repositoryId) | !this.runningJobKeys.has(this.jobKey(j.repositoryId, j.versionId)) |
| dispatch() add | this.runningRepoIds.add(job.repositoryId) | this.runningJobKeys.add(this.jobKey(job.repositoryId, job.versionId)) |
| onWorkerMessage done/failed | this.runningRepoIds.delete(runningJob.repositoryId) | this.runningJobKeys.delete(this.jobKey(runningJob.repositoryId, runningJob.versionId)) |
| onWorkerExit | same delete | same key-based delete |

The QueuedJob and RunningJob interfaces already carry versionId — no type changes needed.

The only case that remains serialized is two jobs with versionId = null (a default-branch re-index), which both map to the stable key "repositoryId|" and are therefore correctly treated as duplicates.
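A quick check of the key behavior, using a standalone copy of the compound-key logic from above:

```typescript
// Standalone copy of the Phase 5 compound-key logic, for illustration.
function jobKey(repositoryId: string, versionId?: string | null): string {
	return `${repositoryId}|${versionId ?? ''}`;
}

// Two tags of the same repo get distinct keys -> they may run in parallel.
console.log(jobKey('facebook/react', 'v18.3.0')); // facebook/react|v18.3.0
console.log(jobKey('facebook/react', 'v17.0.2')); // facebook/react|v17.0.2

// Two default-branch jobs collapse to the same key -> still serialized.
console.log(jobKey('facebook/react', null) === jobKey('facebook/react', undefined)); // true
```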


Phase 6 — Dedicated Write Worker (Single-Writer Pattern)

Files touched: src/lib/server/pipeline/worker-types.ts, src/lib/server/pipeline/write-worker-entry.ts (new), src/lib/server/pipeline/worker-entry.ts, src/lib/server/pipeline/worker-pool.ts

Motivation

With Phase 5 in place, N tags of the same library can index in parallel. Each parse worker currently opens its own DB connection and holds the write lock while storing parsed snippets. Under N concurrent writers, each worker spends the majority of its wall-clock time waiting in busy_timeout back-off. The fix is the single-writer pattern: one dedicated write worker owns the only writable DB connection; parse workers become stateless CPU workers that send write batches over postMessage.

Parse Worker 1 ──┐ WriteRequest (docs[], snippets[])    ┌── WriteAck
Parse Worker 2 ──┼─────────────────────────────────────► Write Worker (sole DB writer)
Parse Worker N ──┘                                       └── single better-sqlite3 connection

New message types (worker-types.ts)

export interface WriteRequest {
	type: 'write';
	jobId: string;
	documents: SerializedDocument[];
	snippets: SerializedSnippet[];
}

export interface WriteAck {
	type: 'write_ack';
	jobId: string;
	documentCount: number;
	snippetCount: number;
}

export interface WriteError {
	type: 'write_error';
	jobId: string;
	error: string;
}

// SerializedDocument / SerializedSnippet mirror the DB column shapes
// (plain objects, safe to transfer via structured clone)

Write worker (write-worker-entry.ts)

The write worker:

  • Opens its own Database connection (WAL mode, all pragmas from Phase 2)
  • Listens for WriteRequest messages
  • Wraps each batch in a single transaction
  • Posts WriteAck or WriteError back to the parent, which forwards the ack to the originating parse worker by jobId

import Database from '@libsql/better-sqlite3';
import { workerData, parentPort } from 'node:worker_threads';
import type { WriteRequest, WriteAck, WriteError, WorkerInitData } from './worker-types.js';

const db = new Database((workerData as WorkerInitData).dbPath);
db.pragma('journal_mode = WAL');
db.pragma('synchronous = NORMAL');
db.pragma('cache_size = -65536');
db.pragma('foreign_keys = ON');

const insertDoc = db.prepare(`INSERT OR REPLACE INTO documents     (...) VALUES (...)`);
const insertSnippet = db.prepare(`INSERT OR REPLACE INTO snippets      (...) VALUES (...)`);

const writeBatch = db.transaction((req: WriteRequest) => {
	for (const doc of req.documents) insertDoc.run(doc);
	for (const snip of req.snippets) insertSnippet.run(snip);
});

parentPort!.on('message', (req: WriteRequest) => {
	try {
		writeBatch(req);
		const ack: WriteAck = {
			type: 'write_ack',
			jobId: req.jobId,
			documentCount: req.documents.length,
			snippetCount: req.snippets.length
		};
		parentPort!.postMessage(ack);
	} catch (err) {
		const fail: WriteError = { type: 'write_error', jobId: req.jobId, error: String(err) };
		parentPort!.postMessage(fail);
	}
});

Parse worker changes (worker-entry.ts)

Parse workers lose their DB connection. IndexingPipeline receives a sendWrite callback instead of a db instance. After parsing each file batch, the worker calls sendWrite({ type: 'write', jobId, documents, snippets }) and awaits the WriteAck before continuing. This keeps back-pressure: a slow write worker naturally throttles the parse workers without additional semaphores.

WorkerPool changes

  • Spawn one write worker at startup (always, regardless of embedding config)
  • Route incoming write_ack / write_error messages to the correct waiting parse worker via a Map<jobId, resolve> promise registry
  • The write worker is separate from the embed worker — embed writes (snippet_embeddings) can still go through the write worker by adding an EmbedWriteRequest message type, or remain in the embed worker since embedding runs after parsing completes (no lock contention with active parse jobs)
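The ack routing can be sketched as a promise registry keyed by jobId (names here are illustrative; the real WorkerPool would call these from its worker message handlers). Because each parse worker awaits its ack before sending the next batch, at most one entry per jobId is in flight at a time:

```typescript
// Hypothetical ack registry: parse workers await the write worker's reply.
type AckResolver = { resolve: (ack: unknown) => void; reject: (err: Error) => void };

const pendingWrites = new Map<string, AckResolver>();

// Called when a parse worker's batch is forwarded to the write worker.
function awaitAck(jobId: string): Promise<unknown> {
	return new Promise((resolve, reject) => {
		pendingWrites.set(jobId, { resolve, reject });
	});
}

// Called from the write worker's 'message' handler in WorkerPool.
function onWriteMessage(msg: { type: string; jobId: string; error?: string }): void {
	const pending = pendingWrites.get(msg.jobId);
	if (!pending) return; // stale or unknown jobId
	pendingWrites.delete(msg.jobId);
	if (msg.type === 'write_ack') pending.resolve(msg);
	else pending.reject(new Error(msg.error ?? 'write failed'));
}

// Usage: resolve an in-flight write.
const p = awaitAck('job-1');
onWriteMessage({ type: 'write_ack', jobId: 'job-1' });
p.then((ack) => console.log((ack as { type: string }).type)); // write_ack
```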

Conflict analysis with Phase 5

Phases 5 and 6 compose cleanly:

  • Phase 5 allows multiple (repo, versionId) jobs to run concurrently
  • Phase 6 ensures all those concurrent jobs share a single write path — contention is eliminated by design
  • The write worker is stateless with respect to job identity; it just executes batches in arrival order within a FIFO message queue (Node.js postMessage is ordered)
  • The embed worker remains a separate process (it runs after parse completes, so it never overlaps with active parse writes for the same job)

Phase 7 — Admin UI Overhaul

Files touched:

  • src/routes/admin/jobs/+page.svelte — rebuilt
  • src/routes/api/v1/workers/+server.ts — new endpoint
  • src/lib/components/admin/JobStatusBadge.svelte — extend with spinner variant
  • src/lib/components/admin/JobSkeleton.svelte — new
  • src/lib/components/admin/WorkerStatusPanel.svelte — new
  • src/lib/components/admin/Toast.svelte — new
  • src/lib/components/IndexingProgress.svelte — switch to SSE

7a. New API endpoint: GET /api/v1/workers

The WorkerPool singleton tracks running jobs in runningJobs: Map<Worker, RunningJob> and idle workers in idleWorkers: Worker[]. Expose this state as a lightweight REST snapshot:

// GET /api/v1/workers
// Response shape:
interface WorkersResponse {
	concurrency: number; // configured max workers
	active: number; // workers with a running job
	idle: number; // workers waiting for work
	workers: WorkerStatus[]; // one entry per spawned parse worker
}

interface WorkerStatus {
	index: number; // worker slot (0-based)
	state: 'idle' | 'running'; // current state
	jobId: string | null; // null when idle
	repositoryId: string | null;
	versionId: string | null;
}

The route handler calls getPool().getStatus() — add a getStatus(): WorkersResponse method to WorkerPool that reads runningJobs and idleWorkers without any DB call. This is read-only and runs on the main thread.
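A sketch of that derivation, computed purely from in-memory state (field names follow the response shape above; the simplified RunningJobLite type is an assumption standing in for the real RunningJob):

```typescript
interface RunningJobLite {
	jobId: string;
	repositoryId: string;
	versionId: string | null;
}

// Build the snapshot from the pool's running jobs and idle-worker count.
// No DB access; safe to call on the main thread per request.
function buildWorkersResponse(running: RunningJobLite[], idleCount: number) {
	const workers = [
		...running.map((job, i) => ({
			index: i,
			state: 'running' as const,
			jobId: job.jobId,
			repositoryId: job.repositoryId,
			versionId: job.versionId
		})),
		...Array.from({ length: idleCount }, (_, i) => ({
			index: running.length + i,
			state: 'idle' as const,
			jobId: null,
			repositoryId: null,
			versionId: null
		}))
	];
	return { concurrency: workers.length, active: running.length, idle: idleCount, workers };
}

const snap = buildWorkersResponse(
	[{ jobId: 'j1', repositoryId: '/facebook/react', versionId: 'v18.3.0' }],
	3
);
console.log(snap.active, snap.idle, snap.concurrency); // 1 3 4
```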

The SSE stream at /api/v1/jobs/stream should emit a new worker-status event type whenever a worker transitions idle ↔ running (on dispatch() and job completion). This allows the worker panel to update in real-time without polling the REST endpoint.

7b. GET /api/v1/jobs — add repositoryId free-text and multi-status filter

The existing endpoint already accepts repositoryId (exact match) and status (single value). Extend:

  • repositoryId to also support prefix match (e.g. ?repositoryId=/facebook returns all /facebook/* repos)
  • status to accept comma-separated values: ?status=queued,running
  • page and pageSize query params (default pageSize=50, max 200) in addition to limit for backwards compat

Return { jobs, total, page, pageSize } with total always reflecting the unfiltered-by-page count.
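Parsing the extended query params might look like the following sketch (defaults and the max of 200 come from the bullets above; the exact clamping behavior for out-of-range values is an assumption):

```typescript
const JOB_STATUSES = new Set(['queued', 'running', 'paused', 'cancelled', 'done', 'failed']);

// Parse ?status=queued,running and ?page=2&pageSize=500 into safe values.
function parseJobsQuery(params: URLSearchParams) {
	const statuses = (params.get('status') ?? '')
		.split(',')
		.map((s) => s.trim())
		.filter((s) => JOB_STATUSES.has(s)); // silently drop unknown statuses

	const page = Math.max(1, Number(params.get('page') ?? '1') || 1);
	const pageSize = Math.min(200, Math.max(1, Number(params.get('pageSize') ?? '50') || 50));

	return { statuses, page, pageSize, repositoryId: params.get('repositoryId') ?? null };
}

const q = parseJobsQuery(new URLSearchParams('status=queued,running,bogus&pageSize=500'));
console.log(q.statuses); // [ 'queued', 'running' ]
console.log(q.pageSize); // 200
```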

7c. New component: JobSkeleton.svelte

A set of skeleton rows matching the job table structure. Shown during the initial fetch before any data arrives. Uses Tailwind animate-pulse:

<!-- renders N skeleton rows -->
<script lang="ts">
	let { rows = 5 }: { rows?: number } = $props();
</script>

{#each Array(rows) as _, i (i)}
	<tr>
		<td class="px-6 py-4">
			<div class="h-4 w-48 animate-pulse rounded bg-gray-200"></div>
			<div class="mt-1 h-3 w-24 animate-pulse rounded bg-gray-100"></div>
		</td>
		<td class="px-6 py-4">
			<div class="h-5 w-16 animate-pulse rounded-full bg-gray-200"></div>
		</td>
		<td class="px-6 py-4">
			<div class="h-4 w-20 animate-pulse rounded bg-gray-200"></div>
		</td>
		<td class="px-6 py-4">
			<div class="h-2 w-32 animate-pulse rounded-full bg-gray-200"></div>
		</td>
		<td class="px-6 py-4">
			<div class="h-4 w-28 animate-pulse rounded bg-gray-200"></div>
		</td>
		<td class="px-6 py-4 text-right">
			<div class="ml-auto h-7 w-20 animate-pulse rounded bg-gray-200"></div>
		</td>
	</tr>
{/each}

7d. New component: Toast.svelte

Replaces all alert() / console.log() calls in the jobs page. Renders a fixed-position stack in the bottom-right corner. Each toast auto-dismisses after 4 seconds and can be manually closed:

<!-- Usage: bind a toasts array and call push({ message, type }) -->
<script lang="ts">
	export interface ToastItem {
		id: string;
		message: string;
		type: 'success' | 'error' | 'info';
	}

	let { toasts = $bindable([]) }: { toasts: ToastItem[] } = $props();

	function dismiss(id: string) {
		toasts = toasts.filter((t) => t.id !== id);
	}
</script>

<div class="fixed right-4 bottom-4 z-50 flex flex-col gap-2">
	{#each toasts as toast (toast.id)}
		<!-- color by type, close button, auto-dismiss via onmount timer -->
	{/each}
</div>

The jobs page replaces showToast() with pushing onto the bound toasts array. The confirm() for cancel is replaced with an inline confirmation state per job (pendingCancelId) that shows "Confirm cancel?" / "Yes" / "No" buttons inside the row.

7e. New component: WorkerStatusPanel.svelte

A compact panel displayed above the job table showing the worker pool health. Subscribes to the worker-status SSE events and falls back to polling GET /api/v1/workers every 5 s on SSE error:

┌─────────────────────────────────────────────────────────┐
│  Workers  [2 / 4 active]  ████░░░░  50%                 │
│  Worker 0  ● running  /facebook/react / v18.3.0         │
│  Worker 1  ● running  /facebook/react / v17.0.2         │
│  Worker 2  ○ idle                                       │
│  Worker 3  ○ idle                                       │
└─────────────────────────────────────────────────────────┘

Each worker row shows: slot index, status dot (animated green pulse for running), repository ID, version tag, and a link to the job row in the table below.

7f. Filter bar on the jobs page

Add a filter strip between the page header and the table:

[ Repository: _______________ ]  [ Status: ▾ all ]  [ 🔍 Apply ]  [ ↺ Reset ]

  • Repository field: free-text input, matches repositoryId prefix (e.g. /facebook shows all /facebook/*)
  • Status dropdown: multi-select checkboxes for queued, running, paused, cancelled, done, failed; default = all
  • Filters are applied client-side against the loaded jobs array for instant feedback, and the job list is also re-fetched from the API on Apply so the total count reflects the filtered set
  • Filter state is mirrored to URL search params (?repo=...&status=...) so the view is bookmarkable and survives refresh
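The URL mirroring and prefix matching can be sketched as pure helpers. The `repo` / `status` param names and the status values come from the spec; the function names are illustrative:

```typescript
// Sketch of mirroring filter state to ?repo=...&status=... (param names
// from the spec; helper names are hypothetical).
const ALL_STATUSES = ['queued', 'running', 'paused', 'cancelled', 'done', 'failed'] as const;
type JobStatus = (typeof ALL_STATUSES)[number];

function filtersToParams(repo: string, statuses: JobStatus[]): URLSearchParams {
	const params = new URLSearchParams();
	if (repo) params.set('repo', repo);
	// Omit the status param entirely when everything is selected (the default).
	if (statuses.length > 0 && statuses.length < ALL_STATUSES.length) {
		params.set('status', statuses.join(','));
	}
	return params;
}

function paramsToFilters(search: string): { repo: string; statuses: JobStatus[] } {
	const params = new URLSearchParams(search);
	const raw = params.get('status');
	const statuses = raw
		? raw.split(',').filter((s): s is JobStatus => (ALL_STATUSES as readonly string[]).includes(s))
		: [...ALL_STATUSES];
	return { repo: params.get('repo') ?? '', statuses };
}

// Repository prefix match: "/facebook" matches all /facebook/* jobs.
function matchesRepoPrefix(repositoryId: string, prefix: string): boolean {
	return prefix === '' || repositoryId.startsWith(prefix);
}
```

Round-tripping through these helpers is what makes the view bookmarkable: refreshing with `?repo=...&status=...` reconstructs the same filter state that produced the URL.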

7g. Per-job action spinner and disabled state

Replace the single actionInProgress: string | null with a per-job map. Note that in Svelte 5 a plain Map inside $state is not deeply reactive, so use SvelteMap from svelte/reactivity:

import { SvelteMap } from 'svelte/reactivity';

let actionInProgress = new SvelteMap<string, 'pausing' | 'resuming' | 'cancelling'>();

Each action button shows an inline spinner (a small animate-spin circle) and is disabled only for its own row. Other rows remain fully interactive while the action runs. On completion the entry is deleted from the map.
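The lifecycle above can be sketched as a small wrapper. `runJobAction` is a hypothetical helper, not existing code, and a plain Map stands in here for the component's reactive map state:

```typescript
// Illustrative per-row action lifecycle: mark the row busy, run the
// request, and always clear the entry so the row re-enables.
type ActionKind = 'pausing' | 'resuming' | 'cancelling';

async function runJobAction(
	actionInProgress: Map<string, ActionKind>,
	jobId: string,
	kind: ActionKind,
	perform: () => Promise<void>
): Promise<void> {
	if (actionInProgress.has(jobId)) return; // one action per row at a time
	actionInProgress.set(jobId, kind); // only this row's buttons disable
	try {
		await perform();
	} finally {
		actionInProgress.delete(jobId); // re-enable on success or failure
	}
}
```

Because the map is keyed by job ID, pausing job A never touches job B's entry, which is exactly the isolation the acceptance criteria require.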

7h. IndexingProgress.svelte — switch from polling to SSE

The component currently uses setInterval + fetch at 2 s. Replace with the per-job SSE stream already available at /api/v1/jobs/{id}/stream:

// replace the $effect body
$effect(() => {
	job = null;
	const es = new EventSource(`/api/v1/jobs/${jobId}/stream`);

	es.addEventListener('job-progress', (event) => {
		const data = JSON.parse(event.data);
		job = { ...job, ...data };
	});

	es.addEventListener('job-done', () => {
		void fetch(`/api/v1/jobs/${jobId}`)
			.then((r) => r.json())
			.then((d) => {
				job = d.job;
				oncomplete?.();
			});
		es.close();
	});

	es.addEventListener('job-failed', (event) => {
		const data = JSON.parse(event.data);
		job = { ...job, status: 'failed', error: data.error };
		oncomplete?.();
		es.close();
	});

	es.onerror = () => {
		// on SSE failure fall back to a single fetch to get current state
		es.close();
		void fetch(`/api/v1/jobs/${jobId}`)
			.then((r) => r.json())
			.then((d) => {
				job = d.job;
			});
	};

	return () => es.close();
});

This reduces network traffic from 1 request/2 s to zero requests during active indexing — updates arrive as server-push events.

7i. Pagination on the jobs page

Replace the hard-coded ?limit=50 fetch with paginated requests:

let currentPage = $state(1);
const PAGE_SIZE = 50;

async function fetchJobs() {
	const params = new URLSearchParams({
		page: String(currentPage),
		pageSize: String(PAGE_SIZE),
		...(filterRepo ? { repositoryId: filterRepo } : {}),
		...(filterStatuses.length ? { status: filterStatuses.join(',') } : {})
	});
	const data = await fetch(`/api/v1/jobs?${params}`).then((r) => r.json());
	jobs = data.jobs;
	total = data.total;
}

Render a simple « Prev Page N of M Next » control below the table, hidden when total <= PAGE_SIZE.
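The paging math behind that control, as a minimal sketch (helper names are illustrative):

```typescript
// Paging helpers for the « Prev | Page N of M | Next » control.
function totalPages(total: number, pageSize: number): number {
	return Math.max(1, Math.ceil(total / pageSize));
}

// The control is hidden when everything fits on one page.
function showPager(total: number, pageSize: number): boolean {
	return total > pageSize;
}

// Keep the current page in range after filters shrink the result set.
function clampPage(page: number, total: number, pageSize: number): number {
	return Math.min(Math.max(1, page), totalPages(total, pageSize));
}
```

Clamping matters because applying a filter while on page 3 can leave `currentPage` past the new last page; re-clamping before `fetchJobs()` avoids requesting an empty page.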


Acceptance Criteria

  • npm install with @libsql/better-sqlite3 succeeds; better-sqlite3 is absent from node_modules
  • All existing unit and integration tests pass after Phase 1 import swap
  • npm run db:migrate applies the composite index migration cleanly against an existing database
  • npm run db:migrate applies the vector column migration cleanly; SELECT vec_embedding FROM snippet_embeddings LIMIT 1 returns a non-NULL value for any previously-embedded snippet
  • GET /api/v1/context?libraryId=...&query=... with a semantic-mode or hybrid-mode request returns results in ≤ 200 ms on a repository with 50k+ snippets (vs previous multi-second response)
  • Memory profiled during a /context request shows no allocation spike proportional to repository size
  • EXPLAIN QUERY PLAN on the snippets search query shows SEARCH snippets USING INDEX idx_snippets_repo_version instead of SCAN snippets
  • Worker threads (worker-entry.ts, embed-worker-entry.ts) start and complete an indexing job successfully after the package swap
  • drizzle-kit studio connects and browses the migrated database
  • Re-indexing a repository after the migration correctly populates vec_embedding on all new snippets
  • cosineSimilarity unit tests still pass (function is kept)
  • Starting two indexing jobs for different tags of the same repository simultaneously results in both jobs reaching running state concurrently (not one waiting for the other)
  • Starting two indexing jobs for the same (repositoryId, versionId) pair returns the existing job (deduplication unchanged)
  • With 4 parse workers and 4 concurrent tag jobs, zero SQLITE_BUSY errors appear in logs
  • Write worker is present in the process list during active indexing (worker_threads inspector shows write-worker-entry)
  • A WriteError from the write worker marks the originating job as failed with the error message propagated to the SSE stream
  • GET /api/v1/workers returns a WorkersResponse JSON object with correct active, idle, and workers[] fields while jobs are in-flight
  • The worker-status SSE event is emitted by /api/v1/jobs/stream whenever a worker transitions state
  • The admin jobs page shows skeleton rows (not a blank screen) during the initial fetchJobs() call
  • No alert() or confirm() calls exist in admin/jobs/+page.svelte after this change; all notifications go through Toast.svelte
  • Pausing job A while job B is also in progress does not disable job B's action buttons
  • The status filter multi-select correctly restricts the visible job list; the URL updates to reflect the filter state
  • The repository prefix filter ?repositoryId=/facebook returns all jobs whose repositoryId starts with /facebook
  • Paginating past page 1 fetches the next batch from the API, not from the client-side array
  • IndexingProgress.svelte has no setInterval call; it uses EventSource for progress updates
  • The WorkerStatusPanel shows the correct number of running workers live during a multi-tag indexing run
  • Refreshing the jobs page with ?repo=/facebook/react&status=running pre-populates the filters and fetches with those params

Migration Safety

Backward Compatibility

The embedding blob column is kept. The vec_embedding column is nullable during the backfill window and becomes populated as:

  1. The UPDATE in vectors.sql fills all existing rows on startup
  2. New embeddings populate it at insert time

If vec_embedding IS NULL for a row (e.g., a row inserted before the migration runs), the vector index silently omits that row from results. The fallback in HybridSearchService to FTS-only mode still applies when no embeddings exist, so degraded-but-correct behavior is preserved.

Rollback

Rollback before Phase 4 (vector column): remove @libsql/better-sqlite3, restore better-sqlite3, restore imports. No schema changes have been made.

Rollback after Phase 4: schema now has vec_embedding column. Drop the column with a migration reversal and restore imports. The embedding blob is intact throughout — no data loss.

SQLite File Compatibility

libSQL embedded mode reads and writes standard SQLite 3 files. The WAL file, page size, and encoding are unchanged. An existing production database opened with @libsql/better-sqlite3 is fully readable and writable. The vector index is stored in a shadow table idx_snippet_embeddings_vec_shadow which better-sqlite3 would ignore if rolled back (it is a regular table with a special name).


Dependencies

| Package | Action | Reason |
| --- | --- | --- |
| better-sqlite3 | Remove from dependencies | Replaced |
| @types/better-sqlite3 | Remove from devDependencies | @libsql/better-sqlite3 ships its own types |
| @libsql/better-sqlite3 | Add to dependencies | Drop-in libSQL node addon |
| drizzle-orm | No change | better-sqlite3 adapter works unchanged |
| drizzle-kit | No change | dialect: 'sqlite' is correct for embedded libSQL |

No new runtime dependencies beyond the package replacement.


Testing Strategy

Unit Tests

  • src/lib/server/search/vector.search.ts: add test asserting KNN results are correct for a seeded 3-vector table; verify memory is not proportional to table size (mock db.prepare to assert no unbounded .all() is called)
  • src/lib/server/embeddings/embedding.service.ts: existing tests cover insert round-trips; verify vec_embedding column is non-NULL after embedSnippets()

Integration Tests

  • api-contract.integration.test.ts: existing tests already use new Database(':memory:') — these continue to work with @libsql/better-sqlite3 because the in-memory path is identical
  • Add one test to api-contract.integration.test.ts: seed a repository + multiple embeddings, call /api/v1/context in semantic mode, assert non-empty results and response time < 500ms on in-memory DB

UI Tests

  • src/routes/admin/jobs/+page.svelte: add Vitest browser tests (Playwright) verifying:
    • Skeleton rows appear before the first fetch resolves (mock fetch to delay 200 ms)
    • Status filter restricts displayed rows; URL param updates
    • Pausing job A leaves job B's buttons enabled
    • Toast appears and auto-dismisses on successful pause
    • Cancel confirm flow shows inline confirmation, not window.confirm
  • src/lib/components/IndexingProgress.svelte: unit test that no setInterval is created; verify EventSource is opened with the correct URL

Performance Regression Gate

Add a benchmark script scripts/bench-vector-search.mjs that:

  1. Creates an in-memory libSQL database
  2. Seeds 10000 snippet embeddings (random Float32Array, 1536 dims)
  3. Runs 100 vectorSearch() calls
  4. Asserts p99 < 50 ms

This benchmark runs in CI and gates Phase 4 on both correctness and speed.
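The p99 assertion in step 4 can be computed with a nearest-rank percentile over the collected latencies — a sketch under the assumption that the benchmark records one duration per call (the helper name is illustrative):

```typescript
// Nearest-rank p99 over per-call latencies in milliseconds.
function p99(samplesMs: number[]): number {
	if (samplesMs.length === 0) throw new Error('no samples');
	const sorted = [...samplesMs].sort((a, b) => a - b);
	const rank = Math.ceil(0.99 * sorted.length); // 1-based nearest rank
	return sorted[rank - 1];
}
```

With 100 samples this picks the 99th-smallest value, so a single slow outlier (the 100th) does not fail the gate while two slow calls do.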