- BM25 ranking via SQLite FTS5 bm25() function
- Query preprocessor with wildcard expansion and special char escaping
- Library search with composite scoring (name match, trust score, snippet count)
- Trust score computation from stars, coverage, and source type
- Response formatters for library and snippet results
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Markdown parser with heading-based section splitting and code block extraction
- Code file parser with regex boundary detection for 10+ languages
- Sliding window chunker with configurable token limits and overlap
- Language detection from file extensions
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- GitHub crawler with rate limiting, semaphore concurrency, retry logic
- File filtering by extension, size, and trueref.json rules
- Local filesystem crawler with SHA-256 checksums and progress callbacks
- Shared types and file filter logic between both crawlers
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements the end-to-end indexing pipeline with a SQLite-backed job
queue, startup recovery, and REST API endpoints for job status.
- IndexingPipeline: orchestrates crawl → parse → atomic replace → embed
→ repo stats update with progress tracking at each stage
- JobQueue: sequential SQLite-backed queue (no external broker), deduplicates
active jobs per repository, drains queued jobs on startup
- startup.ts: stale job recovery (running→failed), repo state reset, singleton
initialization wired from hooks.server.ts
- GET /api/v1/jobs with repositoryId/status/limit filtering
- GET /api/v1/jobs/[id] single job lookup
- hooks.server.ts: initializes DB and pipeline on server start
- 18 unit tests covering queue, pipeline stages, recovery, and atomicity
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add EmbeddingProvider interface with OpenAI-compatible, local (optional
@xenova/transformers via dynamic import), and Noop (FTS5-only fallback)
implementations. EmbeddingService batches requests and persists Float32Array
blobs to snippet_embeddings. GET/PUT /api/v1/settings/embedding endpoints
read and write embedding config from the settings table.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add RepositoryService with full CRUD, ID resolution helpers, input
validation, six SvelteKit API routes (GET/POST /api/v1/libs,
GET/PATCH/DELETE /api/v1/libs/:id, POST /api/v1/libs/:id/index), and
37 unit tests covering all service operations.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>