Commit Graph

37 Commits

Author SHA1 Message Date
Giancarmine Salucci
e7a2a83cdb feat(TRUEREF-0020): add job status page with pause/resume/cancel controls
- Extend indexing_jobs schema to support 'paused' and 'cancelled' status
- Add JobQueue methods: pauseJob(), resumeJob(), cancelJob()
- Create POST /api/v1/jobs/[id]/{pause,resume,cancel} endpoints
- Implement /admin/jobs page with auto-refresh (3s polling)
- Add JobStatusBadge component with color-coded status display
- Action buttons appear contextually based on job status
- Optimistic UI updates with error handling
- All 477 existing tests pass, no regressions
2026-03-25 20:38:14 +01:00
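Note on e7a2a83cdb: the pause/resume/cancel controls and the contextual action buttons can be pictured as a small transition table. The sketch below is an illustration only. The commit confirms just the 'paused' and 'cancelled' statuses; the other status names, the allowed transitions, and the helper names `applyAction` and `availableActions` are assumptions, not the project's JobQueue API.

```typescript
// Hypothetical status set; only 'paused' and 'cancelled' are named in the commit.
type JobStatus = 'queued' | 'running' | 'paused' | 'done' | 'failed' | 'cancelled';
type JobAction = 'pause' | 'resume' | 'cancel';

// Assumed transition table: for each action, the statuses it applies to
// and the status the job ends up in.
const transitions: Record<JobAction, Partial<Record<JobStatus, JobStatus>>> = {
  pause: { queued: 'paused', running: 'paused' },
  resume: { paused: 'queued' },
  cancel: { queued: 'cancelled', running: 'cancelled', paused: 'cancelled' }
};

// Apply an action, rejecting transitions the table does not list.
function applyAction(status: JobStatus, action: JobAction): JobStatus {
  const next = transitions[action][status];
  if (next === undefined) {
    throw new Error(`Cannot ${action} a job in status '${status}'`);
  }
  return next;
}

// Actions a UI could offer for a job in the given status.
function availableActions(status: JobStatus): JobAction[] {
  return (Object.keys(transitions) as JobAction[]).filter(
    (action) => transitions[action][status] !== undefined
  );
}
```

Deriving the visible buttons from the same table that validates the request keeps the UI and the endpoints from drifting apart.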
Giancarmine Salucci
9519a66cef test(embeddings): fix 6 remaining test failures
- Fix schema.test.ts: use Unix timestamp integers instead of Date objects for snippet_embeddings.createdAt
- Fix embedding.service.test.ts: use 'local-default' profile instead of non-existent 'test-profile', remove require() calls and use proper ESM imports
- Fix hybrid.search.service.test.ts: update VectorSearch.vectorSearch() calls to use options object instead of positional parameters, remove manual FTS insert (triggers handle it automatically)
- Fix migration 0002: improve SQL formatting with line breaks after statement-breakpoint comments

All 459 tests now passing (18 skipped).
2026-03-25 19:41:24 +01:00
Giancarmine Salucci
169df4d984 feat(TRUEREF-0020): add embedding profiles, default local embeddings, and version-scoped semantic retrieval
- Add embedding_profiles table with provider registry pattern
- Install @xenova/transformers as runtime dependency
- Update snippet_embeddings with composite PK (snippet_id, profile_id)
- Seed default local profile using Xenova/all-MiniLM-L6-v2
- Add provider registry (local-transformers, openai-compatible)
- Update EmbeddingService to persist and retrieve by profileId
- Add version-scoped VectorSearch with optional versionId filtering
- Add searchMode (auto|keyword|semantic|hybrid) to HybridSearchService
- Update API /context route to load active profile, support searchMode/alpha params
- Extend MCP query-docs tool with searchMode and alpha parameters
- Update settings API to work with embedding_profiles table
- Add comprehensive test coverage for profiles, registry, version scoping

Status: 445/451 tests passing, core feature complete
2026-03-25 19:16:37 +01:00
Giancarmine Salucci
fef6f66930 wip(TRUEREF-0018): commit version-scoped indexing work
2026-03-25 19:03:22 +01:00
Giancarmine Salucci
b9d52405fa docs: update docs, add new features
2026-03-25 15:11:01 +01:00
Giancarmine Salucci
59628dd408 feat(crawler): ignore files and folders listed in .gitignore, fall back to common ignored deps
2026-03-25 15:10:44 +01:00
Giancarmine Salucci
53b3d36ca3 fix(svelte): restore $effect runes, replacing incorrect onMount usage
onMount is a Svelte 4 idiom. In Svelte 5 runes mode, $effect is the correct
primitive for side effects, and it provides additional behaviour that onMount
cannot:

- IndexingProgress: $effect re-runs when jobId prop changes, restarting
  the polling loop for the new job. onMount would have missed prop changes.

- search/+page.svelte: $effect with untrack() reads page.url params once
  on mount without tracking the URL as a reactive dependency, preventing
  goto() calls inside searchDocs() from triggering an infinite re-run loop.
  Restores the page store import from $app/state.

- settings/+page.svelte: $effect with no reactive reads in the body runs
  exactly once on mount — equivalent to onMount but idiomatic Svelte 5.

All three verified with svelte-autofixer: no issues.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-25 14:34:26 +01:00
Giancarmine Salucci
fd46328a8b chore(git): tighten .gitignore and untrack machine-specific files
- Add *.db-shm and *.db-wal to cover SQLite WAL mode files (*.db alone
  was not sufficient)
- Replace blanket .claude/ ignore with .claude/ + !.claude/rules/ so
  project-level Claude Code rules remain tracked while local settings
  (settings.local.json, projects/, etc.) stay ignored

Machine-specific files removed from tracking in the previous refactor
commit: .claude/settings.local.json, local.db-shm, local.db-wal.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-25 14:30:35 +01:00
Giancarmine Salucci
3cd1b132f8 docs: add TRUEREF-0019 feature request for git-native versioning and corporate deployment
Covers two related concerns:

Part 1 — Git-native version indexing: resolve version tags to commit
hashes via git rev-parse, store commit_hash in the versions table, auto-
discover tags on repo registration, extract per-version file trees with
git archive to avoid disturbing the working directory, and support
explicit commitHash overrides in trueref.json.

Part 2 — Corporate deployment support: per-host HTTPS credential helpers
for Bitbucket Server and GitLab, SSH key mounting with Windows permission
fix, CA certificate handling (PEM/DER auto-detection), and the full
docker-compose and entrypoint reference configuration.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-25 14:30:27 +01:00
Giancarmine Salucci
5c738be624 docs: add Docker deployment section to README
Documents the full Docker deployment workflow including docker compose
quickstart, environment variable reference, web-only and MCP-only run
modes, health check endpoints, IDE MCP config snippets (VS Code,
IntelliJ, Claude Code), and mounting local repositories as read-only
volumes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-25 14:30:18 +01:00
Giancarmine Salucci
69743862a7 feat(docker): add Dockerfile, Docker Compose, and deployment entrypoint
Multi-stage Dockerfile produces a lean image with the compiled SvelteKit
app (adapter-node) and the MCP server TypeScript source. A single image
supports two run modes selected via CMD: web (default) and mcp.

- docker-entrypoint.sh handles CA certificate install (PEM/DER auto-detected
  via openssl), SSH key permission fix for Windows-mounted keys, per-host
  HTTPS credential helpers for Bitbucket and GitLab, DB migrations, then
  starts the requested service
- docker-compose.yml runs web on :3000 and the MCP HTTP server on :3001,
  with the MCP container pointed at the web service via internal DNS
- .dockerignore excludes node_modules, build output, .env files, and *.db*

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-25 14:30:12 +01:00
Giancarmine Salucci
a63de39473 fix(svelte): replace $effect with onMount for side effects
$effect runs during SSR and re-runs on every reactive dependency change,
causing polling loops and URL reads to fire at the wrong time. onMount
runs once on the client after first render, which is the correct lifecycle
for polling, URL param reads, and async data loads.

- IndexingProgress: polling loop now starts on mount, not on reactive trigger
- search/+page.svelte: URL param init moved to onMount; use window.location
  directly instead of the page store to avoid reactive re-runs
- settings/+page.svelte: config load and local provider probe moved to onMount

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-25 14:30:04 +01:00
Giancarmine Salucci
215cadf070 refactor: introduce domain model classes and mapper layer
Replace ad-hoc inline row casting (snake_case → camelCase) spread across
services, routes, and the indexing pipeline with explicit model classes
(Repository, IndexingJob, RepositoryVersion, Snippet, SearchResult) and
dedicated mapper classes that own the DB → domain conversion.

- Add src/lib/server/models/ with typed model classes for all domain entities
- Add src/lib/server/mappers/ with mapper classes per entity
- Remove duplicated RawRow interfaces and inline map functions from
  job-queue, repository.service, indexing.pipeline, and all API routes
- Add dtoJsonResponse helper to standardise JSON responses via SvelteKit json()
- Add api-contract.integration.test.ts as a regression baseline

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-25 14:29:49 +01:00
Giancarmine Salucci
7994254e23 fix(svelte): fix svelte
2026-03-24 18:41:28 +01:00
Giancarmine Salucci
5f510a2237 fix(UI-0001): map snake_case repo columns to camelCase in libs API response
better-sqlite3 returns raw snake_case column names from SELECT *, so
trust_score, total_snippets etc. were not matching the camelCase keys
(trustScore, totalSnippets) that RepositoryCard.svelte reads. Added a
mapRepo() helper in the GET /api/v1/libs handler to normalise the shape
before JSON serialisation, fixing the trust score and snippet count
display on repository cards.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-24 18:37:25 +01:00
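Note on 5f510a2237: the snake_case to camelCase normalisation described above can be sketched generically as follows. This is not the commit's mapRepo() helper, whose body is not shown here; `toCamelCase` and `mapRow` are hypothetical names used for illustration.

```typescript
// Convert one snake_case key to camelCase, e.g. "trust_score" -> "trustScore".
function toCamelCase(key: string): string {
  return key.replace(/_([a-z0-9])/g, (_match: string, ch: string) => ch.toUpperCase());
}

// Return a copy of a raw DB row with every key converted to camelCase.
// Values are passed through untouched.
function mapRow(row: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const key of Object.keys(row)) {
    out[toCamelCase(key)] = row[key];
  }
  return out;
}

// Usage: a raw row as a SELECT * might return it.
const mapped = mapRow({ id: '/facebook/react', trust_score: 9.5, total_snippets: 120 });
```

A generic key converter is convenient, but an explicit per-entity mapper (as introduced later in 215cadf070) gives stronger typing.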
Giancarmine Salucci
c1e1a9d05e residual fixes
2026-03-24 18:17:42 +01:00
Giancarmine Salucci
c20df6bc97 feat(EMBEDDINGS-0001): add save feedback banners and fix Svelte 5 runes compliance
Replace onMount with $effect for initial config loading. Add auto-dismissing
green success banner (3 s) and persistent red error banner after save attempts.
Remove inline status text in favour of full-width banners with icons.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-24 18:15:52 +01:00
Giancarmine Salucci
06053152d2 fix(pipeline): indexing jobs never started due to missing queue.enqueue() calls
All three trigger-indexing routes were calling service.createIndexingJob()
directly which only inserts the DB record but never calls processNext().
Fixed to route through getQueue().enqueue() so the job queue actually
picks up and runs the job immediately.

Affected routes:
- POST /api/v1/libs (autoIndex on add)
- POST /api/v1/libs/:id/index
- POST /api/v1/libs/:id/versions/:tag/index

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 09:33:36 +01:00
Giancarmine Salucci
391eb7f411 fix(ui): resolve state_referenced_locally warnings and add folder picker
- Fix Svelte state_referenced_locally warning in +page.svelte and repos/[id]/+page.svelte
  by initializing $state with empty defaults and syncing via $effect
- Add FolderPicker component with server-side filesystem browser
  (single-click to navigate, double-click or "Select This Folder" to confirm)
- Git repos highlighted with orange folder icon and "git" badge
- Add GET /api/v1/fs/browse endpoint listing subdirectories
- Wire FolderPicker into AddRepositoryModal for local source type
- Auto-fills title from the selected folder name

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 09:17:44 +01:00
Giancarmine Salucci
f91bdbc2bf feat(TRUEREF-0018): implement embedding provider configuration UI
- Full settings page replacing placeholder with embedding provider selector
- Provider presets: OpenAI, Ollama, Azure OpenAI
- Test Connection button via POST /api/v1/settings/embedding/test
- Warning banner for FTS5-only mode when provider=none
- Local model availability probe (@xenova/transformers)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 09:07:27 +01:00
Giancarmine Salucci
9e3f62e329 feat(TRUEREF-0017): implement incremental re-indexing with checksum diff
- computeDiff classifies files into added/modified/deleted/unchanged buckets
- Only changed and new files are parsed and re-embedded on re-runs
- Deleted files removed atomically from DB
- Progress counts all files including unchanged for accurate reporting
- ~20x speedup for re-indexing large repositories with few changes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 09:07:20 +01:00
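Note on 9e3f62e329: a minimal sketch of a checksum diff that sorts files into the four buckets named above. The function name computeDiff comes from the commit message; its signature, the `FileDiff` shape, and the use of path-to-checksum maps are assumptions made for this example.

```typescript
interface FileDiff {
  added: string[];
  modified: string[];
  deleted: string[];
  unchanged: string[];
}

// Compare stored checksums (from the previous run) with freshly crawled ones.
// Both maps are keyed by file path; values are content checksums.
function computeDiff(stored: Map<string, string>, crawled: Map<string, string>): FileDiff {
  const diff: FileDiff = { added: [], modified: [], deleted: [], unchanged: [] };

  crawled.forEach((checksum, path) => {
    const previous = stored.get(path);
    if (previous === undefined) {
      diff.added.push(path); // not seen in the previous run
    } else if (previous !== checksum) {
      diff.modified.push(path); // same path, different content
    } else {
      diff.unchanged.push(path); // can be skipped by parser and embedder
    }
  });

  stored.forEach((_checksum, path) => {
    if (!crawled.has(path)) {
      diff.deleted.push(path); // present before, gone now
    }
  });

  return diff;
}
```

Only the `added` and `modified` buckets would need parsing and re-embedding, while the total of all four buckets gives the file count for progress reporting.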
Giancarmine Salucci
22bf4c1014 feat(TRUEREF-0016): implement web UI search explorer
- Two-step search workflow: resolve library then query documentation
- URL state sync (/search?lib=...&q=...)
- LibraryResult, SnippetCard, SearchInput components
- Code/info snippet display with breadcrumbs and token counts
- Copy-as-markdown button for full response

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 09:07:13 +01:00
Giancarmine Salucci
90d93786a8 feat(TRUEREF-0015): implement web UI repository dashboard
- Repository list with state badges, stats, and action buttons
- Add repository modal for GitHub URLs and local paths
- Live indexing progress bar polling every 2s
- Confirm dialog for destructive actions
- Repository detail page with versions and recent jobs
- Settings page placeholder

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 09:07:06 +01:00
Giancarmine Salucci
542f4ce66c feat(TRUEREF-0014): implement repository version management
- VersionService with list, add, remove, getByTag, registerFromConfig
- GitHub tag discovery helper for validating tags before indexing
- Version ID format: /owner/repo/tag (e.g. /facebook/react/v18.3.0)
- GET/POST /api/v1/libs/:id/versions
- DELETE /api/v1/libs/:id/versions/:tag
- POST /api/v1/libs/:id/versions/:tag/index

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 09:06:59 +01:00
Giancarmine Salucci
f31db2db2c feat(TRUEREF-0013): implement trueref.json config file support
- Lenient parser for trueref.json and context7.json (trueref.json takes precedence)
- Validates folders, excludeFolders, excludeFiles, rules, previousVersions
- Stores config in repository_configs table
- JSON Schema served at GET /api/v1/schema/trueref-config.json for IDE validation
- Rules injected at top of every query-docs response

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 09:06:50 +01:00
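Note on f31db2db2c: one way to read "lenient parser" is that malformed input degrades to an empty or partial config instead of failing the indexing run. The sketch below illustrates that idea for the string-list fields only. The function names, the `ParsedConfig` shape, and the exact leniency rules are assumptions; previousVersions is left out because its structure is not described in the commit message.

```typescript
type StringListField = 'folders' | 'excludeFolders' | 'excludeFiles' | 'rules';
const STRING_LIST_FIELDS: StringListField[] = ['folders', 'excludeFolders', 'excludeFiles', 'rules'];
type ParsedConfig = Partial<Record<StringListField, string[]>>;

// trueref.json takes precedence over context7.json when both are present.
function pickConfigFile(fileNames: string[]): string | null {
  if (fileNames.indexOf('trueref.json') !== -1) return 'trueref.json';
  if (fileNames.indexOf('context7.json') !== -1) return 'context7.json';
  return null;
}

// Assumed leniency rules: invalid JSON yields an empty config, fields of the
// wrong type are dropped, and non-string array items are filtered out.
function parseConfigLeniently(raw: string): ParsedConfig {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch (_err) {
    return {};
  }
  if (typeof data !== 'object' || data === null || Array.isArray(data)) {
    return {};
  }
  const source = data as Record<string, unknown>;
  const config: ParsedConfig = {};
  for (const field of STRING_LIST_FIELDS) {
    const value = source[field];
    if (Array.isArray(value)) {
      config[field] = value.filter((item): item is string => typeof item === 'string');
    }
  }
  return config;
}
```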
Giancarmine Salucci
b3c0849849 feat(TRUEREF-0011-0012): implement MCP server with stdio and HTTP transports
- resolve-library-id and query-docs tools with context7-identical schemas
- stdio transport for Claude Code, Cursor, and other MCP clients
- HTTP transport via StreamableHTTPServerTransport on configurable port
- /mcp endpoint with CORS and /ping health check
- mcp:start and mcp:http npm scripts
- Claude Code rule file at .claude/rules/trueref.md

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 09:06:41 +01:00
Giancarmine Salucci
21f6acbfa3 feat(TRUEREF-0009-0010): implement indexing pipeline job queue and public REST API
- SQLite-backed job queue with sequential processing and startup recovery
- Atomic snippet replacement in single transaction
- context7-compatible GET /api/v1/libs/search and GET /api/v1/context
- Token budget limiting and JSON/txt response format support
- CORS headers on all API routes via SvelteKit handle hook
- Library ID parser supporting /owner/repo and /owner/repo/version

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 09:06:35 +01:00
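Note on 21f6acbfa3: the two library ID shapes mentioned above, /owner/repo and /owner/repo/version, can be parsed with a single anchored pattern. This is a sketch under assumptions: `parseLibraryId` and `ParsedLibraryId` are hypothetical names, and the real parser may accept or reject inputs differently (for example, versions containing slashes).

```typescript
interface ParsedLibraryId {
  owner: string;
  repo: string;
  version?: string;
}

// Accepts "/owner/repo" or "/owner/repo/version"; returns null for anything else.
function parseLibraryId(id: string): ParsedLibraryId | null {
  const match = /^\/([^/]+)\/([^/]+)(?:\/([^/]+))?$/.exec(id);
  if (match === null) {
    return null;
  }
  const owner = match[1];
  const repo = match[2];
  const version: string | undefined = match[3];
  return version === undefined ? { owner, repo } : { owner, repo, version };
}

// Usage
const pinned = parseLibraryId('/facebook/react/v18.3.0');
const latest = parseLibraryId('/facebook/react');
```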
Giancarmine Salucci
d3d577a2e2 feat(TRUEREF-0008): implement hybrid semantic search with RRF
- Cosine similarity vector search over stored embeddings
- Reciprocal Rank Fusion (K=60) combining FTS5 + vector rankings
- Configurable alpha weight between keyword and semantic search
- Graceful degradation to FTS5-only when no embedding provider configured

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 09:06:25 +01:00
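Note on d3d577a2e2: Reciprocal Rank Fusion scores each document by summing 1 / (K + rank) over the rankings it appears in. The sketch below uses K = 60 as stated in the commit. How alpha enters the formula is an assumption: here the keyword list is weighted by (1 - alpha) and the semantic list by alpha, which may differ from the project's HybridSearchService. The name `fuseRankings` is hypothetical.

```typescript
const RRF_K = 60;

// Fuse two ranked ID lists (best first) into one list, best first.
// alpha = 0 uses only the keyword ranking, alpha = 1 only the semantic one.
function fuseRankings(keywordIds: string[], semanticIds: string[], alpha: number = 0.5): string[] {
  const scores = new Map<string, number>();

  const accumulate = (ids: string[], weight: number): void => {
    ids.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      const current = scores.get(id) || 0;
      scores.set(id, current + weight / (RRF_K + rank));
    });
  };

  accumulate(keywordIds, 1 - alpha);
  accumulate(semanticIds, alpha);

  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map((entry) => entry[0]);
}

// Usage: "b" and "c" appear in both lists, so they rise above "a".
const fused = fuseRankings(['a', 'b', 'c'], ['b', 'c', 'd']);
```

Because RRF works on ranks rather than raw scores, BM25 values and cosine similarities never need to be normalised against each other.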
Giancarmine Salucci
33bdf30709 feat(TRUEREF-0006): implement SQLite FTS5 full-text search engine
- BM25 ranking via SQLite FTS5 bm25() function
- Query preprocessor with wildcard expansion and special char escaping
- Library search with composite scoring (name match, trust score, snippet count)
- Trust score computation from stars, coverage, and source type
- Response formatters for library and snippet results

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 09:06:18 +01:00
Giancarmine Salucci
f6be3cfd47 feat(TRUEREF-0005): implement document parser and chunker
- Markdown parser with heading-based section splitting and code block extraction
- Code file parser with regex boundary detection for 10+ languages
- Sliding window chunker with configurable token limits and overlap
- Language detection from file extensions

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 09:06:12 +01:00
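Note on f6be3cfd47: a sliding window chunker with a token limit and overlap can be sketched as below. The tokenisation step is left out and tokens are taken as a pre-split string array; `chunkTokens` and its parameters are hypothetical names, not the project's chunker.

```typescript
// Split a token list into windows of at most maxTokens tokens, where each
// window repeats the last `overlap` tokens of the previous one.
function chunkTokens(tokens: string[], maxTokens: number, overlap: number): string[][] {
  if (maxTokens <= 0 || overlap < 0 || overlap >= maxTokens) {
    throw new RangeError('Require maxTokens > 0 and 0 <= overlap < maxTokens');
  }
  const chunks: string[][] = [];
  const step = maxTokens - overlap; // how far the window advances each time
  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + maxTokens));
    if (start + maxTokens >= tokens.length) {
      break; // this window already reached the end of the input
    }
  }
  return chunks;
}

// Usage: 7 tokens, windows of 4 with 1 token of overlap.
const windows = chunkTokens(['a', 'b', 'c', 'd', 'e', 'f', 'g'], 4, 1);
```

The early break avoids emitting a trailing chunk that contains only tokens already covered by the previous window.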
Giancarmine Salucci
1c15d6c474 feat(TRUEREF-0003-0004): implement GitHub and local filesystem crawlers
- GitHub crawler with rate limiting, semaphore concurrency, retry logic
- File filtering by extension, size, and trueref.json rules
- Local filesystem crawler with SHA-256 checksums and progress callbacks
- Shared types and file filter logic between both crawlers

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 09:06:07 +01:00
Giancarmine Salucci
cb253ffe98 feat(TRUEREF-0011): implement MCP server with stdio transport
Adds a Model Context Protocol server that exposes resolve-library-id
and query-docs tools via stdio, with tool schemas identical to context7
for drop-in compatibility with Claude Code, Cursor, and Zed.

- src/mcp/index.ts — server entry point (io.github.trueref/trueref)
- src/mcp/client.ts — HTTP client for TrueRef REST API (TRUEREF_API_URL)
- src/mcp/tools/resolve-library-id.ts — library search tool handler
- src/mcp/tools/query-docs.ts — documentation retrieval tool handler
- src/mcp/index.test.ts — integration tests spawning real server subprocess
- .claude/rules/trueref.md — Claude Code rule file for MCP usage
- package.json: mcp:start script using tsx

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 18:32:20 +01:00
Giancarmine Salucci
956b2a3a62 feat(TRUEREF-0009): implement indexing pipeline and job queue
Implements the end-to-end indexing pipeline with a SQLite-backed job
queue, startup recovery, and REST API endpoints for job status.

- IndexingPipeline: orchestrates crawl → parse → atomic replace → embed
  → repo stats update with progress tracking at each stage
- JobQueue: sequential SQLite-backed queue (no external broker), deduplicates
  active jobs per repository, drains queued jobs on startup
- startup.ts: stale job recovery (running→failed), repo state reset, singleton
  initialization wired from hooks.server.ts
- GET /api/v1/jobs with repositoryId/status/limit filtering
- GET /api/v1/jobs/[id] single job lookup
- hooks.server.ts: initializes DB and pipeline on server start
- 18 unit tests covering queue, pipeline stages, recovery, and atomicity

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 18:22:20 +01:00
Giancarmine Salucci
bf4caf5e3b feat(TRUEREF-0007): implement pluggable embedding generation and vector storage
Add EmbeddingProvider interface with OpenAI-compatible, local (optional
@xenova/transformers via dynamic import), and Noop (FTS5-only fallback)
implementations. EmbeddingService batches requests and persists Float32Array
blobs to snippet_embeddings. GET/PUT /api/v1/settings/embedding endpoints
read and write embedding config from the settings table.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 18:07:26 +01:00
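Note on bf4caf5e3b: persisting Float32Array embeddings as blobs comes down to a byte-level round trip. The sketch below uses plain Uint8Array so it has no dependencies (Node's Buffer is a Uint8Array subclass). The helper names are hypothetical and the project's EmbeddingService may do this differently.

```typescript
// View the vector's bytes without copying, suitable for writing to a BLOB column.
function embeddingToBlob(vector: Float32Array): Uint8Array {
  return new Uint8Array(vector.buffer, vector.byteOffset, vector.byteLength);
}

// Rebuild the vector from blob bytes. The bytes are copied into a fresh
// buffer first, because a Float32Array view needs a 4-byte-aligned offset
// and a blob read from a driver may not satisfy that.
function blobToEmbedding(blob: Uint8Array): Float32Array {
  if (blob.byteLength % 4 !== 0) {
    throw new RangeError('Blob length must be a multiple of 4 bytes');
  }
  const copy = new Uint8Array(blob.byteLength);
  copy.set(blob);
  return new Float32Array(copy.buffer);
}

// Usage: values chosen to be exactly representable in 32-bit floats.
const original = new Float32Array([0.5, -1.25, 3]);
const restored = blobToEmbedding(embeddingToBlob(original));
```

Both sides must agree on byte order; this sketch stores the platform's native order, which is little-endian on all common targets.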
Giancarmine Salucci
3d1bef5003 feat(TRUEREF-0002): implement repository management service and REST API
Add RepositoryService with full CRUD, ID resolution helpers, input
validation, six SvelteKit API routes (GET/POST /api/v1/libs,
GET/PATCH/DELETE /api/v1/libs/:id, POST /api/v1/libs/:id/index), and
37 unit tests covering all service operations.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 17:43:06 +01:00
Giancarmine Salucci
f57b622505 feat(TRUEREF-0001): implement complete database schema and core data models
Define all SQLite tables via Drizzle ORM (repositories, repository_versions,
documents, snippets, snippet_embeddings, indexing_jobs, repository_configs,
settings), generate the initial migration, create FTS5 virtual table and
sync triggers in fts.sql, add shared TypeScript types in src/lib/types.ts,
and write 21 unit tests covering insertions, cascade deletes, FK constraints,
blob storage, JSON fields, and FTS5 trigger behaviour.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 17:18:01 +01:00
Giancarmine Salucci
18437dfa7c chore: initial project scaffold
2026-03-22 17:08:15 +01:00