64 Commits

Author SHA1 Message Date
Giancarmine Salucci
542f4ce66c feat(TRUEREF-0014): implement repository version management
- VersionService with list, add, remove, getByTag, registerFromConfig
- GitHub tag discovery helper for validating tags before indexing
- Version ID format: /owner/repo/tag (e.g. /facebook/react/v18.3.0)
- GET/POST /api/v1/libs/:id/versions
- DELETE /api/v1/libs/:id/versions/:tag
- POST /api/v1/libs/:id/versions/:tag/index

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 09:06:59 +01:00
Giancarmine Salucci
f31db2db2c feat(TRUEREF-0013): implement trueref.json config file support
- Lenient parser for trueref.json and context7.json (trueref.json takes precedence)
- Validates folders, excludeFolders, excludeFiles, rules, previousVersions
- Stores config in repository_configs table
- JSON Schema served at GET /api/v1/schema/trueref-config.json for IDE validation
- Rules injected at top of every query-docs response

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 09:06:50 +01:00
Giancarmine Salucci
b3c0849849 feat(TRUEREF-0011-0012): implement MCP server with stdio and HTTP transports
- resolve-library-id and query-docs tools with context7-identical schemas
- stdio transport for Claude Code, Cursor, and other MCP clients
- HTTP transport via StreamableHTTPServerTransport on configurable port
- /mcp endpoint with CORS and /ping health check
- mcp:start and mcp:http npm scripts
- Claude Code rule file at .claude/rules/trueref.md

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 09:06:41 +01:00
Giancarmine Salucci
21f6acbfa3 feat(TRUEREF-0009-0010): implement indexing pipeline job queue and public REST API
- SQLite-backed job queue with sequential processing and startup recovery
- Atomic snippet replacement in single transaction
- context7-compatible GET /api/v1/libs/search and GET /api/v1/context
- Token budget limiting and JSON/txt response format support
- CORS headers on all API routes via SvelteKit handle hook
- Library ID parser supporting /owner/repo and /owner/repo/version

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 09:06:35 +01:00
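
The library ID formats named in this commit (/owner/repo and /owner/repo/version) could be parsed roughly as follows. This is a minimal sketch, not the repository's actual parser: the function name parseLibraryId, the ParsedLibraryId shape, and the choice to return null on malformed input are all assumptions made for illustration.

```typescript
// Hypothetical result shape; the real project may use different field names.
interface ParsedLibraryId {
  owner: string;
  repo: string;
  version?: string;
}

// Accepts "/owner/repo" or "/owner/repo/version"; returns null for anything else.
function parseLibraryId(id: string): ParsedLibraryId | null {
  const match = /^\/([^/]+)\/([^/]+)(?:\/([^/]+))?$/.exec(id);
  if (match === null) return null;

  const owner = match[1];
  const repo = match[2];
  const version = match[3]; // undefined when the optional segment is absent
  if (owner === undefined || repo === undefined) return null;

  return version === undefined ? { owner, repo } : { owner, repo, version };
}
```

Under these assumptions, parseLibraryId("/facebook/react/v18.3.0") yields an object with version "v18.3.0", while an ID without a leading slash is rejected.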
Giancarmine Salucci
d3d577a2e2 feat(TRUEREF-0008): implement hybrid semantic search with RRF
- Cosine similarity vector search over stored embeddings
- Reciprocal Rank Fusion (K=60) combining FTS5 + vector rankings
- Configurable alpha weight between keyword and semantic search
- Graceful degradation to FTS5-only when no embedding provider configured

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 09:06:25 +01:00
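
The Reciprocal Rank Fusion step described in this commit can be sketched as below. K = 60 is taken from the commit message; the function name rrfFuse, the input shape (ordered lists of snippet IDs), and the way alpha weights the two lists are assumptions, since the commit does not say how alpha enters the formula.

```typescript
interface FusedResult {
  id: string;
  score: number;
}

// Fuses two ranked ID lists (best first) with weighted Reciprocal Rank Fusion.
// Each list contributes weight / (k + rank) per item, with rank starting at 1.
// Assumption: alpha weights the semantic list and (1 - alpha) the keyword list.
function rrfFuse(
  keywordRanked: string[],
  semanticRanked: string[],
  alpha = 0.5,
  k = 60
): FusedResult[] {
  const scores = new Map<string, number>();

  const accumulate = (ids: string[], weight: number): void => {
    ids.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + weight / (k + rank));
    });
  };

  accumulate(keywordRanked, 1 - alpha);
  accumulate(semanticRanked, alpha);

  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}
```

When the semantic list is empty, as it would be with no embedding provider configured, the fused order in this sketch is simply the keyword order, which mirrors the FTS5-only degradation the commit mentions.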
Giancarmine Salucci
33bdf30709 feat(TRUEREF-0006): implement SQLite FTS5 full-text search engine
- BM25 ranking via SQLite FTS5 bm25() function
- Query preprocessor with wildcard expansion and special char escaping
- Library search with composite scoring (name match, trust score, snippet count)
- Trust score computation from stars, coverage, and source type
- Response formatters for library and snippet results

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 09:06:18 +01:00
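
One way a query preprocessor with wildcard expansion and special-character escaping might look is sketched below. It is an illustration only: the function name toFtsQuery and the specific policy (quote every term, double any embedded double quotes, append a prefix wildcard to the last term) are assumptions, not the behaviour of the committed code.

```typescript
// Turns free-form user input into an FTS5 MATCH expression.
// Each term becomes a quoted FTS5 string so operators and punctuation are
// treated as literal text; a trailing * makes the final term a prefix match.
function toFtsQuery(input: string): string {
  const terms = input
    .trim()
    .split(/\s+/)
    .filter((term) => term.length > 0);

  return terms
    .map((term, index) => {
      // Inside an FTS5 string, a double quote is escaped by doubling it.
      const quoted = `"${term.replace(/"/g, '""')}"`;
      return index === terms.length - 1 ? `${quoted}*` : quoted;
    })
    .join(" ");
}
```

The resulting string is meant to be passed as a bound parameter to a MATCH clause, never concatenated into SQL. An empty or whitespace-only input produces an empty string, which the caller would need to handle before querying.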
Giancarmine Salucci
f6be3cfd47 feat(TRUEREF-0005): implement document parser and chunker
- Markdown parser with heading-based section splitting and code block extraction
- Code file parser with regex boundary detection for 10+ languages
- Sliding window chunker with configurable token limits and overlap
- Language detection from file extensions

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 09:06:12 +01:00
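
A sliding window chunker with a token limit and overlap, as named in this commit, could work along these lines. The sketch assumes the input is already tokenized into an array of strings; the function name chunkTokens and its parameters are hypothetical, and the real chunker may count tokens differently.

```typescript
// Splits tokens into windows of at most maxTokens, where each window
// repeats the last `overlap` tokens of the previous one.
function chunkTokens(tokens: string[], maxTokens: number, overlap: number): string[][] {
  if (maxTokens <= 0 || overlap < 0 || overlap >= maxTokens) {
    throw new RangeError("require maxTokens > 0 and 0 <= overlap < maxTokens");
  }

  const chunks: string[][] = [];
  const step = maxTokens - overlap; // how far the window advances each time

  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + maxTokens));
    // Stop once the current window has reached the end of the input,
    // so no trailing chunk made purely of overlap is emitted.
    if (start + maxTokens >= tokens.length) break;
  }

  return chunks;
}
```

Requiring overlap to be strictly smaller than maxTokens guarantees the window always advances, so the loop terminates for any input length.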
Giancarmine Salucci
1c15d6c474 feat(TRUEREF-0003-0004): implement GitHub and local filesystem crawlers
- GitHub crawler with rate limiting, semaphore concurrency, retry logic
- File filtering by extension, size, and trueref.json rules
- Local filesystem crawler with SHA-256 checksums and progress callbacks
- Shared types and file filter logic between both crawlers

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 09:06:07 +01:00
Giancarmine Salucci
cb253ffe98 feat(TRUEREF-0011): implement MCP server with stdio transport
Adds a Model Context Protocol server that exposes resolve-library-id
and query-docs tools via stdio, with tool schemas identical to context7
for drop-in compatibility with Claude Code, Cursor, and Zed.

- src/mcp/index.ts — server entry point (io.github.trueref/trueref)
- src/mcp/client.ts — HTTP client for TrueRef REST API (TRUEREF_API_URL)
- src/mcp/tools/resolve-library-id.ts — library search tool handler
- src/mcp/tools/query-docs.ts — documentation retrieval tool handler
- src/mcp/index.test.ts — integration tests spawning real server subprocess
- .claude/rules/trueref.md — Claude Code rule file for MCP usage
- package.json: mcp:start script using tsx

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 18:32:20 +01:00
Giancarmine Salucci
956b2a3a62 feat(TRUEREF-0009): implement indexing pipeline and job queue
Implements the end-to-end indexing pipeline with a SQLite-backed job
queue, startup recovery, and REST API endpoints for job status.

- IndexingPipeline: orchestrates crawl → parse → atomic replace → embed
  → repo stats update with progress tracking at each stage
- JobQueue: sequential SQLite-backed queue (no external broker), deduplicates
  active jobs per repository, drains queued jobs on startup
- startup.ts: stale job recovery (running→failed), repo state reset, singleton
  initialization wired from hooks.server.ts
- GET /api/v1/jobs with repositoryId/status/limit filtering
- GET /api/v1/jobs/[id] single job lookup
- hooks.server.ts: initializes DB and pipeline on server start
- 18 unit tests covering queue, pipeline stages, recovery, and atomicity

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 18:22:20 +01:00
Giancarmine Salucci
bf4caf5e3b feat(TRUEREF-0007): implement pluggable embedding generation and vector storage
Add EmbeddingProvider interface with OpenAI-compatible, local (optional
@xenova/transformers via dynamic import), and Noop (FTS5-only fallback)
implementations. EmbeddingService batches requests and persists Float32Array
blobs to snippet_embeddings. GET/PUT /api/v1/settings/embedding endpoints
read and write embedding config from the settings table.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 18:07:26 +01:00
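
Persisting a Float32Array as a blob, as this commit describes, amounts to a byte-level round trip. The sketch below shows one way to do it; the helper names toBlob and fromBlob are hypothetical, and the actual EmbeddingService may use Node's Buffer or a different layout.

```typescript
// Copies the vector's bytes into a standalone Uint8Array suitable for a BLOB column.
// Using byteOffset/byteLength keeps this correct for views into a larger buffer.
function toBlob(vector: Float32Array): Uint8Array {
  return new Uint8Array(vector.buffer, vector.byteOffset, vector.byteLength).slice();
}

// Rebuilds the vector from blob bytes. The bytes are copied first so the
// Float32Array starts at offset 0, which satisfies its 4-byte alignment rule.
function fromBlob(blob: Uint8Array): Float32Array {
  if (blob.byteLength % 4 !== 0) {
    throw new RangeError("blob length must be a multiple of 4 bytes");
  }
  const copy = blob.slice();
  return new Float32Array(copy.buffer, 0, copy.byteLength / 4);
}
```

The bytes are stored in the platform's native byte order, so this sketch assumes the blob is written and read on machines with the same endianness.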
Giancarmine Salucci
3d1bef5003 feat(TRUEREF-0002): implement repository management service and REST API
Add RepositoryService with full CRUD, ID resolution helpers, input
validation, six SvelteKit API routes (GET/POST /api/v1/libs,
GET/PATCH/DELETE /api/v1/libs/:id, POST /api/v1/libs/:id/index), and
37 unit tests covering all service operations.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 17:43:06 +01:00
Giancarmine Salucci
f57b622505 feat(TRUEREF-0001): implement complete database schema and core data models
Define all SQLite tables via Drizzle ORM (repositories, repository_versions,
documents, snippets, snippet_embeddings, indexing_jobs, repository_configs,
settings), generate the initial migration, create FTS5 virtual table and
sync triggers in fts.sql, add shared TypeScript types in src/lib/types.ts,
and write 21 unit tests covering insertions, cascade deletes, FK constraints,
blob storage, JSON fields, and FTS5 trigger behaviour.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 17:18:01 +01:00
Giancarmine Salucci
18437dfa7c chore: initial project scaffold
2026-03-22 17:08:15 +01:00