chore(FEEDBACK-0001): linting

Giancarmine Salucci
2026-03-27 02:23:01 +01:00
parent 16436bfab2
commit 5a3c27224d
102 changed files with 5108 additions and 4976 deletions


@@ -17,6 +17,7 @@ The core use case is enabling AI coding assistants (Claude Code, Cursor, Zed, et
## 2. Problem Statement
### 2.1 Context7's Limitations
- The indexing and crawling backend is entirely private and closed-source.
- Only public libraries already in the context7.com catalog are available.
- Private, internal, or niche repositories cannot be added.
@@ -24,6 +25,7 @@ The core use case is enabling AI coding assistants (Claude Code, Cursor, Zed, et
- No way to self-host for air-gapped or compliance-constrained environments.
### 2.2 The Gap
Teams with internal SDKs, private libraries, proprietary documentation, or a need for data sovereignty have no tooling that provides context7-equivalent LLM documentation retrieval.
---
@@ -31,6 +33,7 @@ Teams with internal SDKs, private libraries, proprietary documentation, or a nee
## 3. Goals & Non-Goals
### Goals
- Replicate all context7 capabilities: library search, documentation retrieval, MCP tools (`resolve-library-id`, `query-docs`).
- Support both GitHub-hosted and local filesystem repositories.
- Provide a full indexing pipeline: crawl → parse → chunk → embed → store → query.
@@ -42,6 +45,7 @@ Teams with internal SDKs, private libraries, proprietary documentation, or a nee
- Self-hostable with minimal dependencies (SQLite-first, no external vector DB required).
### Non-Goals (v1)
- Authentication & authorization (deferred to a future version).
- Skill generation (context7 CLI skill feature).
- Multi-tenant SaaS mode.
@@ -54,9 +58,11 @@ Teams with internal SDKs, private libraries, proprietary documentation, or a nee
## 4. Users & Personas
### Primary: The Developer / Tech Lead
Configures TrueRef, adds repositories, and integrates the MCP server with their AI coding assistant. Technical and comfortable with the CLI and config files.
### Secondary: The AI Coding Assistant
The "user" at query time. Calls `resolve-library-id` and `query-docs` via MCP to retrieve documentation snippets for code generation.
---
@@ -100,25 +106,27 @@ The "user" at query time. Calls `resolve-library-id` and `query-docs` via MCP to
```
### Technology Stack
| Layer | Technology |
| ---------------- | ------------------------------------------------------------------ |
| Framework | SvelteKit (Node adapter) |
| Language | TypeScript |
| Database | SQLite via better-sqlite3 + drizzle-orm |
| Full-Text Search | SQLite FTS5 |
| Vector Search | SQLite `sqlite-vec` extension (cosine similarity) |
| Embeddings | Pluggable: local (transformers.js / ONNX) or OpenAI-compatible API |
| MCP Protocol | `@modelcontextprotocol/sdk` |
| HTTP | SvelteKit API routes + optional standalone MCP HTTP server |
| CSS | TailwindCSS v4 |
| Testing | Vitest |
| Linting | ESLint + Prettier |
---
## 6. Data Model
### 6.1 Repositories
A `Repository` is the top-level entity. It maps to a GitHub repo or local directory.
```
@@ -141,6 +149,7 @@ Repository {
```
### 6.2 Repository Versions
```
RepositoryVersion {
id TEXT PRIMARY KEY
@@ -153,6 +162,7 @@ RepositoryVersion {
```
### 6.3 Documents (parsed files)
```
Document {
id TEXT PRIMARY KEY
@@ -169,6 +179,7 @@ Document {
```
### 6.4 Snippets (indexed chunks)
```
Snippet {
id TEXT PRIMARY KEY
@@ -186,6 +197,7 @@ Snippet {
```
### 6.5 Indexing Jobs
```
IndexingJob {
id TEXT PRIMARY KEY
@@ -203,6 +215,7 @@ IndexingJob {
```
### 6.6 Repository Configuration (`trueref.json`)
```
RepositoryConfig {
repositoryId TEXT FK → Repository
@@ -221,15 +234,19 @@ RepositoryConfig {
## 7. Core Features
### F1: Repository Management
Add, remove, update, and list repositories. Support GitHub (public, or private via token) and local filesystem sources. Trigger indexing on demand or on a schedule.
### F2: GitHub Crawler
Fetch repository file trees via the GitHub Trees API. Download file contents. Respect `trueref.json` include/exclude rules. Support rate limiting and incremental re-indexing (checksum-based).
### F3: Local Filesystem Crawler
Walk directory trees. Apply include/exclude rules from `trueref.json`. Watch for file changes (optional).
### F4: Document Parser & Chunker
- Parse Markdown files into sections (heading-based splitting).
- Extract code blocks from Markdown.
- Parse standalone code files into function/class-level chunks.
@@ -237,16 +254,19 @@ Walk directory trees. Apply include/exclude rules from `trueref.json`. Watch for
- Produce structured `Snippet` records (type: "code" or "info").
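The heading-based splitting described above can be sketched as follows. This is a minimal illustration, not TrueRef's parser: the `Section` shape and the `splitByHeadings` name are assumptions, and only ATX headings (`#` to `######`) are recognized. Lines inside fenced code blocks are skipped so that a `#` comment in a code sample does not start a new section.

```typescript
// Hypothetical section shape; the real Snippet schema lives in the data model.
interface Section {
  heading: string;
  level: number; // 0 for content that appears before the first heading
  body: string;
}

// Built with repeat() so this example contains no literal fence marker.
const FENCE = "`".repeat(3);

function splitByHeadings(markdown: string): Section[] {
  const sections: Section[] = [];
  let current: Section = { heading: "", level: 0, body: "" };
  let inFence = false;

  for (const line of markdown.split("\n")) {
    // Toggle fence state so headings inside code blocks are ignored.
    if (line.trimStart().startsWith(FENCE)) inFence = !inFence;

    const match = inFence ? null : /^(#{1,6})\s+(.*)$/.exec(line);
    if (match) {
      // A new heading closes the section that was being collected.
      if (current.heading || current.body.trim()) sections.push(current);
      current = {
        heading: (match[2] ?? "").trim(),
        level: (match[1] ?? "").length,
        body: "",
      };
    } else {
      current.body += line + "\n";
    }
  }
  if (current.heading || current.body.trim()) sections.push(current);
  return sections;
}
```

Each returned section could then be turned into one or more "info" snippets, with its fenced blocks extracted separately as "code" snippets.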
### F5: Embedding & Vector Storage
- Generate embeddings for each snippet using a pluggable embeddings backend.
- Store embeddings as binary blobs in SQLite (sqlite-vec).
- Support fallback to FTS5-only search when no embedding provider is configured.
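As a rough sketch of what "embeddings as binary blobs" can look like, the helpers below convert a `Float32Array` to raw bytes and back, and compute cosine similarity in plain TypeScript. The function names are hypothetical, and how the bytes are bound to a `sqlite-vec` column is deliberately left out, since that depends on the driver.

```typescript
// Copy the vector's bytes into a standalone byte array suitable for a BLOB column.
function encodeEmbedding(vector: Float32Array): Uint8Array {
  return new Uint8Array(vector.buffer, vector.byteOffset, vector.byteLength).slice();
}

// Copy first so the backing buffer starts at offset 0 and is 4-byte aligned.
function decodeEmbedding(blob: Uint8Array): Float32Array {
  const copy = blob.slice();
  return new Float32Array(copy.buffer, 0, copy.byteLength / 4);
}

// Cosine similarity: dot(a, b) / (|a| * |b|); returns 0 for a zero vector.
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  return denominator === 0 ? 0 : dot / denominator;
}
```

In practice the similarity computation would normally be delegated to the vector extension; the pure function is shown only to make the metric explicit.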
### F6: Semantic Search Engine
- Hybrid search: vector similarity + FTS5 keyword matching (BM25) with reciprocal rank fusion.
- Query-time retrieval: given `libraryId + query`, return ranked snippets.
- Library search: given `libraryName + query`, return matching repositories.
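To make the fusion step concrete, here is a small sketch of reciprocal rank fusion over two ranked lists of snippet IDs. Each ID scores the sum of `1 / (k + rank)` across the lists it appears in. The constant `k = 60` is a commonly used default and an assumption here, not a documented TrueRef setting; the function name and sample IDs are likewise illustrative.

```typescript
interface FusedResult {
  id: string;
  score: number;
}

// Combine several rankings (best first) into one list ordered by RRF score.
function reciprocalRankFusion(rankings: string[][], k = 60): FusedResult[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Hypothetical result lists from the vector index and the FTS5 (BM25) query.
const vectorHits = ["s3", "s1", "s7"];
const keywordHits = ["s1", "s9", "s3"];
const fused = reciprocalRankFusion([vectorHits, keywordHits]);
```

With these inputs, `s1` and `s3` rise to the top because both retrievers returned them, while `s9` and `s7` each appear in only one list. Because RRF uses ranks rather than raw scores, BM25 and cosine values never need to be normalized against each other.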
### F7: REST API (`/api/v1/*`)
- `GET /api/v1/libs/search?query=&libraryName=` — search libraries (context7-compatible)
- `GET /api/v1/context?query=&libraryId=&type=json|txt` — fetch documentation
- `GET /api/v1/libs` — list all indexed libraries
@@ -256,12 +276,14 @@ Walk directory trees. Apply include/exclude rules from `trueref.json`. Watch for
- `GET /api/v1/jobs/:id` — get indexing job status
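A client call to the context endpoint might be assembled as in the sketch below. The base URL and the helper name are assumptions made for illustration; only the path and the `query`, `libraryId`, and `type` parameters come from the endpoint list above.

```typescript
// Hypothetical base URL; point this at wherever the TrueRef server listens.
const BASE_URL = "http://localhost:5173";

type ContextFormat = "json" | "txt";

// Build the request URL for GET /api/v1/context with properly encoded params.
function buildContextUrl(
  libraryId: string,
  query: string,
  type: ContextFormat = "json",
): URL {
  const url = new URL("/api/v1/context", BASE_URL);
  url.searchParams.set("libraryId", libraryId);
  url.searchParams.set("query", query);
  url.searchParams.set("type", type);
  return url;
}

const contextUrl = buildContextUrl("/facebook/react", "how to use hooks");
// The URL can then be passed to fetch(); the response shape is not shown here.
```

Using `URLSearchParams` matters because library IDs contain slashes, which must be percent-encoded inside a query string.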
### F8: MCP Server
- Tool: `resolve-library-id` — search for libraries by name
- Tool: `query-docs` — fetch documentation by libraryId + query
- Transport: stdio (primary), HTTP (optional)
- Compatible with Claude Code, Cursor, and other MCP-aware tools
### F9: Web UI — Repository Dashboard
- List all repositories with status, snippet count, last indexed date
- Add/remove repositories (GitHub URL or local path)
- Trigger re-indexing
@@ -269,23 +291,27 @@ Walk directory trees. Apply include/exclude rules from `trueref.json`. Watch for
- View repository config (`trueref.json`)
### F10: Web UI — Search Explorer
- Interactive search interface (resolve library → query docs)
- Preview snippets with syntax highlighting
- View raw document content
### F11: `trueref.json` Config Support
- Parse `trueref.json` from repo root (or `context7.json` for compatibility)
- Apply `folders`, `excludeFolders`, `excludeFiles` during crawling
- Inject `rules` into LLM context alongside snippets
- Support `previousVersions` for versioned documentation
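One possible reading of the `folders`, `excludeFolders`, and `excludeFiles` rules is sketched below. The matching semantics are an assumption: folders are treated as path prefixes relative to the repo root, and `excludeFiles` entries are compared against the bare file name. The actual rules may support patterns that this sketch does not handle.

```typescript
// Subset of trueref.json relevant to crawling; all fields optional.
interface CrawlRules {
  folders?: string[];
  excludeFolders?: string[];
  excludeFiles?: string[];
}

// True when `path` is the folder itself or lies somewhere beneath it.
function isUnder(path: string, folder: string): boolean {
  const normalized = folder.replace(/^\/+|\/+$/g, "");
  return path === normalized || path.startsWith(normalized + "/");
}

// Exclusions win over inclusions; an empty `folders` list means "everything".
function shouldIndex(path: string, rules: CrawlRules): boolean {
  const fileName = path.split("/").pop() ?? path;
  if (rules.excludeFiles?.includes(fileName)) return false;
  if (rules.excludeFolders?.some((folder) => isUnder(path, folder))) return false;
  if (rules.folders && rules.folders.length > 0) {
    return rules.folders.some((folder) => isUnder(path, folder));
  }
  return true;
}

// Hypothetical configuration used for illustration only.
const exampleRules: CrawlRules = {
  folders: ["docs"],
  excludeFolders: ["docs/archive"],
  excludeFiles: ["CHANGELOG.md"],
};
```

Under these rules, `docs/guide.md` is indexed, while `docs/archive/old.md`, `docs/CHANGELOG.md`, and `src/index.ts` are skipped.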
### F12: Indexing Pipeline & Job Queue
- SQLite-backed job queue (no external message broker required)
- Sequential processing with progress tracking
- Error recovery and retry logic
- Incremental re-indexing using file checksums
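Checksum-based incremental re-indexing can be pictured as a diff between the checksums stored at the last run and those computed from the current crawl. The sketch below assumes SHA-256 over file contents and a simple path-to-checksum map; neither choice is confirmed by this document, and the names are hypothetical.

```typescript
import { createHash } from "node:crypto";

// Hex-encoded SHA-256 of a file's contents.
function checksum(content: string): string {
  return createHash("sha256").update(content).digest("hex");
}

interface FileDiff {
  added: string[];
  changed: string[];
  removed: string[];
}

// Compare path -> checksum maps from the previous and current crawl.
function diffByChecksum(
  previous: Map<string, string>,
  current: Map<string, string>,
): FileDiff {
  const diff: FileDiff = { added: [], changed: [], removed: [] };
  for (const [path, sum] of current) {
    const before = previous.get(path);
    if (before === undefined) diff.added.push(path);
    else if (before !== sum) diff.changed.push(path);
  }
  for (const path of previous.keys()) {
    if (!current.has(path)) diff.removed.push(path);
  }
  return diff;
}
```

Only files in `added` and `changed` would need to be re-parsed and re-embedded, and snippets belonging to `removed` files could be deleted, which keeps a re-index proportional to what actually changed.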
### F13: Version Support
- Index specific git tags/branches per repository
- Serve version-specific context when the `libraryId` includes a version (`/owner/repo/v1.2.3`)
- UI for managing available versions
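Resolving a version from a library ID could look like the sketch below, which accepts `/owner/repo` and `/owner/repo/version`. The `parseLibraryId` helper and the `LibraryId` shape are illustrative assumptions, and the sketch treats the version as a single path segment, so branch names containing slashes would need different handling.

```typescript
interface LibraryId {
  owner: string;
  repo: string;
  version?: string; // absent when the ID refers to the default version
}

// Returns null when the ID does not match /owner/repo or /owner/repo/version.
function parseLibraryId(id: string): LibraryId | null {
  const match = /^\/([^/]+)\/([^/]+)(?:\/([^/]+))?$/.exec(id);
  if (!match) return null;
  const owner = match[1];
  const repo = match[2];
  const version = match[3];
  if (owner === undefined || repo === undefined) return null;
  return version === undefined ? { owner, repo } : { owner, repo, version };
}
```

A query handler could use the parsed `version` to select the matching `RepositoryVersion` and fall back to the default index when it is absent.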
@@ -296,12 +322,13 @@ Walk directory trees. Apply include/exclude rules from `trueref.json`. Watch for
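TrueRef's REST API mirrors the query parameters and response shapes of context7's `/api/v2/*` interface under its own `/api/v1/*` prefix, so existing clients should only need to change the base path: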
| context7 Endpoint | TrueRef Endpoint | Notes |
| ------------------------- | ------------------------- | -------------------------------------- |
| `GET /api/v2/libs/search` | `GET /api/v1/libs/search` | Same query params |
| `GET /api/v2/context` | `GET /api/v1/context` | Same query params, same response shape |
The MCP tool names and input schemas are identical:
- `resolve-library-id` with `libraryName` + `query`
- `query-docs` with `libraryId` + `query`
@@ -312,20 +339,24 @@ Library IDs follow the same convention: `/owner/repo` or `/owner/repo/version`.
## 9. Non-Functional Requirements
### Performance
- Library search: < 200ms p99
- Documentation retrieval: < 500ms p99 for 20 snippets
- Indexing throughput: > 1,000 files/minute (GitHub-hosted sources are additionally bounded by GitHub API rate limits)
### Reliability
- Failed indexing jobs must not corrupt existing indexed data
- Atomic snippet replacement during re-indexing
### Portability
- Single SQLite file for all data
- Runs on Linux, macOS, Windows (Node.js 20+)
- No required external services beyond optional embedding API
### Scalability (v1 constraints)
- Designed for single-node deployment
- SQLite is suitable for up to ~500 repositories and ~500k snippets
@@ -333,26 +364,26 @@ Library IDs follow the same convention: `/owner/repo` or `/owner/repo/version`.
## 10. Milestones & Feature Order
| ID | Feature | Priority | Depends On |
| ------------ | ---------------------------------------- | -------- | -------------------------- |
| TRUEREF-0001 | Database schema & core data models | P0 | — |
| TRUEREF-0002 | Repository management service & REST API | P0 | TRUEREF-0001 |
| TRUEREF-0003 | GitHub repository crawler | P0 | TRUEREF-0001 |
| TRUEREF-0004 | Local filesystem crawler | P1 | TRUEREF-0001 |
| TRUEREF-0005 | Document parser & chunker | P0 | TRUEREF-0001 |
| TRUEREF-0006 | SQLite FTS5 full-text search | P0 | TRUEREF-0005 |
| TRUEREF-0007 | Embedding generation & vector storage | P1 | TRUEREF-0005 |
| TRUEREF-0008 | Hybrid semantic search engine | P1 | TRUEREF-0006, TRUEREF-0007 |
| TRUEREF-0009 | Indexing pipeline & job queue | P0 | TRUEREF-0003, TRUEREF-0005 |
| TRUEREF-0010 | REST API (search + context endpoints) | P0 | TRUEREF-0006, TRUEREF-0009 |
| TRUEREF-0011 | MCP server (stdio transport) | P0 | TRUEREF-0010 |
| TRUEREF-0012 | MCP server (HTTP transport) | P1 | TRUEREF-0011 |
| TRUEREF-0013 | `trueref.json` config file support | P0 | TRUEREF-0003 |
| TRUEREF-0014 | Repository version management | P1 | TRUEREF-0003 |
| TRUEREF-0015 | Web UI — repository dashboard | P1 | TRUEREF-0002, TRUEREF-0009 |
| TRUEREF-0016 | Web UI — search explorer | P2 | TRUEREF-0010, TRUEREF-0015 |
| TRUEREF-0017 | Incremental re-indexing (checksum diff) | P1 | TRUEREF-0009 |
| TRUEREF-0018 | Embedding provider configuration UI | P2 | TRUEREF-0007, TRUEREF-0015 |
---