From 5c738be624278aa6e17b68860f2d47a0782d3e92 Mon Sep 17 00:00:00 2001
From: Giancarmine Salucci
Date: Wed, 25 Mar 2026 14:30:18 +0100
Subject: [PATCH] docs: add Docker deployment section to README

Documents the full Docker deployment workflow including docker compose
quickstart, environment variable reference, web-only and MCP-only run
modes, health check endpoints, IDE MCP config snippets (VS Code,
IntelliJ, Claude Code), and mounting local repositories as read-only
volumes.

Co-Authored-By: Claude Sonnet 4.6
---
 README.md | 800 ++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 781 insertions(+), 19 deletions(-)

diff --git a/README.md b/README.md
index e7e5a7b..d2336a8 100644
--- a/README.md
+++ b/README.md
@@ -1,42 +1,804 @@
-# sv
+# TrueRef
 
-Everything you need to build a Svelte project, powered by [`sv`](https://github.com/sveltejs/cli).
+TrueRef is a self-hosted documentation retrieval platform for AI coding assistants.
 
-## Creating a project
+It indexes public or private repositories, stores searchable documentation and code snippets locally, exposes a context7-compatible REST API, and ships an MCP server that lets tools like VS Code, IntelliJ, and Claude Code query your indexed libraries with the same two-step flow used by context7:
 
-If you're seeing this, you've probably already done this step. Congrats!
+1. `resolve-library-id`
+2. `query-docs`
+
+The goal is straightforward: give your assistants accurate, current, version-aware documentation from repositories you control.
+
+## What TrueRef does
+
+- Indexes GitHub repositories and local folders.
+- Parses Markdown and source files into searchable snippets.
+- Stores metadata in SQLite.
+- Supports keyword search out of the box with SQLite FTS5.
+- Supports semantic and hybrid search when an embedding provider is configured.
+- Exposes REST endpoints for library discovery and documentation retrieval.
+- Exposes an MCP server over stdio and HTTP for AI clients.
+- Provides a SvelteKit web UI for repository management, search, indexing jobs, and embedding settings.
+- Supports repository-level configuration through `trueref.json` or `context7.json`.
+
+## Project status
+
+TrueRef is under active development. The current codebase already includes:
+
+- repository management
+- indexing jobs and recovery on restart
+- local and GitHub crawling
+- version registration support
+- context7-compatible API endpoints
+- MCP stdio and HTTP transports
+- configurable embedding providers
+
+## Architecture
+
+TrueRef is organized into four main layers:
+
+1. Web UI
+   SvelteKit application for adding repositories, monitoring indexing, searching content, and configuring embeddings.
+2. REST API
+   Endpoints under `/api/v1/*` for repository management, search, schema discovery, job status, and settings.
+3. Indexing pipeline
+   Crawlers, parsers, chunking logic, snippet storage, and optional embedding generation.
+4. MCP server
+   A thin compatibility layer that forwards `resolve-library-id` and `query-docs` requests to the TrueRef REST API.
+
+At runtime, the app uses SQLite via `better-sqlite3` and Drizzle, plus optional embedding providers for semantic retrieval.
+
+## Tech stack
+
+- TypeScript
+- SvelteKit
+- SQLite (`better-sqlite3`)
+- Drizzle ORM / Drizzle Kit
+- Tailwind CSS
+- Vitest
+- Model Context Protocol SDK (`@modelcontextprotocol/sdk`)
+
+## Core concepts
+
+### Libraries
+
+Each indexed repository becomes a library with an ID such as `/facebook/react`.
+
+### Versions
+
+Libraries can register version tags. Queries can target a specific version by using a library ID such as `/facebook/react/v18.3.0`.
+
+### Snippets
+
+Documents are split into code and info snippets. These snippets are what search and MCP responses return.
+
+### Rules
+
+Repository rules defined in `trueref.json` are prepended to `query-docs` responses so assistants get usage constraints along with the retrieved content.
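As a rough illustration of the library and version ID shapes described under Core concepts, the TypeScript sketch below splits an ID such as `/facebook/react/v18.3.0` into its parts. `parseLibraryId` and `ParsedLibraryId` are hypothetical names, not part of TrueRef, and the sketch assumes IDs always have the form `/owner/repo` with at most one trailing version segment.

```typescript
// Hypothetical shape for illustration only; TrueRef's internal types may differ.
interface ParsedLibraryId {
  owner: string;
  repo: string;
  version?: string;
}

// Assumption: an ID is "/owner/repo" or "/owner/repo/version",
// and no segment contains a slash.
function parseLibraryId(id: string): ParsedLibraryId | null {
  const match = /^\/([^/]+)\/([^/]+)(?:\/([^/]+))?$/.exec(id);
  if (match === null) return null; // not in the assumed shape

  const parsed: ParsedLibraryId = { owner: match[1], repo: match[2] };
  if (match[3] !== undefined) parsed.version = match[3];
  return parsed;
}
```

Under these assumptions, `parseLibraryId("/facebook/react")` yields an owner and repo with no version, while an input without the leading slash is rejected with `null`.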
+
+## Requirements
+
+- Node.js 20+
+- npm
+- SQLite-compatible filesystem access
+
+Optional:
+
+- an OpenAI-compatible embedding API if you want semantic search
+- `@xenova/transformers` if you want local embedding generation
+
+## Getting started
+
+### 1. Install dependencies
 
 ```sh
-# create a new project
-npx sv create my-app
+npm install
 ```
 
-To recreate this project with the same configuration:
+### 2. Configure the database
+
+TrueRef requires `DATABASE_URL`.
+
+Example:
 
 ```sh
-# recreate this project
-npx sv@0.12.8 create --template minimal --types ts --add prettier eslint vitest="usages:unit,component" tailwindcss="plugins:none" sveltekit-adapter="adapter:node" drizzle="database:sqlite+sqlite:better-sqlite3" mcp="ide:claude-code,vscode,other+setup:remote" --install npm trueref
+export DATABASE_URL="$PWD/trueref.db"
 ```
 
-## Developing
+You can place the same value in your shell profile or your local environment loading mechanism.
 
-Once you've created a project and installed dependencies with `npm install` (or `pnpm install` or `yarn`), start a development server:
+### 3. Create or migrate the schema
+
+For a fresh local setup:
+
+```sh
+npm run db:migrate
+```
+
+During development, if you are iterating on schema changes, `db:push` is also available:
+
+```sh
+npm run db:push
+```
+
+### 4. Start the web app
 
 ```sh
 npm run dev
-
-# or start the server and open the app in a new browser tab
-npm run dev -- --open
 ```
 
-## Building
+By default, the app is served by Vite on `http://localhost:5173`.
 
-To create a production version of your app:
+## Local development workflow
+
+Typical loop:
+
+1. Start the app with `npm run dev`.
+2. Open the UI in the browser.
+3. Add a repository from the Repositories page.
+4. Wait for indexing to finish.
+5. Test retrieval either in the Search page, through the REST API, or through MCP.
+
+## Web UI usage
+
+The current UI covers three main workflows.
+
+### Repositories
+
+Use the main page to:
+
+- add a GitHub repository
+- add a local folder
+- trigger re-indexing
+- delete an indexed repository
+- monitor active indexing jobs
+
+### Search
+
+Use the Search page to:
+
+- search for a library by name
+- query documentation for a chosen library
+- inspect the text returned by the context endpoint
+
+### Settings
+
+Use the Settings page to configure embeddings.
+
+Supported modes:
+
+- `none`: keyword search only
+- `openai`: any OpenAI-compatible embeddings endpoint
+- `local`: local ONNX model via `@xenova/transformers`
+
+If no embedding provider is configured, TrueRef still works with FTS5-only search.
+
+## Repository configuration
+
+TrueRef supports a repository-local config file named `trueref.json`.
+
+For compatibility with existing context7-style repositories, `context7.json` is also supported.
+
+### What the config controls
+
+- project display title
+- project description
+- included folders
+- excluded folders
+- excluded file names
+- assistant-facing usage rules
+- previously released versions
+
+### Example `trueref.json`
+
+```json
+{
+  "$schema": "http://localhost:5173/api/v1/schema/trueref-config.json",
+  "projectTitle": "My Internal SDK",
+  "description": "Internal SDK for billing, auth, and event ingestion.",
+  "folders": ["src/", "docs/"],
+  "excludeFolders": ["tests/", "fixtures/", "node_modules/"],
+  "excludeFiles": ["CHANGELOG.md"],
+  "rules": [
+    "Prefer named imports over wildcard imports.",
+    "Use the async client API for all network calls."
+  ],
+  "previousVersions": [
+    {
+      "tag": "v1.2.3",
+      "title": "Version 1.2.3"
+    }
+  ]
+}
+```
+
+### JSON schema
+
+You can point your editor to the live schema served by TrueRef:
+
+```text
+http://localhost:5173/api/v1/schema/trueref-config.json
+```
+
+That enables validation and autocomplete in editors that support JSON Schema references.
+
+## REST API
+
+TrueRef exposes a context7-compatible API under `/api/v1`.
+
+### Library search
+
+Find candidate library IDs.
+
+```http
+GET /api/v1/libs/search?libraryName=react&query=hooks&type=json
+```
+
+Example:
 
 ```sh
-npm run build
+curl "http://localhost:5173/api/v1/libs/search?libraryName=react&query=hooks&type=json"
 ```
 
-You can preview the production build with `npm run preview`.
+### Documentation retrieval
 
-> To deploy your app, you may need to install an [adapter](https://svelte.dev/docs/kit/adapters) for your target environment.
+Fetch snippets for a specific library ID.
+
+```http
+GET /api/v1/context?libraryId=/facebook/react&query=how%20to%20use%20useEffect&type=txt
+```
+
+Example:
+
+```sh
+curl "http://localhost:5173/api/v1/context?libraryId=/facebook/react&query=how%20to%20use%20useEffect&type=txt"
+```
+
+### Repository management
+
+List repositories:
+
+```sh
+curl "http://localhost:5173/api/v1/libs"
+```
+
+Add a local repository:
+
+```sh
+curl -X POST "http://localhost:5173/api/v1/libs" \
+  -H "content-type: application/json" \
+  -d '{
+    "source": "local",
+    "sourceUrl": "/absolute/path/to/my-library",
+    "title": "My Library"
+  }'
+```
+
+Add a GitHub repository:
+
+```sh
+curl -X POST "http://localhost:5173/api/v1/libs" \
+  -H "content-type: application/json" \
+  -d '{
+    "source": "github",
+    "sourceUrl": "https://github.com/facebook/react",
+    "title": "React"
+  }'
+```
+
+Trigger re-indexing:
+
+```sh
+curl -X POST "http://localhost:5173/api/v1/libs/%2Ffacebook%2Freact/index"
+```
+
+Check job status:
+
+```sh
+curl "http://localhost:5173/api/v1/jobs"
+curl "http://localhost:5173/api/v1/jobs/"
+```
+
+### Response formats
+
+The two search endpoints support:
+
+- `type=json` for structured consumption
+- `type=txt` for direct LLM injection
+
+## Search behavior
+
+### Without embeddings
+
+TrueRef uses keyword search with SQLite FTS5.
+
+This is the simplest setup and requires no external model or API key.
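Since keyword-only mode needs no extra setup, the two search endpoints above can be called as soon as a repository is indexed. The TypeScript sketch below builds those request URLs. The helper names are hypothetical, and it assumes the server decodes percent-encoded query values in the usual way (the `curl` examples above pass the slashes in `libraryId` unencoded).

```typescript
// The two response formats listed under "Response formats".
type ResponseType = "json" | "txt";

// Encode key/value pairs as a query string, preserving insertion order.
function toQueryString(params: Record<string, string>): string {
  return Object.keys(params)
    .map((key) => `${encodeURIComponent(key)}=${encodeURIComponent(params[key])}`)
    .join("&");
}

// Hypothetical helper: URL for GET /api/v1/libs/search.
function buildLibrarySearchUrl(
  baseUrl: string,
  libraryName: string,
  query: string,
  type: ResponseType = "json"
): string {
  return `${baseUrl}/api/v1/libs/search?${toQueryString({ libraryName, query, type })}`;
}

// Hypothetical helper: URL for GET /api/v1/context.
function buildContextUrl(
  baseUrl: string,
  libraryId: string,
  query: string,
  type: ResponseType = "txt"
): string {
  return `${baseUrl}/api/v1/context?${toQueryString({ libraryId, query, type })}`;
}
```

The resulting strings can be passed to `curl` or to `fetch` on Node.js 18 or later. Encoding every value keeps questions that contain `&` or `?` from being split into extra parameters.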
+
+### With embeddings
+
+TrueRef generates embeddings for snippets and uses hybrid retrieval:
+
+- vector similarity
+- keyword matching
+- ranking fusion
+
+This improves recall for conceptual and natural-language questions.
+
+## Embedding configuration
+
+Embeddings are configured through the UI or through the embedding settings API.
+
+### OpenAI-compatible provider
+
+Provide:
+
+- base URL
+- API key
+- model name
+- optional dimensions override
+
+### Local provider
+
+Install:
+
+```sh
+npm install @xenova/transformers
+```
+
+Then select `local` in Settings.
+
+### Disable embeddings
+
+Set the provider to `none` to use FTS5-only retrieval.
+
+## MCP server
+
+TrueRef includes an MCP server in `src/mcp/index.ts`.
+
+It exposes two tools:
+
+- `resolve-library-id`
+- `query-docs`
+
+The tool names and argument shapes intentionally mirror context7 so existing workflows can switch over with minimal changes.
+
+### Environment variables
+
+The MCP server uses:
+
+- `TRUEREF_API_URL`
+  Base URL of the TrueRef web app. Default: `http://localhost:5173`
+- `PORT`
+  Used only for HTTP transport. Default: `3001`
+
+### Start MCP over stdio
+
+```sh
+npm run mcp:start
+```
+
+This is appropriate when the client launches the server as a local subprocess.
+
+### Start MCP over HTTP
+
+```sh
+npm run mcp:http
+```
+
+That starts a streamable HTTP MCP endpoint at:
+
+```text
+http://localhost:3001/mcp
+```
+
+Health check:
+
+```text
+http://localhost:3001/ping
+```
+
+### Recommended topology
+
+For local development:
+
+1. run the web app on `http://localhost:5173`
+2. run the MCP HTTP server on `http://localhost:3001/mcp`
+3. point your editor or assistant client at the MCP server
+
+If your client supports local stdio servers well, you can also skip the HTTP transport and let the client run `npm run mcp:start` directly.
+
+## Using TrueRef MCP in VS Code
+
+VS Code supports MCP servers through `mcp.json`.
+
+Official docs support both workspace-level `.vscode/mcp.json` and user-profile `mcp.json`.
+
+### Option A: use the HTTP transport
+
+1. Start the app:
+
+```sh
+npm run dev
+```
+
+2. Start the MCP HTTP server:
+
+```sh
+npm run mcp:http
+```
+
+3. Create `.vscode/mcp.json`:
+
+```json
+{
+  "servers": {
+    "trueref": {
+      "type": "http",
+      "url": "http://localhost:3001/mcp"
+    }
+  }
+}
+```
+
+4. In VS Code, trust and start the server when prompted.
+5. Open Chat and ask a library question that should use TrueRef.
+
+### Option B: let VS Code spawn the stdio server
+
+```json
+{
+  "servers": {
+    "trueref": {
+      "command": "npm",
+      "args": ["run", "mcp:start"],
+      "env": {
+        "TRUEREF_API_URL": "http://localhost:5173"
+      }
+    }
+  }
+}
+```
+
+Notes:
+
+- Workspace configuration is the right choice when the MCP server should run inside the project and be shared with the team.
+- User-profile configuration is better if you want one TrueRef client available across many workspaces.
+- In VS Code, you can manage server lifecycle from the Command Palette with the MCP commands.
+
+## Using TrueRef MCP in IntelliJ IDEA
+
+JetBrains AI Assistant supports MCP server connections from the IDE settings UI.
+
+Path:
+
+```text
+Settings | Tools | AI Assistant | Model Context Protocol (MCP)
+```
+
+### HTTP setup
+
+1. Start the app:
+
+```sh
+npm run dev
+```
+
+2. Start the MCP HTTP server:
+
+```sh
+npm run mcp:http
+```
+
+3. In IntelliJ IDEA, add a new MCP server and use this JSON configuration:
+
+```json
+{
+  "mcpServers": {
+    "trueref": {
+      "type": "http",
+      "url": "http://localhost:3001/mcp"
+    }
+  }
+}
+```
+
+4. Apply the configuration and verify the status shows the server as connected.
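The VS Code and IntelliJ HTTP configurations shown so far differ only in their top-level key (`servers` versus `mcpServers`). The TypeScript sketch below generates either shape, which may help when scripting editor setup. `buildMcpHttpConfig` and the `McpClient` type are hypothetical and only mirror the JSON snippets in this README.

```typescript
// Clients covered by the HTTP snippets above; names are illustrative.
type McpClient = "vscode" | "intellij";

// Hypothetical helper: returns the config object for a given client.
// VS Code expects "servers"; the IntelliJ snippet uses "mcpServers".
function buildMcpHttpConfig(client: McpClient, url: string): Record<string, unknown> {
  const entry = { trueref: { type: "http", url } };
  return client === "vscode" ? { servers: entry } : { mcpServers: entry };
}

// Pretty-print the result so it can be pasted into a config file or dialog.
function renderMcpHttpConfig(client: McpClient, url: string): string {
  return JSON.stringify(buildMcpHttpConfig(client, url), null, 2);
}
```

For example, `renderMcpHttpConfig("vscode", "http://localhost:3001/mcp")` produces text equivalent to the `.vscode/mcp.json` content shown for VS Code.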
+
+### Stdio setup
+
+If you prefer to let IntelliJ launch the server process directly, use:
+
+```json
+{
+  "mcpServers": {
+    "trueref": {
+      "command": "npm",
+      "args": ["run", "mcp:start"],
+      "env": {
+        "TRUEREF_API_URL": "http://localhost:5173"
+      }
+    }
+  }
+}
+```
+
+JetBrains also lets you set the working directory in the MCP server dialog, which is useful if you want the command to run from the TrueRef project root.
+
+## Using TrueRef MCP in Claude Code
+
+Claude Code supports both direct CLI configuration and project-level `.mcp.json` files.
+
+### Option A: add the HTTP server with the Claude CLI
+
+1. Start the app:
+
+```sh
+npm run dev
+```
+
+2. Start the MCP HTTP server:
+
+```sh
+npm run mcp:http
+```
+
+3. Register the server:
+
+```sh
+claude mcp add --transport http trueref http://localhost:3001/mcp
+```
+
+4. Inside Claude Code, run:
+
+```text
+/mcp
+```
+
+and verify that `trueref` is connected.
+
+### Option B: add the stdio server with the Claude CLI
+
+```sh
+claude mcp add --transport stdio --env TRUEREF_API_URL=http://localhost:5173 trueref -- npm run mcp:start
+```
+
+### Option C: commit a project-scoped `.mcp.json`
+
+```json
+{
+  "mcpServers": {
+    "trueref": {
+      "command": "npm",
+      "args": ["run", "mcp:start"],
+      "env": {
+        "TRUEREF_API_URL": "http://localhost:5173"
+      }
+    }
+  }
+}
+```
+
+Or, for HTTP transport:
+
+```json
+{
+  "mcpServers": {
+    "trueref": {
+      "type": "http",
+      "url": "http://localhost:3001/mcp"
+    }
+  }
+}
+```
+
+### Claude Code usage rule
+
+To make Claude consistently use TrueRef for indexed libraries, add a rule file like this to your project:
+
+```markdown
+---
+description: Use TrueRef to retrieve documentation for indexed libraries
+alwaysApply: true
+---
+
+When answering questions about indexed libraries, always use the TrueRef MCP tools:
+1. Call `resolve-library-id` with the library name and the user's question to get the library ID.
+2. Call `query-docs` with the library ID and question to retrieve relevant documentation.
+3. Use the returned documentation to answer accurately.
+
+Never rely on training data alone for library APIs that may have changed.
+```
+
+## Typical assistant workflow
+
+Whether you are using VS Code, IntelliJ, or Claude Code, the expected retrieval flow is:
+
+1. `resolve-library-id`
+   Find the correct repository or version identifier.
+2. `query-docs`
+   Retrieve the actual documentation and code snippets for the user question.
+
+Example:
+
+1. Ask: `How do I configure React hooks correctly?`
+2. Client calls `resolve-library-id` with `libraryName=react`
+3. Client picks `/facebook/react`
+4. Client calls `query-docs` with that library ID and the full question
+5. Assistant answers from retrieved snippets rather than stale model memory
+
+## Docker deployment
+
+TrueRef ships a multi-stage `Dockerfile` and a `docker-compose.yml` that run the web app and MCP HTTP server as separate services.
+
+### Quick start with Docker Compose
+
+```sh
+docker compose up --build
+```
+
+This builds the image and starts two services:
+
+| Service | Default port | Purpose |
+|---------|-------------|---------|
+| `web` | `3000` | SvelteKit web UI and REST API |
+| `mcp` | `3001` | MCP HTTP server |
+
+The SQLite database is stored in a named Docker volume (`trueref-data`) and persists across restarts.
+
+### Environment variables
+
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `DATABASE_URL` | `/data/trueref.db` | Path to the SQLite database inside the container |
+| `PORT` | `3000` | Port the web app listens on |
+| `HOST` | `0.0.0.0` | Bind address for the web app |
+| `TRUEREF_API_URL` | `http://localhost:3000` | Base URL the MCP server uses to reach the REST API |
+| `MCP_PORT` | `3001` | Port the MCP HTTP server listens on |
+
+Override them in `docker-compose.yml` or pass them with `-e` flags.
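One way to read the table above is as a set of fallbacks applied when a variable is unset. The TypeScript sketch below models that reading. `resolveDockerConfig` and `DockerRuntimeConfig` are hypothetical, and falling back to the default for an invalid port value is an assumption; TrueRef's actual startup code may behave differently.

```typescript
// Hypothetical config shape mirroring the environment variable table.
interface DockerRuntimeConfig {
  databaseUrl: string;
  port: number;
  host: string;
  truerefApiUrl: string;
  mcpPort: number;
}

// Parse a port, using the fallback when the value is unset or not a valid port.
// (Assumed behaviour for this sketch, not confirmed by the README.)
function parsePort(raw: string | undefined, fallback: number): number {
  if (raw === undefined) return fallback;
  const parsed = Number(raw);
  const valid = isFinite(parsed) && Math.floor(parsed) === parsed && parsed > 0 && parsed <= 65535;
  return valid ? parsed : fallback;
}

// Apply the documented defaults to an environment-like object.
function resolveDockerConfig(env: Record<string, string | undefined>): DockerRuntimeConfig {
  return {
    databaseUrl: env["DATABASE_URL"] ?? "/data/trueref.db",
    port: parsePort(env["PORT"], 3000),
    host: env["HOST"] ?? "0.0.0.0",
    truerefApiUrl: env["TRUEREF_API_URL"] ?? "http://localhost:3000",
    mcpPort: parsePort(env["MCP_PORT"], 3001),
  };
}
```

In a Node.js process the function could be called with `process.env`. Taking the environment as a parameter keeps the sketch easy to test without touching real variables.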
+
+### Run the web app only
+
+```sh
+docker build -t trueref .
+docker run -p 3000:3000 -v trueref-data:/data trueref
+```
+
+The entrypoint runs database migrations automatically before starting the web server.
+
+### Run the MCP HTTP server only
+
+```sh
+docker run -p 3001:3001 \
+  -e TRUEREF_API_URL=http://your-trueref-host:3000 \
+  trueref mcp
+```
+
+### Using the Docker MCP endpoint with VS Code
+
+Once both containers are running, point VS Code at the MCP HTTP endpoint:
+
+```json
+{
+  "servers": {
+    "trueref": {
+      "type": "http",
+      "url": "http://localhost:3001/mcp"
+    }
+  }
+}
+```
+
+### Using the Docker MCP endpoint with IntelliJ IDEA
+
+```json
+{
+  "mcpServers": {
+    "trueref": {
+      "type": "http",
+      "url": "http://localhost:3001/mcp"
+    }
+  }
+}
+```
+
+### Using the Docker MCP endpoint with Claude Code
+
+```sh
+claude mcp add --transport http trueref http://localhost:3001/mcp
+```
+
+Verify the connection inside Claude Code:
+
+```text
+/mcp
+```
+
+### Health checks
+
+| Endpoint | Expected response |
+|----------|------------------|
+| `http://localhost:3000/api/v1/libs` | JSON array of indexed libraries |
+| `http://localhost:3001/ping` | `{"ok":true}` |
+
+### Mounting a local repository
+
+To index a local folder from the host, mount it into the container and add it via the API:
+
+```sh
+docker run -p 3000:3000 \
+  -v trueref-data:/data \
+  -v /path/to/my-library:/repos/my-library:ro \
+  trueref
+```
+
+Then add the mounted path through the web UI or REST API using `/repos/my-library` as the source URL.
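With the folder mounted, registering it uses the same request body as the local-repository example in the REST API section, with the container path as `sourceUrl`. The TypeScript sketch below builds that request, assuming the web container is published on port `3000` as in the command above. `buildAddLocalLibraryRequest` and `AddLibraryRequest` are hypothetical names.

```typescript
// Hypothetical description of the POST /api/v1/libs request.
interface AddLibraryRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build the request for a folder mounted inside the container.
// containerPath is the path as seen by TrueRef (e.g. "/repos/my-library"),
// not the path on the host.
function buildAddLocalLibraryRequest(
  baseUrl: string,
  containerPath: string,
  title: string
): AddLibraryRequest {
  return {
    url: `${baseUrl}/api/v1/libs`,
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ source: "local", sourceUrl: containerPath, title }),
  };
}
```

On Node.js 18 or later, the result of `buildAddLocalLibraryRequest("http://localhost:3000", "/repos/my-library", "My Library")` could be sent with `fetch(request.url, request)`.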
+
+## Testing
+
+Run the test suite:
+
+```sh
+npm test
+```
+
+Run checks:
+
+```sh
+npm run check
+npm run lint
+```
+
+## Available scripts
+
+```sh
+npm run dev
+npm run build
+npm run preview
+npm run check
+npm run lint
+npm run format
+npm test
+npm run mcp:start
+npm run mcp:http
+npm run db:push
+npm run db:generate
+npm run db:migrate
+npm run db:studio
+```
+
+## Troubleshooting
+
+### `DATABASE_URL is not set`
+
+Export `DATABASE_URL` before starting the app or running any database command.
+
+### MCP server starts but returns no useful results
+
+Check that:
+
+- the web app is running
+- `TRUEREF_API_URL` points to the correct base URL
+- the target repository has already been indexed
+
+### `Library not found` from `query-docs`
+
+Run `resolve-library-id` first and use the exact returned library ID.
+
+### `Library is currently being indexed`
+
+Wait for the indexing job to finish, then retry the request.
+
+### Search quality is weak
+
+Configure an embedding provider in Settings. Without embeddings, search is keyword-only.
+
+## Security notes
+
+- Treat GitHub tokens and embedding API keys as secrets.
+- Prefer user-local or environment-based secret injection over committing credentials to MCP config files.
+- Only connect trusted MCP clients and trusted MCP servers.
+
+## License
+
+No license file is included in this repository at the moment.