Files
trueref/README.md
Giancarmine Salucci 5c738be624 docs: add Docker deployment section to README
Documents the full Docker deployment workflow including docker compose
quickstart, environment variable reference, web-only and MCP-only run
modes, health check endpoints, IDE MCP config snippets (VS Code,
IntelliJ, Claude Code), and mounting local repositories as read-only
volumes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-25 14:30:18 +01:00

805 lines
17 KiB
Markdown

# TrueRef
TrueRef is a self-hosted documentation retrieval platform for AI coding assistants.
It indexes public or private repositories, stores searchable documentation and code snippets locally, exposes a context7-compatible REST API, and ships an MCP server that lets tools like VS Code, IntelliJ, and Claude Code query your indexed libraries with the same two-step flow used by context7:
1. `resolve-library-id`
2. `query-docs`
The goal is straightforward: give your assistants accurate, current, version-aware documentation from repositories you control.
## What TrueRef does
- Indexes GitHub repositories and local folders.
- Parses Markdown and source files into searchable snippets.
- Stores metadata in SQLite.
- Supports keyword search out of the box with SQLite FTS5.
- Supports semantic and hybrid search when an embedding provider is configured.
- Exposes REST endpoints for library discovery and documentation retrieval.
- Exposes an MCP server over stdio and HTTP for AI clients.
- Provides a SvelteKit web UI for repository management, search, indexing jobs, and embedding settings.
- Supports repository-level configuration through `trueref.json` or `context7.json`.
## Project status
TrueRef is under active development. The current codebase already includes:
- repository management
- indexing jobs and recovery on restart
- local and GitHub crawling
- version registration support
- context7-compatible API endpoints
- MCP stdio and HTTP transports
- configurable embedding providers
## Architecture
TrueRef is organized into four main layers:
1. Web UI
SvelteKit application for adding repositories, monitoring indexing, searching content, and configuring embeddings.
2. REST API
Endpoints under `/api/v1/*` for repository management, search, schema discovery, job status, and settings.
3. Indexing pipeline
Crawlers, parsers, chunking logic, snippet storage, and optional embedding generation.
4. MCP server
A thin compatibility layer that forwards `resolve-library-id` and `query-docs` requests to the TrueRef REST API.
At runtime, the app uses SQLite via `better-sqlite3` and Drizzle, plus optional embedding providers for semantic retrieval.
## Tech stack
- TypeScript
- SvelteKit
- SQLite (`better-sqlite3`)
- Drizzle ORM / Drizzle Kit
- Tailwind CSS
- Vitest
- Model Context Protocol SDK (`@modelcontextprotocol/sdk`)
## Core concepts
### Libraries
Each indexed repository becomes a library with an ID such as `/facebook/react`.
### Versions
Libraries can register version tags. Queries can target a specific version by using a library ID such as `/facebook/react/v18.3.0`.
### Snippets
Documents are split into code and info snippets. These snippets are what search and MCP responses return.
### Rules
Repository rules defined in `trueref.json` are prepended to `query-docs` responses so assistants get usage constraints along with the retrieved content.
## Requirements
- Node.js 20+
- npm
- SQLite-compatible filesystem access
Optional:
- an OpenAI-compatible embedding API if you want semantic search
- `@xenova/transformers` if you want local embedding generation
## Getting started
### 1. Install dependencies
```sh
npm install
```
### 2. Configure the database
TrueRef requires `DATABASE_URL`.
Example:
```sh
export DATABASE_URL="$PWD/trueref.db"
```
You can place the same value in your shell profile or your local environment loading mechanism.
### 3. Create or migrate the schema
For a fresh local setup:
```sh
npm run db:migrate
```
During development, if you are iterating on schema changes, `db:push` is also available:
```sh
npm run db:push
```
### 4. Start the web app
```sh
npm run dev
```
By default, the app is served by Vite on `http://localhost:5173`.
## Local development workflow
Typical loop:
1. Start the app with `npm run dev`.
2. Open the UI in the browser.
3. Add a repository from the Repositories page.
4. Wait for indexing to finish.
5. Test retrieval either in the Search page, through the REST API, or through MCP.
## Web UI usage
The current UI covers three main workflows.
### Repositories
Use the main page to:
- add a GitHub repository
- add a local folder
- trigger re-indexing
- delete an indexed repository
- monitor active indexing jobs
### Search
Use the Search page to:
- search for a library by name
- query documentation for a chosen library
- inspect the text returned by the context endpoint
### Settings
Use the Settings page to configure embeddings.
Supported modes:
- `none`: keyword search only
- `openai`: any OpenAI-compatible embeddings endpoint
- `local`: local ONNX model via `@xenova/transformers`
If no embedding provider is configured, TrueRef still works with FTS5-only search.
## Repository configuration
TrueRef supports a repository-local config file named `trueref.json`.
For compatibility with existing context7-style repositories, `context7.json` is also supported.
### What the config controls
- project display title
- project description
- included folders
- excluded folders
- excluded file names
- assistant-facing usage rules
- previously released versions
### Example `trueref.json`
```json
{
"$schema": "http://localhost:5173/api/v1/schema/trueref-config.json",
"projectTitle": "My Internal SDK",
"description": "Internal SDK for billing, auth, and event ingestion.",
"folders": ["src/", "docs/"],
"excludeFolders": ["tests/", "fixtures/", "node_modules/"],
"excludeFiles": ["CHANGELOG.md"],
"rules": [
"Prefer named imports over wildcard imports.",
"Use the async client API for all network calls."
],
"previousVersions": [
{
"tag": "v1.2.3",
"title": "Version 1.2.3"
}
]
}
```
### JSON schema
You can point your editor to the live schema served by TrueRef:
```text
http://localhost:5173/api/v1/schema/trueref-config.json
```
That enables validation and autocomplete in editors that support JSON Schema references.
## REST API
TrueRef exposes a context7-compatible API under `/api/v1`.
### Library search
Find candidate library IDs.
```http
GET /api/v1/libs/search?libraryName=react&query=hooks&type=json
```
Example:
```sh
curl "http://localhost:5173/api/v1/libs/search?libraryName=react&query=hooks&type=json"
```
### Documentation retrieval
Fetch snippets for a specific library ID.
```http
GET /api/v1/context?libraryId=/facebook/react&query=how%20to%20use%20useEffect&type=txt
```
Example:
```sh
curl "http://localhost:5173/api/v1/context?libraryId=/facebook/react&query=how%20to%20use%20useEffect&type=txt"
```
### Repository management
List repositories:
```sh
curl "http://localhost:5173/api/v1/libs"
```
Add a local repository:
```sh
curl -X POST "http://localhost:5173/api/v1/libs" \
-H "content-type: application/json" \
-d '{
"source": "local",
"sourceUrl": "/absolute/path/to/my-library",
"title": "My Library"
}'
```
Add a GitHub repository:
```sh
curl -X POST "http://localhost:5173/api/v1/libs" \
-H "content-type: application/json" \
-d '{
"source": "github",
"sourceUrl": "https://github.com/facebook/react",
"title": "React"
}'
```
Trigger re-indexing:
```sh
curl -X POST "http://localhost:5173/api/v1/libs/%2Ffacebook%2Freact/index"
```
Check job status:
```sh
curl "http://localhost:5173/api/v1/jobs"
curl "http://localhost:5173/api/v1/jobs/<job-id>"
```
### Response formats
The two search endpoints support:
- `type=json` for structured consumption
- `type=txt` for direct LLM injection
## Search behavior
### Without embeddings
TrueRef uses keyword search with SQLite FTS5.
This is the simplest setup and requires no external model or API key.
### With embeddings
TrueRef generates embeddings for snippets and uses hybrid retrieval:
- vector similarity
- keyword matching
- ranking fusion
This improves recall for conceptual and natural-language questions.
## Embedding configuration
Embeddings are configured through the UI or through the embedding settings API.
### OpenAI-compatible provider
Provide:
- base URL
- API key
- model name
- optional dimensions override
### Local provider
Install:
```sh
npm install @xenova/transformers
```
Then select `local` in Settings.
### Disable embeddings
Set the provider to `none` to use FTS5-only retrieval.
## MCP server
TrueRef includes an MCP server in `src/mcp/index.ts`.
It exposes two tools:
- `resolve-library-id`
- `query-docs`
The tool names and argument shapes intentionally mirror context7 so existing workflows can switch over with minimal changes.
### Environment variables
The MCP server uses:
- `TRUEREF_API_URL`
Base URL of the TrueRef web app. Default: `http://localhost:5173`
- `PORT`
Used only for HTTP transport. Default: `3001`
### Start MCP over stdio
```sh
npm run mcp:start
```
This is appropriate when the client launches the server as a local subprocess.
### Start MCP over HTTP
```sh
npm run mcp:http
```
That starts a streamable HTTP MCP endpoint at:
```text
http://localhost:3001/mcp
```
Health check:
```text
http://localhost:3001/ping
```
### Recommended topology
For local development:
1. run the web app on `http://localhost:5173`
2. run the MCP HTTP server on `http://localhost:3001/mcp`
3. point your editor or assistant client at the MCP server
If your client supports local stdio servers well, you can also skip the HTTP transport and let the client run `npm run mcp:start` directly.
## Using TrueRef MCP in VS Code
VS Code supports MCP servers through `mcp.json`.
Official docs support both workspace-level `.vscode/mcp.json` and user-profile `mcp.json`.
### Option A: use the HTTP transport
1. Start the app:
```sh
npm run dev
```
2. Start the MCP HTTP server:
```sh
npm run mcp:http
```
3. Create `.vscode/mcp.json`:
```json
{
"servers": {
"trueref": {
"type": "http",
"url": "http://localhost:3001/mcp"
}
}
}
```
4. In VS Code, trust and start the server when prompted.
5. Open Chat and ask a library question that should use TrueRef.
### Option B: let VS Code spawn the stdio server
```json
{
"servers": {
"trueref": {
"command": "npm",
"args": ["run", "mcp:start"],
"env": {
"TRUEREF_API_URL": "http://localhost:5173"
}
}
}
}
```
Notes:
- Workspace configuration is the right choice when the MCP server should run inside the project and be shared with the team.
- User-profile configuration is better if you want one TrueRef client available across many workspaces.
- In VS Code, you can manage server lifecycle from the Command Palette with the MCP commands.
## Using TrueRef MCP in IntelliJ IDEA
JetBrains AI Assistant supports MCP server connections from the IDE settings UI.
Path:
```text
Settings | Tools | AI Assistant | Model Context Protocol (MCP)
```
### HTTP setup
1. Start the app:
```sh
npm run dev
```
2. Start the MCP HTTP server:
```sh
npm run mcp:http
```
3. In IntelliJ IDEA, add a new MCP server and use this JSON configuration:
```json
{
"mcpServers": {
"trueref": {
"type": "http",
"url": "http://localhost:3001/mcp"
}
}
}
```
4. Apply the configuration and verify the status shows the server as connected.
### Stdio setup
If you prefer to let IntelliJ launch the server process directly, use:
```json
{
"mcpServers": {
"trueref": {
"command": "npm",
"args": ["run", "mcp:start"],
"env": {
"TRUEREF_API_URL": "http://localhost:5173"
}
}
}
}
```
JetBrains also lets you set the working directory in the MCP server dialog, which is useful if you want the command to run from the TrueRef project root.
## Using TrueRef MCP in Claude Code
Claude Code supports both direct CLI configuration and project-level `.mcp.json` files.
### Option A: add the HTTP server with the Claude CLI
1. Start the app:
```sh
npm run dev
```
2. Start the MCP HTTP server:
```sh
npm run mcp:http
```
3. Register the server:
```sh
claude mcp add --transport http trueref http://localhost:3001/mcp
```
4. Inside Claude Code, run:
```text
/mcp
```
and verify that `trueref` is connected.
### Option B: add the stdio server with the Claude CLI
```sh
claude mcp add --transport stdio --env TRUEREF_API_URL=http://localhost:5173 trueref -- npm run mcp:start
```
### Option C: commit a project-scoped `.mcp.json`
```json
{
"mcpServers": {
"trueref": {
"command": "npm",
"args": ["run", "mcp:start"],
"env": {
"TRUEREF_API_URL": "http://localhost:5173"
}
}
}
}
```
Or, for HTTP transport:
```json
{
"mcpServers": {
"trueref": {
"type": "http",
"url": "http://localhost:3001/mcp"
}
}
}
```
### Claude Code usage rule
To make Claude consistently use TrueRef for indexed libraries, add a rule file like this to your project:
```markdown
---
description: Use TrueRef to retrieve documentation for indexed libraries
alwaysApply: true
---
When answering questions about indexed libraries, always use the TrueRef MCP tools:
1. Call `resolve-library-id` with the library name and the user's question to get the library ID.
2. Call `query-docs` with the library ID and question to retrieve relevant documentation.
3. Use the returned documentation to answer accurately.
Never rely on training data alone for library APIs that may have changed.
```
## Typical assistant workflow
Whether you are using VS Code, IntelliJ, or Claude Code, the expected retrieval flow is:
1. `resolve-library-id`
Find the correct repository or version identifier.
2. `query-docs`
Retrieve the actual documentation and code snippets for the user question.
Example:
1. Ask: `How do I configure React hooks correctly?`
2. Client calls `resolve-library-id` with `libraryName=react`
3. Client picks `/facebook/react`
4. Client calls `query-docs` with that library ID and the full question
5. Assistant answers from retrieved snippets rather than stale model memory
## Docker deployment
TrueRef ships a multi-stage `Dockerfile` and a `docker-compose.yml` that run the web app and MCP HTTP server as separate services.
### Quick start with Docker Compose
```sh
docker compose up --build
```
This builds the image and starts two services:
| Service | Default port | Purpose |
|---------|-------------|---------|
| `web` | `3000` | SvelteKit web UI and REST API |
| `mcp` | `3001` | MCP HTTP server |
The SQLite database is stored in a named Docker volume (`trueref-data`) and persists across restarts.
### Environment variables
| Variable | Default | Description |
|----------|---------|-------------|
| `DATABASE_URL` | `/data/trueref.db` | Path to the SQLite database inside the container |
| `PORT` | `3000` | Port the web app listens on |
| `HOST` | `0.0.0.0` | Bind address for the web app |
| `TRUEREF_API_URL` | `http://localhost:3000` | Base URL the MCP server uses to reach the REST API |
| `MCP_PORT` | `3001` | Port the MCP HTTP server listens on |
Override them in `docker-compose.yml` or pass them with `-e` flags.
### Run the web app only
```sh
docker build -t trueref .
docker run -p 3000:3000 -v trueref-data:/data trueref
```
The entrypoint runs database migrations automatically before starting the web server.
### Run the MCP HTTP server only
```sh
docker run -p 3001:3001 \
-e TRUEREF_API_URL=http://your-trueref-host:3000 \
trueref mcp
```
### Using the Docker MCP endpoint with VS Code
Once both containers are running, point VS Code at the MCP HTTP endpoint:
```json
{
"servers": {
"trueref": {
"type": "http",
"url": "http://localhost:3001/mcp"
}
}
}
```
### Using the Docker MCP endpoint with IntelliJ IDEA
```json
{
"mcpServers": {
"trueref": {
"type": "http",
"url": "http://localhost:3001/mcp"
}
}
}
```
### Using the Docker MCP endpoint with Claude Code
```sh
claude mcp add --transport http trueref http://localhost:3001/mcp
```
Verify the connection inside Claude Code:
```text
/mcp
```
### Health checks
| Endpoint | Expected response |
|----------|------------------|
| `http://localhost:3000/api/v1/libs` | JSON array of indexed libraries |
| `http://localhost:3001/ping` | `{"ok":true}` |
### Mounting a local repository
To index a local folder from the host, mount it into the container and add it via the API:
```sh
docker run -p 3000:3000 \
-v trueref-data:/data \
-v /path/to/my-library:/repos/my-library:ro \
trueref
```
Then add the mounted path through the web UI or REST API using `/repos/my-library` as the source URL.
## Testing
Run the test suite:
```sh
npm test
```
Run checks:
```sh
npm run check
npm run lint
```
## Available scripts
```sh
npm run dev
npm run build
npm run preview
npm run check
npm run lint
npm run format
npm test
npm run mcp:start
npm run mcp:http
npm run db:push
npm run db:generate
npm run db:migrate
npm run db:studio
```
## Troubleshooting
### `DATABASE_URL is not set`
Export `DATABASE_URL` before starting the app or running any database command.
### MCP server starts but returns no useful results
Check that:
- the web app is running
- `TRUEREF_API_URL` points to the correct base URL
- the target repository has already been indexed
### `Library not found` from `query-docs`
Run `resolve-library-id` first and use the exact returned library ID.
### `Library is currently being indexed`
Wait for the indexing job to finish, then retry the request.
### Search quality is weak
Configure an embedding provider in Settings. Without embeddings, search is keyword-only.
## Security notes
- Treat GitHub tokens and embedding API keys as secrets.
- Prefer user-local or environment-based secret injection over committing credentials to MCP config files.
- Only connect trusted MCP clients and trusted MCP servers.
## License
No license file is included in this repository at the moment.