# TrueRef

TrueRef is a self-hosted documentation retrieval platform for AI coding assistants.

It indexes public or private repositories, stores searchable documentation and code snippets locally, exposes a context7-compatible REST API, and ships an MCP server that lets tools like VS Code, IntelliJ, and Claude Code query your indexed libraries with the same two-step flow used by context7:

1. `resolve-library-id`
2. `query-docs`

The goal is straightforward: give your assistants accurate, current, version-aware documentation from repositories you control.

## What TrueRef does

- Indexes GitHub repositories and local folders.
- Parses Markdown and source files into searchable snippets.
- Stores metadata in SQLite.
- Supports keyword search out of the box with SQLite FTS5.
- Supports semantic and hybrid search when an embedding provider is configured.
- Exposes REST endpoints for library discovery and documentation retrieval.
- Exposes an MCP server over stdio and HTTP for AI clients.
- Provides a SvelteKit web UI for repository management, search, indexing jobs, and embedding settings.
- Supports repository-level configuration through `trueref.json` or `context7.json`.

## Project status

TrueRef is under active development. The current codebase already includes:

- repository management
- indexing jobs and recovery on restart
- local and GitHub crawling
- version registration support
- context7-compatible API endpoints
- MCP stdio and HTTP transports
- configurable embedding providers

## Architecture

TrueRef is organized into four main layers:

1. Web UI
   SvelteKit application for adding repositories, monitoring indexing, searching content, and configuring embeddings.
2. REST API
   Endpoints under `/api/v1/*` for repository management, search, schema discovery, job status, and settings.
3. Indexing pipeline
   Crawlers, parsers, chunking logic, snippet storage, and optional embedding generation.
4. MCP server
   A thin compatibility layer that forwards `resolve-library-id` and `query-docs` requests to the TrueRef REST API.

At runtime, the app uses SQLite via `better-sqlite3` and Drizzle, plus optional embedding providers for semantic retrieval.

## Tech stack

- TypeScript
- SvelteKit
- SQLite (`better-sqlite3`)
- Drizzle ORM / Drizzle Kit
- Tailwind CSS
- Vitest
- Model Context Protocol SDK (`@modelcontextprotocol/sdk`)

## Core concepts

### Libraries

Each indexed repository becomes a library with an ID such as `/facebook/react`.

### Versions

Libraries can register version tags. Queries can target a specific version by using a library ID such as `/facebook/react/v18.3.0`.

### Snippets

Documents are split into code and info snippets. These snippets are what search and MCP responses return.

### Rules

Repository rules defined in `trueref.json` are prepended to `query-docs` responses so assistants get usage constraints along with the retrieved content.

## Requirements

- Node.js 20+
- npm
- SQLite-compatible filesystem access

Optional:

- an OpenAI-compatible embedding API if you want semantic search
- `@xenova/transformers` if you want local embedding generation

## Getting started

### 1. Install dependencies

```sh
npm install
```

### 2. Configure the database

TrueRef requires `DATABASE_URL`.

Example:

```sh
export DATABASE_URL="$PWD/trueref.db"
```

You can place the same value in your shell profile or your local environment loading mechanism.

### 3. Create or migrate the schema

For a fresh local setup:

```sh
npm run db:migrate
```

During development, if you are iterating on schema changes, `db:push` is also available:

```sh
npm run db:push
```

### 4. Start the web app

```sh
npm run dev
```

By default, the app is served by Vite on `http://localhost:5173`.

## Local development workflow

Typical loop:

1. Start the app with `npm run dev`.
2. Open the UI in the browser.
3. Add a repository from the Repositories page.
4. Wait for indexing to finish.
5. Test retrieval either in the Search page, through the REST API, or through MCP.

## Web UI usage

The current UI covers three main workflows.

### Repositories

Use the main page to:

- add a GitHub repository
- add a local folder
- trigger re-indexing
- delete an indexed repository
- monitor active indexing jobs

### Search

Use the Search page to:

- search for a library by name
- query documentation for a chosen library
- inspect the text returned by the context endpoint

### Settings

Use the Settings page to configure embeddings.

Supported modes:

- `none`: keyword search only
- `openai`: any OpenAI-compatible embeddings endpoint
- `local`: local ONNX model via `@xenova/transformers`

If no embedding provider is configured, TrueRef still works with FTS5-only search.

## Repository configuration

TrueRef supports a repository-local config file named `trueref.json`.

For compatibility with existing context7-style repositories, `context7.json` is also supported.

### What the config controls

- project display title
- project description
- included folders
- excluded folders
- excluded file names
- assistant-facing usage rules
- previously released versions

### Example `trueref.json`

```json
{
	"$schema": "http://localhost:5173/api/v1/schema/trueref-config.json",
	"projectTitle": "My Internal SDK",
	"description": "Internal SDK for billing, auth, and event ingestion.",
	"folders": ["src/", "docs/"],
	"excludeFolders": ["tests/", "fixtures/", "node_modules/"],
	"excludeFiles": ["CHANGELOG.md"],
	"rules": [
		"Prefer named imports over wildcard imports.",
		"Use the async client API for all network calls."
	],
	"previousVersions": [
		{
			"tag": "v1.2.3",
			"title": "Version 1.2.3"
		}
	]
}
```

### JSON schema

You can point your editor to the live schema served by TrueRef:

```text
http://localhost:5173/api/v1/schema/trueref-config.json
```

That enables validation and autocomplete in editors that support JSON Schema references.

## REST API

TrueRef exposes a context7-compatible API under `/api/v1`.

### Library search

Find candidate library IDs.

```http
GET /api/v1/libs/search?libraryName=react&query=hooks&type=json
```

Example:

```sh
curl "http://localhost:5173/api/v1/libs/search?libraryName=react&query=hooks&type=json"
```

### Documentation retrieval

Fetch snippets for a specific library ID.

```http
GET /api/v1/context?libraryId=/facebook/react&query=how%20to%20use%20useEffect&type=txt
```

Example:

```sh
curl "http://localhost:5173/api/v1/context?libraryId=/facebook/react&query=how%20to%20use%20useEffect&type=txt"
```

### Repository management

List repositories:

```sh
curl "http://localhost:5173/api/v1/libs"
```

Add a local repository:

```sh
curl -X POST "http://localhost:5173/api/v1/libs" \
	-H "content-type: application/json" \
	-d '{
		"source": "local",
		"sourceUrl": "/absolute/path/to/my-library",
		"title": "My Library"
	}'
```

Add a GitHub repository:

```sh
curl -X POST "http://localhost:5173/api/v1/libs" \
	-H "content-type: application/json" \
	-d '{
		"source": "github",
		"sourceUrl": "https://github.com/facebook/react",
		"title": "React"
	}'
```

Trigger re-indexing:

```sh
curl -X POST "http://localhost:5173/api/v1/libs/%2Ffacebook%2Freact/index"
```

Check job status:

```sh
curl "http://localhost:5173/api/v1/jobs"
curl "http://localhost:5173/api/v1/jobs/<job-id>"
```

### Response formats

The two search endpoints support:

- `type=json` for structured consumption
- `type=txt` for direct LLM injection

## Search behavior

### Without embeddings

TrueRef uses keyword search with SQLite FTS5.

This is the simplest setup and requires no external model or API key.

### With embeddings

TrueRef generates embeddings for snippets and uses hybrid retrieval:

- vector similarity
- keyword matching
- ranking fusion

This improves recall for conceptual and natural-language questions.

## Embedding configuration

Embeddings are configured through the UI or through the embedding settings API.

### OpenAI-compatible provider

Provide:

- base URL
- API key
- model name
- optional dimensions override

### Local provider

Install:

```sh
npm install @xenova/transformers
```

Then select `local` in Settings.

### Disable embeddings

Set the provider to `none` to use FTS5-only retrieval.

## MCP server

TrueRef includes an MCP server in `src/mcp/index.ts`.

It exposes two tools:

- `resolve-library-id`
- `query-docs`

The tool names and argument shapes intentionally mirror context7 so existing workflows can switch over with minimal changes.

### Environment variables

The MCP server uses:

- `TRUEREF_API_URL`
  Base URL of the TrueRef web app. Default: `http://localhost:5173`
- `PORT`
  Used only for HTTP transport. Default: `3001`

### Start MCP over stdio

```sh
npm run mcp:start
```

This is appropriate when the client launches the server as a local subprocess.

### Start MCP over HTTP

```sh
npm run mcp:http
```

That starts a streamable HTTP MCP endpoint at:

```text
http://localhost:3001/mcp
```

Health check:

```text
http://localhost:3001/ping
```

### Recommended topology

For local development:

1. run the web app on `http://localhost:5173`
2. run the MCP HTTP server on `http://localhost:3001/mcp`
3. point your editor or assistant client at the MCP server

If your client supports local stdio servers well, you can also skip the HTTP transport and let the client run `npm run mcp:start` directly.

## Using TrueRef MCP in VS Code

VS Code supports MCP servers through `mcp.json`.

Official docs support both workspace-level `.vscode/mcp.json` and user-profile `mcp.json`.

### Option A: use the HTTP transport

1. Start the app:

```sh
npm run dev
```

2. Start the MCP HTTP server:

```sh
npm run mcp:http
```

3. Create `.vscode/mcp.json`:

```json
{
	"servers": {
		"trueref": {
			"type": "http",
			"url": "http://localhost:3001/mcp"
		}
	}
}
```

4. In VS Code, trust and start the server when prompted.
5. Open Chat and ask a library question that should use TrueRef.

### Option B: let VS Code spawn the stdio server

```json
{
	"servers": {
		"trueref": {
			"command": "npm",
			"args": ["run", "mcp:start"],
			"env": {
				"TRUEREF_API_URL": "http://localhost:5173"
			}
		}
	}
}
```

Notes:

- Workspace configuration is the right choice when the MCP server should run inside the project and be shared with the team.
- User-profile configuration is better if you want one TrueRef client available across many workspaces.
- In VS Code, you can manage server lifecycle from the Command Palette with the MCP commands.

## Using TrueRef MCP in IntelliJ IDEA

JetBrains AI Assistant supports MCP server connections from the IDE settings UI.

Path:

```text
Settings | Tools | AI Assistant | Model Context Protocol (MCP)
```

### HTTP setup

1. Start the app:

```sh
npm run dev
```

2. Start the MCP HTTP server:

```sh
npm run mcp:http
```

3. In IntelliJ IDEA, add a new MCP server and use this JSON configuration:

```json
{
	"mcpServers": {
		"trueref": {
			"type": "http",
			"url": "http://localhost:3001/mcp"
		}
	}
}
```

4. Apply the configuration and verify the status shows the server as connected.

### Stdio setup

If you prefer to let IntelliJ launch the server process directly, use:

```json
{
	"mcpServers": {
		"trueref": {
			"command": "npm",
			"args": ["run", "mcp:start"],
			"env": {
				"TRUEREF_API_URL": "http://localhost:5173"
			}
		}
	}
}
```

JetBrains also lets you set the working directory in the MCP server dialog, which is useful if you want the command to run from the TrueRef project root.

## Using TrueRef MCP in Claude Code

Claude Code supports both direct CLI configuration and project-level `.mcp.json` files.

### Option A: add the HTTP server with the Claude CLI

1. Start the app:

```sh
npm run dev
```

2. Start the MCP HTTP server:

```sh
npm run mcp:http
```

3. Register the server:

```sh
claude mcp add --transport http trueref http://localhost:3001/mcp
```

4. Inside Claude Code, run:

```text
/mcp
```

and verify that `trueref` is connected.

### Option B: add the stdio server with the Claude CLI

```sh
claude mcp add --transport stdio --env TRUEREF_API_URL=http://localhost:5173 trueref -- npm run mcp:start
```

### Option C: commit a project-scoped `.mcp.json`

```json
{
	"mcpServers": {
		"trueref": {
			"command": "npm",
			"args": ["run", "mcp:start"],
			"env": {
				"TRUEREF_API_URL": "http://localhost:5173"
			}
		}
	}
}
```

Or, for HTTP transport:

```json
{
	"mcpServers": {
		"trueref": {
			"type": "http",
			"url": "http://localhost:3001/mcp"
		}
	}
}
```

### Claude Code usage rule

To make Claude consistently use TrueRef for indexed libraries, add a rule file like this to your project:

```markdown
---
description: Use TrueRef to retrieve documentation for indexed libraries
alwaysApply: true
---

When answering questions about indexed libraries, always use the TrueRef MCP tools:

1. Call `resolve-library-id` with the library name and the user's question to get the library ID.
2. Call `query-docs` with the library ID and question to retrieve relevant documentation.
3. Use the returned documentation to answer accurately.

Never rely on training data alone for library APIs that may have changed.
```

## Typical assistant workflow

Whether you are using VS Code, IntelliJ, or Claude Code, the expected retrieval flow is:

1. `resolve-library-id`
   Find the correct repository or version identifier.
2. `query-docs`
   Retrieve the actual documentation and code snippets for the user question.

Example:

1. Ask: `How do I configure React hooks correctly?`
2. Client calls `resolve-library-id` with `libraryName=react`
3. Client picks `/facebook/react`
4. Client calls `query-docs` with that library ID and the full question
5. Assistant answers from retrieved snippets rather than stale model memory

## Docker deployment

TrueRef ships a multi-stage `Dockerfile` and a `docker-compose.yml` that run the web app and MCP HTTP server as separate services.

### Quick start with Docker Compose

```sh
docker compose up --build
```

This builds the image and starts two services:

| Service | Default port | Purpose                       |
| ------- | ------------ | ----------------------------- |
| `web`   | `3000`       | SvelteKit web UI and REST API |
| `mcp`   | `3001`       | MCP HTTP server               |

The SQLite database is stored in a named Docker volume (`trueref-data`) and persists across restarts.

### Corporate deployment

TrueRef supports deployment in corporate environments with private git repositories hosted on Bitbucket Server/Data Center and self-hosted GitLab instances (TRUEREF-0019).

#### Required setup

1. **Export your corporate CA certificate** (Windows):
   - Open `certmgr.msc` → Trusted Root Certification Authorities
   - Right-click your corporate CA → All Tasks → Export
   - Choose Base64-encoded X.509 (.CER) format
   - Save to a known location (e.g., `C:\certs\corp-ca.crt`)

2. **Generate personal access tokens**:
   - Bitbucket Server: Settings → HTTP access tokens (requires `REPO_READ` permission)
   - GitLab: User Settings → Access Tokens (requires `read_repository` scope)

3. **Update `.env` file**:

```env
# Corporate CA certificate path (PEM or DER — auto-detected)
CORP_CA_CERT=C:/path/to/corp-ca.crt

# Git remote hostnames (without https://)
BITBUCKET_HOST=bitbucket.corp.example.com
GITLAB_HOST=gitlab.corp.example.com

# Personal access tokens (NEVER commit these)
GIT_TOKEN_BITBUCKET=your-bitbucket-token-here
GIT_TOKEN_GITLAB=your-gitlab-token-here
```

4. **Uncomment volume mounts in `docker-compose.yml`**:

```yaml
services:
  web:
    volumes:
      - trueref-data:/data
      - ${USERPROFILE:-$HOME}/.ssh:/root/.ssh:ro
      - ${USERPROFILE:-$HOME}/.gitconfig:/root/.gitconfig:ro
      - ${CORP_CA_CERT}:/certs/corp-ca.crt:ro
    environment:
      BITBUCKET_HOST: '${BITBUCKET_HOST}'
      GITLAB_HOST: '${GITLAB_HOST}'
      GIT_TOKEN_BITBUCKET: '${GIT_TOKEN_BITBUCKET}'
      GIT_TOKEN_GITLAB: '${GIT_TOKEN_GITLAB}'
```

5. **Start the services**:

```sh
docker compose up --build
```

#### How it works

The Docker entrypoint script (`docker-entrypoint.sh`) runs these steps in order:

1. **Trust corporate CA**: Detects PEM/DER format and installs the certificate at the OS level so git, curl, and Node.js fetch all trust it automatically.
2. **Fix SSH key permissions**: Corrects world-readable permissions from Windows NTFS mounts so SSH works properly.
3. **Configure git credentials**: Sets up per-host credential helpers that provide the correct username and token for each remote.

This setup works for:

- HTTPS cloning with personal access tokens
- SSH cloning with mounted SSH keys
- On-premise servers with custom CA certificates
- Mixed environments (multiple git remotes with different credentials)

#### SSH authentication (alternative to HTTPS)

For long-lived deployments, SSH authentication is recommended:

1. Generate an SSH key pair if you don't have one:

   ```sh
   ssh-keygen -t ed25519 -C "trueref@your-company.com"
   ```

2. Add the public key to your git hosting service:
   - Bitbucket: Settings → SSH keys
   - GitLab: User Settings → SSH Keys

3. Ensure your `~/.ssh/config` has the correct host entries:

   ```
   Host bitbucket.corp.example.com
     IdentityFile ~/.ssh/id_ed25519
     User git
   ```

4. The Docker Compose configuration already mounts `~/.ssh` read-only — no additional changes needed.

### Environment variables

| Variable          | Default                 | Description                                        |
| ----------------- | ----------------------- | -------------------------------------------------- |
| `DATABASE_URL`    | `/data/trueref.db`      | Path to the SQLite database inside the container   |
| `PORT`            | `3000`                  | Port the web app listens on                        |
| `HOST`            | `0.0.0.0`               | Bind address for the web app                       |
| `TRUEREF_API_URL` | `http://localhost:3000` | Base URL the MCP server uses to reach the REST API |
| `MCP_PORT`        | `3001`                  | Port the MCP HTTP server listens on                |

Override them in `docker-compose.yml` or pass them with `-e` flags.

### Run the web app only

```sh
docker build -t trueref .
docker run -p 3000:3000 -v trueref-data:/data trueref
```

The entrypoint runs database migrations automatically before starting the web server.

### Run the MCP HTTP server only

```sh
docker run -p 3001:3001 \
  -e TRUEREF_API_URL=http://your-trueref-host:3000 \
  trueref mcp
```

### Using the Docker MCP endpoint with VS Code

Once both containers are running, point VS Code at the MCP HTTP endpoint:

```json
{
	"servers": {
		"trueref": {
			"type": "http",
			"url": "http://localhost:3001/mcp"
		}
	}
}
```

### Using the Docker MCP endpoint with IntelliJ IDEA

```json
{
	"mcpServers": {
		"trueref": {
			"type": "http",
			"url": "http://localhost:3001/mcp"
		}
	}
}
```

### Using the Docker MCP endpoint with Claude Code

```sh
claude mcp add --transport http trueref http://localhost:3001/mcp
```

Verify the connection inside Claude Code:

```text
/mcp
```

### Health checks

| Endpoint                            | Expected response               |
| ----------------------------------- | ------------------------------- |
| `http://localhost:3000/api/v1/libs` | JSON array of indexed libraries |
| `http://localhost:3001/ping`        | `{"ok":true}`                   |

### Mounting a local repository

To index a local folder from the host, mount it into the container and add it via the API:

```sh
docker run -p 3000:3000 \
  -v trueref-data:/data \
  -v /path/to/my-library:/repos/my-library:ro \
  trueref
```

Then add the mounted path through the web UI or REST API using `/repos/my-library` as the source URL.

## Testing

Run the test suite:

```sh
npm test
```

Run checks:

```sh
npm run check
npm run lint
```

## Available scripts

```sh
npm run dev
npm run build
npm run preview
npm run check
npm run lint
npm run format
npm test
npm run mcp:start
npm run mcp:http
npm run db:push
npm run db:generate
npm run db:migrate
npm run db:studio
```

## Troubleshooting

### `DATABASE_URL is not set`

Export `DATABASE_URL` before starting the app or running any database command.

### MCP server starts but returns no useful results

Check that:

- the web app is running
- `TRUEREF_API_URL` points to the correct base URL
- the target repository has already been indexed

### `Library not found` from `query-docs`

Run `resolve-library-id` first and use the exact returned library ID.

### `Library is currently being indexed`

Wait for the indexing job to finish, then retry the request.

### Search quality is weak

Configure an embedding provider in Settings. Without embeddings, search is keyword-only.

## Security notes

- Treat GitHub tokens and embedding API keys as secrets.
- Prefer user-local or environment-based secret injection over committing credentials to MCP config files.
- Only connect trusted MCP clients and trusted MCP servers.

## License

No license file is included in this repository at the moment.