chore(FEEDBACK-0001): linting
README.md (78 changed lines)
@@ -38,13 +38,13 @@ TrueRef is under active development. The current codebase already includes:

TrueRef is organized into four main layers:

1. Web UI
   SvelteKit application for adding repositories, monitoring indexing, searching content, and configuring embeddings.
2. REST API
   Endpoints under `/api/v1/*` for repository management, search, schema discovery, job status, and settings.
3. Indexing pipeline
   Crawlers, parsers, chunking logic, snippet storage, and optional embedding generation.
4. MCP server
   A thin compatibility layer that forwards `resolve-library-id` and `query-docs` requests to the TrueRef REST API.

At runtime, the app uses SQLite via `better-sqlite3` and Drizzle, plus optional embedding providers for semantic retrieval.

@@ -367,9 +367,9 @@ The tool names and argument shapes intentionally mirror context7 so existing wor

The MCP server uses:

- `TRUEREF_API_URL`
  Base URL of the TrueRef web app. Default: `http://localhost:5173`
- `PORT`
  Used only for HTTP transport. Default: `3001`

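Taken together, the defaults above can be sketched as a tiny config resolver. This is a hypothetical helper for illustration, not the actual server code; only the two variable names and their defaults come from the docs.

```typescript
// Hypothetical sketch: resolve the two documented variables with their defaults.
// TRUEREF_API_URL and PORT are the documented names; resolveMcpConfig is illustrative.
function resolveMcpConfig(env: Record<string, string | undefined>) {
	return {
		apiUrl: env.TRUEREF_API_URL ?? "http://localhost:5173",
		port: Number(env.PORT ?? "3001"),
	};
}
```

With an empty environment this yields the documented defaults; set either variable to override.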
### Start MCP over stdio

@@ -602,6 +602,7 @@ alwaysApply: true
---

When answering questions about indexed libraries, always use the TrueRef MCP tools:

1. Call `resolve-library-id` with the library name and the user's question to get the library ID.
2. Call `query-docs` with the library ID and question to retrieve relevant documentation.
3. Use the returned documentation to answer accurately.
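The three steps above can be sketched as one helper. The tool names are the real ones; the argument keys and the `callTool` signature are assumptions made for illustration only.

```typescript
// Hypothetical sketch of the documented flow. Argument shapes are illustrative;
// only the tool names (resolve-library-id, query-docs) come from the docs.
type ToolCaller = (name: string, args: Record<string, string>) => Promise<Record<string, any>>;

async function answerWithTrueRef(library: string, question: string, callTool: ToolCaller) {
	// Step 1: resolve the library name to an ID.
	const resolved = await callTool("resolve-library-id", { library, question });
	// Step 2: fetch documentation for that ID.
	return callTool("query-docs", { libraryId: String(resolved.libraryId), question });
}
```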
@@ -614,9 +615,9 @@ Never rely on training data alone for library APIs that may have changed.
Whether you are using VS Code, IntelliJ, or Claude Code, the expected retrieval flow is:

1. `resolve-library-id`
   Find the correct repository or version identifier.
2. `query-docs`
   Retrieve the actual documentation and code snippets for the user question.

Example:

@@ -638,10 +639,10 @@ docker compose up --build

This builds the image and starts two services:

| Service | Default port | Purpose                       |
| ------- | ------------ | ----------------------------- |
| `web`   | `3000`       | SvelteKit web UI and REST API |
| `mcp`   | `3001`       | MCP HTTP server               |

The SQLite database is stored in a named Docker volume (`trueref-data`) and persists across restarts.

@@ -687,10 +688,10 @@ services:
      - ${USERPROFILE:-$HOME}/.gitconfig:/root/.gitconfig:ro
      - ${CORP_CA_CERT}:/certs/corp-ca.crt:ro
    environment:
      BITBUCKET_HOST: '${BITBUCKET_HOST}'
      GITLAB_HOST: '${GITLAB_HOST}'
      GIT_TOKEN_BITBUCKET: '${GIT_TOKEN_BITBUCKET}'
      GIT_TOKEN_GITLAB: '${GIT_TOKEN_GITLAB}'
```

5. **Start the services**:

@@ -708,6 +709,7 @@ The Docker entrypoint script (`docker-entrypoint.sh`) runs these steps in order:
3. **Configure git credentials**: Sets up per-host credential helpers that provide the correct username and token for each remote.

This setup works for:

- HTTPS cloning with personal access tokens
- SSH cloning with mounted SSH keys
- On-premise servers with custom CA certificates
@@ -718,6 +720,7 @@ This setup works for:
For long-lived deployments, SSH authentication is recommended:

1. Generate an SSH key pair if you don't have one:

```sh
ssh-keygen -t ed25519 -C "trueref@your-company.com"
```
@@ -727,6 +730,7 @@ For long-lived deployments, SSH authentication is recommended:
- GitLab: User Settings → SSH Keys

3. Ensure your `~/.ssh/config` has the correct host entries:

```
Host bitbucket.corp.example.com
  IdentityFile ~/.ssh/id_ed25519
@@ -737,13 +741,13 @@ For long-lived deployments, SSH authentication is recommended:

### Environment variables

| Variable          | Default                 | Description                                        |
| ----------------- | ----------------------- | -------------------------------------------------- |
| `DATABASE_URL`    | `/data/trueref.db`      | Path to the SQLite database inside the container   |
| `PORT`            | `3000`                  | Port the web app listens on                        |
| `HOST`            | `0.0.0.0`               | Bind address for the web app                       |
| `TRUEREF_API_URL` | `http://localhost:3000` | Base URL the MCP server uses to reach the REST API |
| `MCP_PORT`        | `3001`                  | Port the MCP HTTP server listens on                |

Override them in `docker-compose.yml` or pass them with `-e` flags.

@@ -770,12 +774,12 @@ Once both containers are running, point VS Code at the MCP HTTP endpoint:

```json
{
	"servers": {
		"trueref": {
			"type": "http",
			"url": "http://localhost:3001/mcp"
		}
	}
}
```

@@ -783,12 +787,12 @@ Once both containers are running, point VS Code at the MCP HTTP endpoint:

```json
{
	"mcpServers": {
		"trueref": {
			"type": "http",
			"url": "http://localhost:3001/mcp"
		}
	}
}
```

@@ -806,10 +810,10 @@ Verify the connection inside Claude Code:

### Health checks

| Endpoint                            | Expected response               |
| ----------------------------------- | ------------------------------- |
| `http://localhost:3000/api/v1/libs` | JSON array of indexed libraries |
| `http://localhost:3001/ping`        | `{"ok":true}`                   |

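The two expected response shapes can be checked mechanically. The validators below are hypothetical helpers for illustration, not part of TrueRef; only the endpoint shapes come from the table above.

```typescript
// Hypothetical validators for the two documented health-check responses.
function isLibsResponse(body: unknown): boolean {
	// /api/v1/libs should return a JSON array of indexed libraries.
	return Array.isArray(body);
}

function isPingResponse(body: unknown): boolean {
	// /ping should return {"ok":true}.
	return typeof body === "object" && body !== null && (body as { ok?: unknown }).ok === true;
}
```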
### Mounting a local repository

@@ -2,7 +2,7 @@ services:
  web:
    build: .
    ports:
      - '3000:3000'
    volumes:
      - trueref-data:/data
    # Corporate deployment support (TRUEREF-0019)
@@ -24,10 +24,10 @@ services:
    build: .
    command: mcp
    ports:
      - '3001:3001'
    environment:
      TRUEREF_API_URL: http://web:3000
      MCP_PORT: '3001'
    depends_on:
      - web
    restart: unless-stopped
docs/FINDINGS.md (132 changed lines)
@@ -37,85 +37,85 @@ Add subsequent research below this section.

- Task: Refresh only stale documentation after changes to retrieval, formatters, token budgeting, and parser behavior.
- Files inspected:
  - `docs/docs_cache_state.yaml`
  - `docs/ARCHITECTURE.md`
  - `docs/CODE_STYLE.md`
  - `docs/FINDINGS.md`
  - `package.json`
  - `src/routes/api/v1/context/+server.ts`
  - `src/lib/server/api/formatters.ts`
  - `src/lib/server/api/token-budget.ts`
  - `src/lib/server/search/query-preprocessor.ts`
  - `src/lib/server/search/search.service.ts`
  - `src/lib/server/search/hybrid.search.service.ts`
  - `src/lib/server/mappers/context-response.mapper.ts`
  - `src/lib/server/models/context-response.ts`
  - `src/lib/server/models/search-result.ts`
  - `src/lib/server/parser/index.ts`
  - `src/lib/server/parser/code.parser.ts`
  - `src/lib/server/parser/markdown.parser.ts`
- Findings:
  - The documentation cache was trusted, but the architecture summary no longer captured current retrieval behavior: query preprocessing now sanitizes punctuation-heavy input for FTS5, semantic mode can bypass FTS entirely, and auto or hybrid retrieval can fall back to vector search when keyword search returns no candidates.
  - Plain-text and JSON context formatting now carry repository and version metadata, and the text formatter emits an explicit no-results section instead of an empty body.
  - Token budgeting now skips individual over-budget snippets and continues evaluating lower-ranked candidates, which changes the response-selection behavior described at the architecture level.
  - Parser coverage now explicitly includes Markdown, code, config, HTML-like, and plain-text inputs, so the architecture summary needed to reflect that broader file-type handling.
  - The conventions documented in CODE_STYLE.md still match the current repository: strict TypeScript, tab indentation, ESM imports, Prettier and ESLint flat config, and pragmatic service-oriented server modules.
- Risks / follow-ups:
  - Future cache invalidation should continue to distinguish between behavioral changes that affect architecture docs and localized implementation changes that do not affect the style guide.
  - If the public API contract becomes externally versioned, the new context metadata fields likely deserve a dedicated API document instead of only architecture-level coverage.

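The token-budgeting behavior described in the findings above (skip an individual over-budget snippet and keep evaluating lower-ranked candidates) can be sketched as follows. The types and function name are illustrative, not the actual `token-budget.ts` code.

```typescript
// Hypothetical sketch of skip-and-continue budget selection: an over-budget
// snippet is skipped, not treated as a stopping point for the whole ranking.
interface RankedSnippet {
	id: string;
	tokens: number;
}

function selectWithinBudget(ranked: RankedSnippet[], budget: number): RankedSnippet[] {
	const selected: RankedSnippet[] = [];
	let used = 0;
	for (const snippet of ranked) {
		// Skip this candidate but keep walking down the ranking.
		if (used + snippet.tokens > budget) continue;
		selected.push(snippet);
		used += snippet.tokens;
	}
	return selected;
}
```

The observable difference from a stop-at-first-overflow strategy is that a small, lower-ranked snippet can still make it into the response after a large one was skipped.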
### 2026-03-27 — FEEDBACK-0001 planning research

- Task: Plan the retrieval-fix iteration covering FTS query safety, hybrid fallback, empty-result behavior, result metadata, token budgeting, and parser chunking.
- Files inspected:
  - `package.json`
  - `src/routes/api/v1/context/+server.ts`
  - `src/lib/server/search/query-preprocessor.ts`
  - `src/lib/server/search/search.service.ts`
  - `src/lib/server/search/hybrid.search.service.ts`
  - `src/lib/server/search/vector.search.ts`
  - `src/lib/server/api/token-budget.ts`
  - `src/lib/server/api/formatters.ts`
  - `src/lib/server/mappers/context-response.mapper.ts`
  - `src/lib/server/models/context-response.ts`
  - `src/lib/server/models/search-result.ts`
  - `src/lib/server/parser/code.parser.ts`
  - `src/lib/server/search/search.service.test.ts`
  - `src/lib/server/search/hybrid.search.service.test.ts`
  - `src/lib/server/api/formatters.test.ts`
  - `src/lib/server/parser/code.parser.test.ts`
  - `src/routes/api/v1/api-contract.integration.test.ts`
  - `src/mcp/tools/query-docs.ts`
  - `src/mcp/client.ts`
- Findings:
  - `better-sqlite3` `^12.6.2` backs the affected search path; the code already uses bound parameters for `MATCH`, so the practical fix belongs in query normalization and fallback handling rather than SQL string construction.
  - `query-preprocessor.ts` only strips parentheses and appends a trailing wildcard. Other code-like punctuation currently reaches the FTS execution path unsanitized.
  - `search.service.ts` sends the preprocessed text directly to `snippets_fts MATCH ?` and already returns `[]` for blank processed queries.
  - `hybrid.search.service.ts` always executes keyword search before semantic branching. In the current flow, an FTS parse failure can abort `auto`, `hybrid`, and `semantic` requests before vector retrieval runs.
  - `vector.search.ts` already preserves `repositoryId`, `versionId`, and `profileId` filtering and does not need architectural changes for this iteration.
  - `token-budget.ts` stops at the first over-budget snippet instead of skipping that item and continuing through later ranked results.
  - `formatContextTxt([], [])` returns an empty string, so `/api/v1/context?type=txt` can emit an empty `200 OK` body today.
  - `context-response.mapper.ts` and `context-response.ts` expose snippet content and breadcrumb/page title but do not identify local TrueRef origin, repository source metadata, or normalized snippet origin labels.
  - `code.parser.ts` splits primarily at top-level declarations; class/object member functions remain in coarse chunks, which limits method-level recall for camelCase API queries.
  - Existing relevant automated coverage is concentrated in the search, formatter, and parser unit tests; `/api/v1/context` contract coverage currently omits the context endpoint entirely.
- Risks / follow-ups:
  - Response-shape changes must be additive because `src/mcp/client.ts`, `src/mcp/tools/query-docs.ts`, and UI consumers expect the current top-level keys to remain present.
  - Parser improvements should stay inside `parseCodeFile()` and existing chunking helpers to avoid turning this fix iteration into a schema or pipeline redesign.

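The fallback this planning entry calls for (an FTS parse failure must not abort the request before vector retrieval runs) can be sketched like this. This is a simplified model, not the actual `hybrid.search.service.ts`; both search functions are stand-ins.

```typescript
// Hypothetical sketch of the planned fallback: if keyword (FTS) search throws
// or returns no candidates, fall through to vector search instead of aborting.
type SearchResult = { id: string; score: number };

async function hybridSearch(
	query: string,
	keywordSearch: (q: string) => Promise<SearchResult[]>,
	vectorSearch: (q: string) => Promise<SearchResult[]>,
): Promise<SearchResult[]> {
	let keywordResults: SearchResult[] = [];
	try {
		keywordResults = await keywordSearch(query);
	} catch {
		// FTS5 syntax errors on punctuation-heavy input must not abort the request.
	}
	return keywordResults.length > 0 ? keywordResults : vectorSearch(query);
}
```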
### 2026-03-27 — FEEDBACK-0001 SQLite FTS5 syntax research

- Task: Verify the FTS5 query-grammar constraints that affect punctuation-heavy local search queries.
- Files inspected:
  - `package.json`
  - `src/lib/server/search/query-preprocessor.ts`
  - `src/lib/server/search/search.service.ts`
  - `src/lib/server/search/hybrid.search.service.ts`
- Findings:
  - `better-sqlite3` is declared at `^12.6.2` in `package.json`, and the application binds the `MATCH` string as a parameter instead of interpolating SQL directly.
  - The canonical SQLite FTS5 docs state that barewords may contain letters, digits, underscore, non-ASCII characters, and the substitute character; strings containing other punctuation must be quoted or they become syntax errors in `MATCH` expressions.
  - The same docs state that prefix search is expressed by placing `*` after the token or phrase, not inside quotes, which matches the current trailing-wildcard strategy in `query-preprocessor.ts`.
  - SQLite documents that FTS5 is stricter than FTS3/4 about unrecognized punctuation in query strings, which confirms that code-like user input should be normalized before it reaches `snippets_fts MATCH ?`.
  - Based on the current code path, the practical fix remains application-side sanitization and fallback behavior in `query-preprocessor.ts` and `hybrid.search.service.ts`, not SQL construction changes.
- Risks / follow-ups:
  - Over-sanitizing punctuation-heavy inputs could erase useful identifiers, so the implementation should preserve searchable alphanumeric and underscore tokens while discarding grammar-breaking punctuation.
  - Prefix expansion should remain on the final searchable token only so the fix preserves current query-cost expectations and test semantics.

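A sanitizer consistent with these constraints might look like the sketch below: keep alphanumeric and underscore runs as bare tokens, drop grammar-breaking punctuation, and apply the prefix wildcard to the final token only. This is illustrative code, not the project's `query-preprocessor.ts`, and it simplifies the bareword rules by ignoring non-ASCII characters.

```typescript
// Hypothetical FTS5 query sanitizer. Keeps [A-Za-z0-9_] runs as barewords
// (a simplification: real FTS5 barewords also allow non-ASCII characters)
// and adds the trailing prefix wildcard to the last token only.
function toFtsQuery(raw: string): string {
	const tokens = raw.match(/[A-Za-z0-9_]+/g) ?? [];
	if (tokens.length === 0) return "";
	const last = tokens.length - 1;
	return tokens.map((t, i) => (i === last ? `${t}*` : t)).join(" ");
}
```

Punctuation-only input collapses to an empty query, which the search service already treats as "return no results" rather than an error.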
docs/PRD.md (105 changed lines)
@@ -17,6 +17,7 @@ The core use case is enabling AI coding assistants (Claude Code, Cursor, Zed, et
## 2. Problem Statement

### 2.1 Context7's Limitations

- The indexing and crawling backend is entirely private and closed-source.
- Only public libraries already in the context7.com catalog are available.
- Private, internal, or niche repositories cannot be added.
@@ -24,6 +25,7 @@ The core use case is enabling AI coding assistants (Claude Code, Cursor, Zed, et
- No way to self-host for air-gapped or compliance-constrained environments.

### 2.2 The Gap

Teams with internal SDKs, private libraries, proprietary documentation, or a need for data sovereignty have no tooling that provides context7-equivalent LLM documentation retrieval.

---
@@ -31,6 +33,7 @@ Teams with internal SDKs, private libraries, proprietary documentation, or a nee
## 3. Goals & Non-Goals

### Goals

- Replicate all context7 capabilities: library search, documentation retrieval, MCP tools (`resolve-library-id`, `query-docs`).
- Support both GitHub-hosted and local filesystem repositories.
- Provide a full indexing pipeline: crawl → parse → chunk → embed → store → query.
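The pipeline goal above (crawl → parse → chunk → embed → store → query) can be sketched as composed stages. Every name in this sketch is a toy placeholder; real crawlers, parsers, and embedders are pluggable.

```typescript
// Toy sketch of pipeline composition; not the actual TrueRef pipeline code.
type Stage<A, B> = (input: A) => B;

function pipe<A, B, C>(first: Stage<A, B>, second: Stage<B, C>): Stage<A, C> {
	return (input) => second(first(input));
}

// Placeholder stages: "crawl" concatenates file contents, "chunk" splits into snippets.
const crawl: Stage<string[], string> = (files) => files.join("\n");
const chunk: Stage<string, string[]> = (doc) => doc.split("\n").filter((line) => line.length > 0);

const crawlAndChunk = pipe(crawl, chunk);
```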
@@ -42,6 +45,7 @@ Teams with internal SDKs, private libraries, proprietary documentation, or a nee
- Self-hostable with minimal dependencies (SQLite-first, no external vector DB required).

### Non-Goals (v1)

- Authentication & authorization (deferred to a future version).
- Skill generation (context7 CLI skill feature).
- Multi-tenant SaaS mode.
@@ -54,9 +58,11 @@ Teams with internal SDKs, private libraries, proprietary documentation, or a nee
## 4. Users & Personas

### Primary: The Developer / Tech Lead

Configures TrueRef, adds repositories, integrates the MCP server with their AI coding assistant. Technical, comfortable with CLI and config files.

### Secondary: The AI Coding Assistant

The "user" at query time. Calls `resolve-library-id` and `query-docs` via MCP to retrieve documentation snippets for code generation.

---
@@ -100,25 +106,27 @@ The "user" at query time. Calls `resolve-library-id` and `query-docs` via MCP to
```

### Technology Stack

| Layer            | Technology                                                         |
| ---------------- | ------------------------------------------------------------------ |
| Framework        | SvelteKit (Node adapter)                                           |
| Language         | TypeScript                                                         |
| Database         | SQLite via better-sqlite3 + drizzle-orm                            |
| Full-Text Search | SQLite FTS5                                                        |
| Vector Search    | SQLite `sqlite-vec` extension (cosine similarity)                  |
| Embeddings       | Pluggable: local (transformers.js / ONNX) or OpenAI-compatible API |
| MCP Protocol     | `@modelcontextprotocol/sdk`                                        |
| HTTP             | SvelteKit API routes + optional standalone MCP HTTP server         |
| CSS              | TailwindCSS v4                                                     |
| Testing          | Vitest                                                             |
| Linting          | ESLint + Prettier                                                  |

---

## 6. Data Model

### 6.1 Repositories

A `Repository` is the top-level entity. It maps to a GitHub repo or local directory.

```
@@ -141,6 +149,7 @@ Repository {
```

### 6.2 Repository Versions

```
RepositoryVersion {
  id TEXT PRIMARY KEY
@@ -153,6 +162,7 @@ RepositoryVersion {
|
|||||||
```
|
```
|
### 6.3 Documents (parsed files)

```
Document {
  id TEXT PRIMARY KEY
@@ -169,6 +179,7 @@ Document {
```

### 6.4 Snippets (indexed chunks)

```
Snippet {
  id TEXT PRIMARY KEY
@@ -186,6 +197,7 @@ Snippet {
```

### 6.5 Indexing Jobs

```
IndexingJob {
  id TEXT PRIMARY KEY
@@ -203,6 +215,7 @@ IndexingJob {
```

### 6.6 Repository Configuration (`trueref.json`)

```
RepositoryConfig {
  repositoryId TEXT FK → Repository
@@ -221,15 +234,19 @@ RepositoryConfig {
```

## 7. Core Features

### F1: Repository Management

Add, remove, update, and list repositories. Support GitHub (public/private via token) and local filesystem sources. Trigger indexing on demand or on schedule.

### F2: GitHub Crawler

Fetch repository file trees via the GitHub Trees API. Download file contents. Respect `trueref.json` include/exclude rules. Support rate limiting and incremental re-indexing (checksum-based).

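The crawl request above can be sketched as follows. The GitHub Trees API endpoint (`GET /repos/{owner}/{repo}/git/trees/{ref}?recursive=1`) is real; the `treesRequest` helper and its return shape are illustrative, not TrueRef's actual crawler code.

```typescript
// Builds the request for a recursive file-tree fetch from the GitHub
// Trees API. Helper name and return shape are illustrative only.
function treesRequest(
  owner: string,
  repo: string,
  branch: string,
  token?: string
): { url: string; headers: Record<string, string> } {
  return {
    url: `https://api.github.com/repos/${owner}/${repo}/git/trees/${branch}?recursive=1`,
    // A token is only needed for private repositories.
    headers: token ? { Authorization: `Bearer ${token}` } : {}
  };
}
```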
### F3: Local Filesystem Crawler

Walk directory trees. Apply include/exclude rules from `trueref.json`. Watch for file changes (optional).

### F4: Document Parser & Chunker

- Parse Markdown files into sections (heading-based splitting).
- Extract code blocks from Markdown.
- Parse standalone code files into function/class-level chunks.
@@ -237,16 +254,19 @@ Walk directory trees. Apply include/exclude rules from `trueref.json`. Watch for
- Produce structured `Snippet` records (type: "code" or "info").

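Heading-based splitting can be sketched in a few lines. `splitByHeadings` and `Section` are hypothetical names used only for illustration; TrueRef's parser may also track heading depth and breadcrumbs.

```typescript
// Cuts a Markdown string into sections at every heading line.
interface Section {
  heading: string;
  body: string;
}

function splitByHeadings(markdown: string): Section[] {
  const sections: Section[] = [];
  let current: Section | null = null;
  for (const line of markdown.split('\n')) {
    if (/^#{1,6}\s/.test(line)) {
      // A new heading starts a new section.
      if (current) sections.push(current);
      current = { heading: line.replace(/^#+\s*/, ''), body: '' };
    } else if (current) {
      current.body += line + '\n';
    }
  }
  if (current) sections.push(current);
  return sections;
}
```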
### F5: Embedding & Vector Storage

- Generate embeddings for each snippet using a pluggable embeddings backend.
- Store embeddings as binary blobs in SQLite (sqlite-vec).
- Support fallback to FTS5-only search when no embedding provider is configured.

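Storing a `Float32Array` as a SQLite blob amounts to reinterpreting its raw bytes. A minimal round-trip sketch, assuming Node.js `Buffer`s (the real column handling lives in the Drizzle schema):

```typescript
// Float32Array → raw bytes suitable for a SQLite BLOB column.
function embeddingToBlob(vec: Float32Array): Buffer {
  return Buffer.from(vec.buffer, vec.byteOffset, vec.byteLength);
}

// Raw BLOB bytes → Float32Array (4 bytes per element).
function blobToEmbedding(blob: Buffer): Float32Array {
  return new Float32Array(blob.buffer, blob.byteOffset, blob.byteLength / 4);
}
```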
### F6: Semantic Search Engine

- Hybrid search: vector similarity + FTS5 keyword matching (BM25) with reciprocal rank fusion.
- Query-time retrieval: given `libraryId + query`, return ranked snippets.
- Library search: given `libraryName + query`, return matching repositories.

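Reciprocal rank fusion itself is small enough to sketch. This assumes two ranked lists of snippet IDs and the conventional constant `k = 60`; TrueRef's actual weighting may differ.

```typescript
// Fuses a vector-similarity ranking and an FTS5/BM25 ranking: each
// list contributes 1 / (k + rank) per item, and items are re-sorted
// by their summed score.
function rrfFuse(vectorIds: string[], keywordIds: string[], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const [rank, id] of vectorIds.entries()) {
    scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
  }
  for (const [rank, id] of keywordIds.entries()) {
    scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
  }
  // Higher fused score first.
  return [...scores.entries()].sort((a, b) => b[1] - a[1]).map(([id]) => id);
}
```

Items that appear in both lists accumulate score from each, which is what pushes snippets matched by both retrieval paths to the top.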
### F7: REST API (`/api/v1/*`)

- `GET /api/v1/libs/search?query=&libraryName=` — search libraries (context7-compatible)
- `GET /api/v1/context?query=&libraryId=&type=json|txt` — fetch documentation
- `GET /api/v1/libs` — list all indexed libraries
@@ -256,12 +276,14 @@ Walk directory trees. Apply include/exclude rules from `trueref.json`. Watch for
- `GET /api/v1/jobs/:id` — get indexing job status

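A client call to the context endpoint might build its URL like this — a sketch assuming the default dev URL and the query parameters listed above; `buildContextUrl` is a hypothetical helper, not part of TrueRef's API surface.

```typescript
// Builds the documentation-fetch URL for GET /api/v1/context.
function buildContextUrl(base: string, libraryId: string, query: string): string {
  const url = new URL('/api/v1/context', base);
  url.searchParams.set('libraryId', libraryId); // e.g. "/facebook/react"
  url.searchParams.set('query', query);
  url.searchParams.set('type', 'txt'); // or "json"
  return url.toString();
}
```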
### F8: MCP Server

- Tool: `resolve-library-id` — search for libraries by name
- Tool: `query-docs` — fetch documentation by libraryId + query
- Transport: stdio (primary), HTTP (optional)
- Compatible with Claude Code, Cursor, and other MCP-aware tools

### F9: Web UI — Repository Dashboard

- List all repositories with status, snippet count, last indexed date
- Add/remove repositories (GitHub URL or local path)
- Trigger re-indexing
@@ -269,23 +291,27 @@ Walk directory trees. Apply include/exclude rules from `trueref.json`. Watch for
- View repository config (`trueref.json`)

### F10: Web UI — Search Explorer

- Interactive search interface (resolve library → query docs)
- Preview snippets with syntax highlighting
- View raw document content

### F11: `trueref.json` Config Support

- Parse `trueref.json` from repo root (or `context7.json` for compatibility)
- Apply `folders`, `excludeFolders`, `excludeFiles` during crawling
- Inject `rules` into LLM context alongside snippets
- Support `previousVersions` for versioned documentation

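Applying the include/exclude rules could look roughly like this. The sketch uses plain path-prefix checks; the real implementation may use glob matching, and `shouldIndex` is an illustrative name.

```typescript
// Decides whether a crawled file path survives folders/excludeFolders
// filtering. Exclusions win; an empty/absent folders list means "index all".
function shouldIndex(
  filePath: string,
  folders: string[] | null,
  excludeFolders: string[]
): boolean {
  if (excludeFolders.some((f) => filePath.startsWith(f + '/'))) return false;
  if (!folders || folders.length === 0) return true;
  return folders.some((f) => filePath.startsWith(f + '/'));
}
```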
### F12: Indexing Pipeline & Job Queue

- SQLite-backed job queue (no external message broker required)
- Sequential processing with progress tracking
- Error recovery and retry logic
- Incremental re-indexing using file checksums

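Checksum-based incremental re-indexing can be sketched as a diff against stored `Document.checksum` values. `filesToReindex` is illustrative, and `storedChecksums` stands in for a lookup against the documents table.

```typescript
import { createHash } from 'node:crypto';

// SHA-256 hex digest of a file's content, as stored on Document.checksum.
function sha256(content: string): string {
  return createHash('sha256').update(content).digest('hex');
}

// Returns only the paths whose content no longer matches the stored checksum.
function filesToReindex(
  files: Map<string, string>, // filePath → current content
  storedChecksums: Map<string, string> // filePath → last indexed checksum
): string[] {
  const changed: string[] = [];
  for (const [path, content] of files) {
    if (storedChecksums.get(path) !== sha256(content)) changed.push(path);
  }
  return changed;
}
```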
### F13: Version Support

- Index specific git tags/branches per repository
- Serve version-specific context when the libraryId includes a version (`/owner/repo/v1.2.3`)
- UI for managing available versions

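Splitting a version-qualified library ID could work as below, assuming version segments always start with `v` (e.g. `/facebook/react/v18.3.0`); `parseLibraryId` is a hypothetical helper.

```typescript
// Splits "/owner/repo/v1.2.3" into the repo ID and an optional version tag.
function parseLibraryId(libraryId: string): { repoId: string; version?: string } {
  const parts = libraryId.split('/').filter(Boolean);
  if (parts.length === 3 && /^v\d/.test(parts[2])) {
    return { repoId: `/${parts[0]}/${parts[1]}`, version: parts[2] };
  }
  // Two-segment IDs like "/facebook/react" or "/local/my-sdk" have no version.
  return { repoId: libraryId };
}
```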
@@ -296,12 +322,13 @@ Walk directory trees. Apply include/exclude rules from `trueref.json`. Watch for

TrueRef's REST API mirrors context7's `/api/v2/*` interface to allow drop-in compatibility:

| context7 Endpoint         | TrueRef Endpoint          | Notes                                  |
| ------------------------- | ------------------------- | -------------------------------------- |
| `GET /api/v2/libs/search` | `GET /api/v1/libs/search` | Same query params                      |
| `GET /api/v2/context`     | `GET /api/v1/context`     | Same query params, same response shape |

The MCP tool names and input schemas are identical:

- `resolve-library-id` with `libraryName` + `query`
- `query-docs` with `libraryId` + `query`

@@ -312,20 +339,24 @@ Library IDs follow the same convention: `/owner/repo` or `/owner/repo/version`.

## 9. Non-Functional Requirements

### Performance

- Library search: < 200ms p99
- Documentation retrieval: < 500ms p99 for 20 snippets
- Indexing throughput: > 1,000 files/minute (GitHub API rate-limited)

### Reliability

- Failed indexing jobs must not corrupt existing indexed data
- Atomic snippet replacement during re-indexing

### Portability

- Single SQLite file for all data
- Runs on Linux, macOS, Windows (Node.js 20+)
- No required external services beyond optional embedding API

### Scalability (v1 constraints)

- Designed for single-node deployment
- SQLite suitable for up to ~500 repositories, ~500k snippets

@@ -333,26 +364,26 @@ Library IDs follow the same convention: `/owner/repo` or `/owner/repo/version`.

## 10. Milestones & Feature Order

| ID           | Feature                                  | Priority | Depends On                 |
| ------------ | ---------------------------------------- | -------- | -------------------------- |
| TRUEREF-0001 | Database schema & core data models       | P0       | —                          |
| TRUEREF-0002 | Repository management service & REST API | P0       | TRUEREF-0001               |
| TRUEREF-0003 | GitHub repository crawler                | P0       | TRUEREF-0001               |
| TRUEREF-0004 | Local filesystem crawler                 | P1       | TRUEREF-0001               |
| TRUEREF-0005 | Document parser & chunker                | P0       | TRUEREF-0001               |
| TRUEREF-0006 | SQLite FTS5 full-text search             | P0       | TRUEREF-0005               |
| TRUEREF-0007 | Embedding generation & vector storage    | P1       | TRUEREF-0005               |
| TRUEREF-0008 | Hybrid semantic search engine            | P1       | TRUEREF-0006, TRUEREF-0007 |
| TRUEREF-0009 | Indexing pipeline & job queue            | P0       | TRUEREF-0003, TRUEREF-0005 |
| TRUEREF-0010 | REST API (search + context endpoints)    | P0       | TRUEREF-0006, TRUEREF-0009 |
| TRUEREF-0011 | MCP server (stdio transport)             | P0       | TRUEREF-0010               |
| TRUEREF-0012 | MCP server (HTTP transport)              | P1       | TRUEREF-0011               |
| TRUEREF-0013 | `trueref.json` config file support       | P0       | TRUEREF-0003               |
| TRUEREF-0014 | Repository version management            | P1       | TRUEREF-0003               |
| TRUEREF-0015 | Web UI — repository dashboard            | P1       | TRUEREF-0002, TRUEREF-0009 |
| TRUEREF-0016 | Web UI — search explorer                 | P2       | TRUEREF-0010, TRUEREF-0015 |
| TRUEREF-0017 | Incremental re-indexing (checksum diff)  | P1       | TRUEREF-0009               |
| TRUEREF-0018 | Embedding provider configuration UI      | P2       | TRUEREF-0007, TRUEREF-0015 |

---

@@ -31,24 +31,26 @@ Represents an indexed library source (GitHub repo or local directory).

```typescript
export const repositories = sqliteTable('repositories', {
  id: text('id').primaryKey(), // e.g. "/facebook/react" or "/local/my-sdk"
  title: text('title').notNull(),
  description: text('description'),
  source: text('source', { enum: ['github', 'local'] }).notNull(),
  sourceUrl: text('source_url').notNull(), // GitHub URL or absolute local path
  branch: text('branch').default('main'),
  state: text('state', {
    enum: ['pending', 'indexing', 'indexed', 'error']
  })
    .notNull()
    .default('pending'),
  totalSnippets: integer('total_snippets').default(0),
  totalTokens: integer('total_tokens').default(0),
  trustScore: real('trust_score').default(0), // 0.0–10.0
  benchmarkScore: real('benchmark_score').default(0), // 0.0–100.0
  stars: integer('stars'),
  githubToken: text('github_token'), // encrypted PAT for private repos
  lastIndexedAt: integer('last_indexed_at', { mode: 'timestamp' }),
  createdAt: integer('created_at', { mode: 'timestamp' }).notNull(),
  updatedAt: integer('updated_at', { mode: 'timestamp' }).notNull()
});
```

@@ -58,17 +60,20 @@ Tracks indexed git tags/branches beyond the default branch.

```typescript
export const repositoryVersions = sqliteTable('repository_versions', {
  id: text('id').primaryKey(), // e.g. "/facebook/react/v18.3.0"
  repositoryId: text('repository_id')
    .notNull()
    .references(() => repositories.id, { onDelete: 'cascade' }),
  tag: text('tag').notNull(), // git tag or branch name
  title: text('title'),
  state: text('state', {
    enum: ['pending', 'indexing', 'indexed', 'error']
  })
    .notNull()
    .default('pending'),
  totalSnippets: integer('total_snippets').default(0),
  indexedAt: integer('indexed_at', { mode: 'timestamp' }),
  createdAt: integer('created_at', { mode: 'timestamp' }).notNull()
});
```

@@ -78,17 +83,17 @@ A parsed source file within a repository.

```typescript
export const documents = sqliteTable('documents', {
  id: text('id').primaryKey(), // UUID
  repositoryId: text('repository_id')
    .notNull()
    .references(() => repositories.id, { onDelete: 'cascade' }),
  versionId: text('version_id').references(() => repositoryVersions.id, { onDelete: 'cascade' }),
  filePath: text('file_path').notNull(), // relative path within repo
  title: text('title'),
  language: text('language'), // e.g. "typescript", "markdown"
  tokenCount: integer('token_count').default(0),
  checksum: text('checksum').notNull(), // SHA-256 of file content
  indexedAt: integer('indexed_at', { mode: 'timestamp' }).notNull()
});
```

@@ -98,20 +103,21 @@ An indexed chunk of content, the atomic unit of search.

```typescript
export const snippets = sqliteTable('snippets', {
  id: text('id').primaryKey(), // UUID
  documentId: text('document_id')
    .notNull()
    .references(() => documents.id, { onDelete: 'cascade' }),
  repositoryId: text('repository_id')
    .notNull()
    .references(() => repositories.id, { onDelete: 'cascade' }),
  versionId: text('version_id').references(() => repositoryVersions.id, { onDelete: 'cascade' }),
  type: text('type', { enum: ['code', 'info'] }).notNull(),
  title: text('title'),
  content: text('content').notNull(), // searchable text / code
  language: text('language'),
  breadcrumb: text('breadcrumb'), // e.g. "Installation > Getting Started"
  tokenCount: integer('token_count').default(0),
  createdAt: integer('created_at', { mode: 'timestamp' }).notNull()
});
```

@@ -121,12 +127,13 @@ Stores vector embeddings separately to keep snippets table lean.

```typescript
export const snippetEmbeddings = sqliteTable('snippet_embeddings', {
  snippetId: text('snippet_id')
    .primaryKey()
    .references(() => snippets.id, { onDelete: 'cascade' }),
  model: text('model').notNull(), // embedding model identifier
  dimensions: integer('dimensions').notNull(),
  embedding: blob('embedding').notNull(), // Float32Array as binary blob
  createdAt: integer('created_at', { mode: 'timestamp' }).notNull()
});
```

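For intuition, cosine similarity — the metric `sqlite-vec` is used with here — over two stored embeddings looks like this in plain TypeScript. The extension computes it in SQL; this sketch is only illustrative.

```typescript
// Cosine similarity of two equal-length embeddings: dot product divided
// by the product of the vector norms. Result is in [-1, 1].
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```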
@@ -136,20 +143,23 @@ Tracks asynchronous indexing operations.

```typescript
export const indexingJobs = sqliteTable('indexing_jobs', {
  id: text('id').primaryKey(), // UUID
  repositoryId: text('repository_id')
    .notNull()
    .references(() => repositories.id, { onDelete: 'cascade' }),
  versionId: text('version_id'),
  status: text('status', {
    enum: ['queued', 'running', 'done', 'failed']
  })
    .notNull()
    .default('queued'),
  progress: integer('progress').default(0), // 0–100
  totalFiles: integer('total_files').default(0),
  processedFiles: integer('processed_files').default(0),
  error: text('error'),
  startedAt: integer('started_at', { mode: 'timestamp' }),
  completedAt: integer('completed_at', { mode: 'timestamp' }),
  createdAt: integer('created_at', { mode: 'timestamp' }).notNull()
});
```

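The 0–100 `progress` value could be derived from `processedFiles` and `totalFiles` roughly as below; this is a sketch, and the pipeline may update progress differently.

```typescript
// Maps processed/total file counts to the 0–100 progress column,
// guarding against division by zero and overshoot.
function jobProgress(processedFiles: number, totalFiles: number): number {
  if (totalFiles <= 0) return 0;
  return Math.min(100, Math.round((processedFiles / totalFiles) * 100));
}
```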
@@ -159,17 +169,19 @@ Stores parsed `trueref.json` / `context7.json` configuration.

```typescript
export const repositoryConfigs = sqliteTable('repository_configs', {
  repositoryId: text('repository_id')
    .primaryKey()
    .references(() => repositories.id, { onDelete: 'cascade' }),
  projectTitle: text('project_title'),
  description: text('description'),
  folders: text('folders', { mode: 'json' }).$type<string[]>(),
  excludeFolders: text('exclude_folders', { mode: 'json' }).$type<string[]>(),
  excludeFiles: text('exclude_files', { mode: 'json' }).$type<string[]>(),
  rules: text('rules', { mode: 'json' }).$type<string[]>(),
  previousVersions: text('previous_versions', { mode: 'json' }).$type<
    { tag: string; title: string }[]
  >(),
  updatedAt: integer('updated_at', { mode: 'timestamp' }).notNull()
});
```

@@ -179,9 +191,9 @@ Key-value store for global application settings.

```typescript
export const settings = sqliteTable('settings', {
  key: text('key').primaryKey(),
  value: text('value', { mode: 'json' }),
  updatedAt: integer('updated_at', { mode: 'timestamp' }).notNull()
});
```

@@ -31,10 +31,12 @@ Implement the core `RepositoryService` that handles CRUD operations for reposito

## Repository ID Generation

GitHub repositories:

- Input URL: `https://github.com/facebook/react` or `github.com/facebook/react`
- Generated ID: `/facebook/react`

Local repositories:

- Input path: `/home/user/projects/my-sdk`
- Generated ID: `/local/my-sdk` (basename of path, slugified)
- Collision resolution: append `-2`, `-3`, etc.

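The local-ID rules above can be sketched directly. `localRepoId` is an illustrative name, and `existingIds` stands in for a collision check that would really query the repositories table.

```typescript
import { basename } from 'node:path';

// Slugifies the path basename and appends -2, -3, … until the ID is free.
function localRepoId(dirPath: string, existingIds: Set<string>): string {
  const slug = basename(dirPath)
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-') // non-alphanumeric runs → single dash
    .replace(/^-+|-+$/g, ''); // trim leading/trailing dashes
  let id = `/local/${slug}`;
  for (let n = 2; existingIds.has(id); n++) {
    id = `/local/${slug}-${n}`;
  }
  return id;
}
```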
@@ -49,44 +51,44 @@ Version-specific IDs: `/facebook/react/v18.3.0`

```typescript
// src/lib/server/services/repository.service.ts

export interface AddRepositoryInput {
  source: 'github' | 'local';
  sourceUrl: string; // GitHub URL or absolute local path
  title?: string; // override auto-detected title
  description?: string;
  branch?: string; // GitHub: default branch; Local: n/a
  githubToken?: string; // for private GitHub repos
}

export interface UpdateRepositoryInput {
  title?: string;
  description?: string;
  branch?: string;
  githubToken?: string;
}

export class RepositoryService {
  constructor(private db: BetterSQLite3.Database) {}

  async list(options?: {
    state?: Repository['state'];
    limit?: number;
    offset?: number;
  }): Promise<Repository[]>;

  async get(id: string): Promise<Repository | null>;

  async add(input: AddRepositoryInput): Promise<Repository>;

  async update(id: string, input: UpdateRepositoryInput): Promise<Repository>;

  async remove(id: string): Promise<void>;

  async getStats(id: string): Promise<{
    totalSnippets: number;
    totalTokens: number;
    totalDocuments: number;
    lastIndexedAt: Date | null;
  }>;
}
```

@@ -97,48 +99,52 @@ export class RepositoryService {
|
|||||||
### `GET /api/v1/libs`

Query parameters:

- `state` (optional): filter by state (`pending`, `indexed`, `error`, etc.)
- `limit` (optional, default 50): max results
- `offset` (optional, default 0): pagination offset

Response `200`:

```json
{
  "libraries": [
    {
      "id": "/facebook/react",
      "title": "React",
      "description": "...",
      "source": "github",
      "state": "indexed",
      "totalSnippets": 1234,
      "totalTokens": 98000,
      "trustScore": 8.5,
      "stars": 228000,
      "lastIndexedAt": "2026-03-22T10:00:00Z",
      "versions": ["v18.3.0", "v17.0.2"]
    }
  ],
  "total": 12,
  "limit": 50,
  "offset": 0
}
```

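The query parameters above are plain URL work on the client side; a small sketch (the helper name and option bag are illustrative, only the path and parameter names come from the spec):

```typescript
// Hypothetical helper assembling the listing URL from the documented
// query parameters.
function libsUrl(
  baseUrl: string,
  q: { state?: string; limit?: number; offset?: number } = {}
): string {
  const params = new URLSearchParams();
  if (q.state !== undefined) params.set('state', q.state);
  if (q.limit !== undefined) params.set('limit', String(q.limit));
  if (q.offset !== undefined) params.set('offset', String(q.offset));
  const qs = params.toString();
  return qs ? `${baseUrl}/api/v1/libs?${qs}` : `${baseUrl}/api/v1/libs`;
}

// libsUrl('http://localhost:5173', { state: 'indexed', limit: 50 })
// → 'http://localhost:5173/api/v1/libs?state=indexed&limit=50'
```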
### `POST /api/v1/libs`

Request body:

```json
{
  "source": "github",
  "sourceUrl": "https://github.com/facebook/react",
  "branch": "main",
  "githubToken": "ghp_...",
  "autoIndex": true
}
```

Response `201`:

```json
{
  "library": { ...Repository },
```

@@ -149,6 +155,7 @@ Response `201`:

`autoIndex: true` (default) immediately queues an indexing job.

Response `409` if repository already exists:

```json
{ "error": "Repository /facebook/react already exists" }
```

@@ -176,20 +183,22 @@ Response `404`: not found.

Triggers a new indexing job. If a job is already running for this repo, returns the existing job.

Request body (optional):

```json
{ "version": "v18.3.0" }
```

Response `202`:

```json
{
  "job": {
    "id": "uuid",
    "repositoryId": "/facebook/react",
    "status": "queued",
    "progress": 0,
    "createdAt": "2026-03-22T10:00:00Z"
  }
}
```

@@ -198,15 +207,17 @@ Response `202`:

## Error Response Shape

All error responses follow:

```json
{
  "error": "Human-readable message",
  "code": "MACHINE_READABLE_CODE",
  "details": {}
}
```

Error codes:

- `NOT_FOUND`
- `ALREADY_EXISTS`
- `INVALID_INPUT`
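A caller-side type guard for this shape could look like the following sketch; `ApiError` and `isApiError` are illustrative names, and `code` is kept as a plain string since the code list above may not be exhaustive:

```typescript
// Illustrative client-side type for the documented error shape.
interface ApiError {
  error: string;
  code: string; // e.g. 'NOT_FOUND', 'ALREADY_EXISTS', 'INVALID_INPUT'
  details?: Record<string, unknown>;
}

// Narrow an unknown response body to the error shape.
function isApiError(body: unknown): body is ApiError {
  if (typeof body !== 'object' || body === null) return false;
  const b = body as Record<string, unknown>;
  return typeof b.error === 'string' && typeof b.code === 'string';
}
```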
@@ -219,23 +230,23 @@ Error codes:

```typescript
function resolveGitHubId(url: string): string {
  // Parse owner/repo from URL variants:
  // https://github.com/facebook/react
  // https://github.com/facebook/react.git
  // github.com/facebook/react
  const match = url.match(/github\.com\/([^/]+)\/([^/\s.]+)/);
  if (!match) throw new Error('Invalid GitHub URL');
  return `/${match[1]}/${match[2]}`;
}

function resolveLocalId(path: string, existingIds: string[]): string {
  const base = slugify(path.split('/').at(-1)!);
  let id = `/local/${base}`;
  let counter = 2;
  while (existingIds.includes(id)) {
    id = `/local/${base}-${counter++}`;
  }
  return id;
}
```

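`resolveLocalId` above leans on a `slugify` helper the spec does not define; a minimal sketch under that assumption:

```typescript
// Minimal slugify sketch (an assumption; the spec leaves slugify undefined):
// lowercase, collapse non-alphanumeric runs to hyphens, trim edge hyphens.
function slugify(name: string): string {
  return name
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '');
}

// slugify('My Project (v2)') → 'my-project-v2'
```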
@@ -37,17 +37,46 @@ The crawler only downloads files with these extensions:

```typescript
const INDEXABLE_EXTENSIONS = new Set([
  // Documentation
  '.md',
  '.mdx',
  '.txt',
  '.rst',
  // Code
  '.ts',
  '.tsx',
  '.js',
  '.jsx',
  '.py',
  '.rb',
  '.go',
  '.rs',
  '.java',
  '.cs',
  '.cpp',
  '.c',
  '.h',
  '.swift',
  '.kt',
  '.php',
  '.scala',
  '.clj',
  '.ex',
  '.exs',
  '.sh',
  '.bash',
  '.zsh',
  '.fish',
  // Config / data
  '.json',
  '.yaml',
  '.yml',
  '.toml',
  // Web
  '.html',
  '.css',
  '.svelte',
  '.vue'
]);

const MAX_FILE_SIZE_BYTES = 500_000; // 500 KB — skip large generated files
```

@@ -59,28 +88,28 @@ const MAX_FILE_SIZE_BYTES = 500_000; // 500 KB — skip large generated files

```typescript
export interface CrawledFile {
  path: string; // relative path within repo, e.g. "src/index.ts"
  content: string; // UTF-8 file content
  size: number; // bytes
  sha: string; // GitHub blob SHA (used as checksum)
  language: string; // detected from extension
}

export interface CrawlResult {
  files: CrawledFile[];
  totalFiles: number; // files matching filters
  skippedFiles: number; // filtered out or too large
  branch: string; // branch/tag that was crawled
  commitSha: string; // HEAD commit SHA
}

export interface CrawlOptions {
  owner: string;
  repo: string;
  ref?: string; // branch, tag, or commit SHA; defaults to repo default branch
  token?: string; // GitHub PAT for private repos
  config?: RepoConfig; // parsed trueref.json
  onProgress?: (processed: number, total: number) => void;
}
```

@@ -89,12 +118,14 @@ export interface CrawlOptions {

## GitHub API Usage

### Step 1: Get default branch (if ref not specified)

```
GET https://api.github.com/repos/{owner}/{repo}
→ { default_branch: "main", stargazers_count: 12345 }
```

### Step 2: Fetch file tree (recursive)

```
GET https://api.github.com/repos/{owner}/{repo}/git/trees/{ref}?recursive=1
→ {
```

@@ -109,12 +140,14 @@ GET https://api.github.com/repos/{owner}/{repo}/git/trees/{ref}?recursive=1

If `truncated: true`, the tree has >100k items. Use `--depth` pagination or filter top-level directories first.

### Step 3: Download file contents (parallel)

```
GET https://api.github.com/repos/{owner}/{repo}/contents/{path}?ref={ref}
→ { content: "<base64>", encoding: "base64", size: 1234, sha: "abc123" }
```
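The base64 payload in that response decodes in one step; a sketch, with field names taken from the example above:

```typescript
// Decode a contents-API response body to UTF-8 text.
function decodeContent(resp: { content: string; encoding: string }): string {
  if (resp.encoding !== 'base64') {
    throw new Error(`Unexpected encoding: ${resp.encoding}`);
  }
  return Buffer.from(resp.content, 'base64').toString('utf-8');
}

// decodeContent({ content: 'aGVsbG8=', encoding: 'base64' }) → 'hello'
```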

Alternative for large repos: use raw content URL:

```
GET https://raw.githubusercontent.com/{owner}/{repo}/{ref}/{path}
```
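Assembling that URL from crawl options is pure string work; a sketch (the function name is illustrative):

```typescript
// Build the raw-content URL for a file at a given ref.
function rawContentUrl(owner: string, repo: string, ref: string, filePath: string): string {
  return `https://raw.githubusercontent.com/${owner}/${repo}/${ref}/${encodeURI(filePath)}`;
}

// rawContentUrl('facebook', 'react', 'main', 'README.md')
// → 'https://raw.githubusercontent.com/facebook/react/main/README.md'
```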
@@ -124,48 +157,47 @@ GET https://raw.githubusercontent.com/{owner}/{repo}/{ref}/{path}

## Filtering Logic

```typescript
function shouldIndexFile(filePath: string, fileSize: number, config?: RepoConfig): boolean {
  const ext = path.extname(filePath).toLowerCase();
  const base = path.basename(filePath);

  // 1. Must have indexable extension
  if (!INDEXABLE_EXTENSIONS.has(ext)) return false;

  // 2. Must not exceed size limit
  if (fileSize > MAX_FILE_SIZE_BYTES) return false;

  // 3. Exclude lockfiles and other non-source artifacts
  if (IGNORED_FILE_NAMES.has(base)) return false;

  // 4. Exclude minified and bundled assets
  if (base.includes('.min.') || base.endsWith('.bundle.js') || base.endsWith('.bundle.css')) {
    return false;
  }

  // 5. Apply config excludeFiles (exact filename match)
  if (config?.excludeFiles?.includes(base)) return false;

  // 6. Exclude common dependency/build/cache directories at any depth
  if (isInIgnoredDirectory(filePath)) return false;

  // 7. Apply config excludeFolders (regex or prefix match)
  if (
    config?.excludeFolders?.some(
      (folder) => filePath.startsWith(folder) || new RegExp(folder).test(filePath)
    )
  ) {
    return false;
  }

  // 8. Apply config folders allowlist (if specified, only index those paths)
  if (config?.folders?.length) {
    const inAllowedFolder = config.folders.some(
      (folder) => filePath.startsWith(folder) || new RegExp(folder).test(filePath)
    );
    if (!inAllowedFolder) return false;
  }

  return true;
}
```

@@ -177,20 +209,20 @@ The shared ignored-directory list is intentionally broader than the original bas

```typescript
class GitHubRateLimiter {
  private remaining = 5000;
  private resetAt = Date.now();

  updateFromHeaders(headers: Headers): void {
    this.remaining = parseInt(headers.get('X-RateLimit-Remaining') ?? '5000');
    this.resetAt = parseInt(headers.get('X-RateLimit-Reset') ?? '0') * 1000;
  }

  async waitIfNeeded(): Promise<void> {
    if (this.remaining <= 10) {
      const waitMs = Math.max(0, this.resetAt - Date.now()) + 1000;
      await sleep(waitMs);
    }
  }
}
```

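Downloads are capped at 10 parallel requests via a semaphore, which the spec does not define; a minimal counting-semaphore sketch under that assumption:

```typescript
// Minimal counting semaphore; release() hands its slot directly to the
// oldest waiter, so the active count never exceeds the limit.
class Semaphore {
  private waiters: (() => void)[] = [];
  private active = 0;

  constructor(private readonly limit: number) {}

  async acquire(): Promise<void> {
    if (this.active < this.limit) {
      this.active++;
      return;
    }
    // Wait; the releasing task transfers its slot, so no increment here.
    await new Promise<void>((resolve) => this.waiters.push(resolve));
  }

  release(): void {
    const next = this.waiters.shift();
    if (next) {
      next(); // slot transferred, active count unchanged
    } else {
      this.active--;
    }
  }
}
```

With `const sem = new Semaphore(10)`, each download would wrap its request in `await sem.acquire(); try { … } finally { sem.release(); }`.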
@@ -200,14 +232,14 @@ Requests are made with a concurrency limit of 10 parallel downloads using a sema

## Error Handling

| Scenario                  | Behavior                                                                    |
| ------------------------- | --------------------------------------------------------------------------- |
| 404 Not Found             | Throw `RepositoryNotFoundError`                                             |
| 401 Unauthorized          | Throw `AuthenticationError` (invalid or missing token)                      |
| 403 Forbidden             | If `X-RateLimit-Remaining: 0`, wait and retry; else throw `PermissionError` |
| 422 Unprocessable         | Tree too large; switch to directory-by-directory traversal                  |
| Network error             | Retry up to 3 times with exponential backoff                                |
| File content decode error | Skip file, log warning                                                      |

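The network-error row ("retry up to 3 times with exponential backoff") can be sketched as a wrapper; the attempt count comes from the table, while the base delay of 1 s is an assumption:

```typescript
// Retry wrapper: up to `attempts` tries, doubling the delay each time.
async function withRetries<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 1000
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < attempts - 1) {
        await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** attempt));
      }
    }
  }
  throw lastError;
}
```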
---

@@ -38,9 +38,9 @@ Reuses `CrawledFile` and `CrawlResult` from TRUEREF-0003 crawler types:

```typescript
export interface LocalCrawlOptions {
  rootPath: string; // absolute path to repository root
  config?: RepoConfig; // parsed trueref.json
  onProgress?: (processed: number, total: number) => void;
}
```

@@ -50,75 +50,73 @@ export interface LocalCrawlOptions {

```typescript
export class LocalCrawler {
  async crawl(options: LocalCrawlOptions): Promise<CrawlResult> {
    // 1. Load root .gitignore if present
    const gitignore = await this.loadGitignore(options.rootPath);

    // 2. Enumerate files recursively, pruning ignored directories early
    const allFiles = await this.walkDirectory(options.rootPath, '', gitignore);

    // 3. Look for trueref.json / context7.json first
    const configFile = allFiles.find((f) => f === 'trueref.json' || f === 'context7.json');
    let config = options.config;
    if (configFile && !config) {
      config = await this.parseConfigFile(path.join(options.rootPath, configFile));
    }

    // 4. Filter files
    const filteredFiles = allFiles.filter((relPath) => {
      const stat = statCache.get(relPath);
      return shouldIndexFile(relPath, stat.size, config);
    });

    // 5. Read and return file contents
    const crawledFiles: CrawledFile[] = [];
    for (const [i, relPath] of filteredFiles.entries()) {
      const absPath = path.join(options.rootPath, relPath);
      const content = await fs.readFile(absPath, 'utf-8');
      const sha = computeSHA256(content);
      crawledFiles.push({
        path: relPath,
        content,
        size: Buffer.byteLength(content, 'utf-8'),
        sha,
        language: detectLanguage(relPath)
      });
      options.onProgress?.(i + 1, filteredFiles.length);
    }

    return {
      files: crawledFiles,
      totalFiles: filteredFiles.length,
      skippedFiles: allFiles.length - filteredFiles.length,
      branch: 'local',
      commitSha: computeSHA256(crawledFiles.map((f) => f.sha).join(''))
    };
  }

  private async walkDirectory(
    dir: string,
    rel = '',
    gitignore?: GitignoreFilter
  ): Promise<string[]> {
    const entries = await fs.readdir(dir, { withFileTypes: true });
    const files: string[] = [];
    for (const entry of entries) {
      if (!entry.isFile() && !entry.isDirectory()) continue; // skip symlinks, devices
      const relPath = rel ? `${rel}/${entry.name}` : entry.name;
      if (entry.isDirectory()) {
        if (shouldPruneDirectory(relPath) || gitignore?.isIgnored(relPath, true)) {
          continue;
        }
        files.push(...(await this.walkDirectory(path.join(dir, entry.name), relPath, gitignore)));
      } else {
        if (gitignore?.isIgnored(relPath, false)) continue;
        files.push(relPath);
      }
    }
    return files;
  }
}
```

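`walkDirectory` consults a `GitignoreFilter` that the spec leaves undefined. A deliberately naive sketch of that interface follows; real `.gitignore` semantics (globs, negation, anchoring) need a proper matcher, so this only illustrates the shape the crawler depends on:

```typescript
// Naive GitignoreFilter sketch: matches only bare file/directory names.
// An assumption for illustration; production code would use a full matcher.
class GitignoreFilter {
  private readonly patterns: string[];

  constructor(gitignoreText: string) {
    this.patterns = gitignoreText
      .split('\n')
      .map((line) => line.trim())
      .filter((line) => line.length > 0 && !line.startsWith('#'))
      .map((line) => line.replace(/\/$/, ''));
  }

  isIgnored(relPath: string, _isDirectory: boolean): boolean {
    return relPath.split('/').some((segment) => this.patterns.includes(segment));
  }
}
```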
@@ -142,7 +140,7 @@ Directory pruning should happen during the walk so large dependency trees are ne

```typescript
import { createHash } from 'crypto';

function computeSHA256(content: string): string {
  return createHash('sha256').update(content, 'utf-8').digest('hex');
}
```

@@ -30,19 +30,19 @@ Implement the document parsing and chunking pipeline that transforms raw file co

## Supported File Types

| Extension                         | Parser Strategy                                         |
| --------------------------------- | ------------------------------------------------------- |
| `.md`, `.mdx`                     | Heading-based section splitting + code block extraction |
| `.txt`, `.rst`                    | Paragraph-based splitting                               |
| `.ts`, `.tsx`, `.js`, `.jsx`      | AST-free: function/class boundary detection via regex   |
| `.py`                             | `def`/`class` boundary detection                        |
| `.go`                             | `func`/`type` boundary detection                        |
| `.rs`                             | `fn`/`impl`/`struct` boundary detection                 |
| `.java`, `.cs`, `.kt`, `.swift`   | Class/method boundary detection                         |
| `.rb`                             | `def`/`class` boundary detection                        |
| `.json`, `.yaml`, `.yml`, `.toml` | Structural chunking (top-level keys)                    |
| `.html`, `.svelte`, `.vue`        | Text content extraction + script block splitting        |
| Other code                        | Line-count-based sliding window (200 lines per chunk)   |

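The table above implies a dispatch from extension to strategy; one way to encode it (the strategy labels are illustrative, not identifiers from the codebase):

```typescript
type ParserStrategy =
  | 'markdown'
  | 'paragraph'
  | 'boundary'
  | 'structural'
  | 'web'
  | 'sliding-window';

// Map a file extension to the parser strategy from the table above.
function parserFor(ext: string): ParserStrategy {
  if (ext === '.md' || ext === '.mdx') return 'markdown';
  if (ext === '.txt' || ext === '.rst') return 'paragraph';
  const boundary = ['.ts', '.tsx', '.js', '.jsx', '.py', '.go', '.rs', '.java', '.cs', '.kt', '.swift', '.rb'];
  if (boundary.includes(ext)) return 'boundary';
  if (['.json', '.yaml', '.yml', '.toml'].includes(ext)) return 'structural';
  if (['.html', '.svelte', '.vue'].includes(ext)) return 'web';
  return 'sliding-window';
}

// parserFor('.mdx') → 'markdown'; parserFor('.css') → 'sliding-window'
```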
---

@@ -52,9 +52,9 @@ Use a simple character-based approximation (no tokenizer library needed for v1):

```typescript
function estimateTokens(text: string): number {
  // Empirically: ~4 chars per token for English prose
  // ~3 chars per token for code (more symbols)
  return Math.ceil(text.length / 3.5);
}
```

@@ -74,49 +74,49 @@ The Markdown parser is the most important parser as most documentation is Markdo

```typescript
interface MarkdownSection {
  headings: string[]; // heading stack at this point
  content: string; // text content (sans code blocks)
  codeBlocks: { language: string; code: string }[];
}

function parseMarkdown(content: string, filePath: string): Snippet[] {
  const sections = splitIntoSections(content);
  const snippets: Snippet[] = [];

  for (const section of sections) {
    const breadcrumb = section.headings.join(' > ');
    const title = section.headings.at(-1) ?? path.basename(filePath);

    // Emit info snippet for text content
    if (section.content.trim().length >= 20) {
      const chunks = chunkText(section.content, MAX_TOKENS, OVERLAP_TOKENS);
      for (const chunk of chunks) {
        snippets.push({
          type: 'info',
          title,
          content: chunk,
          breadcrumb,
          tokenCount: estimateTokens(chunk)
        });
      }
    }

    // Emit code snippets for each code block
    for (const block of section.codeBlocks) {
      if (block.code.trim().length >= 20) {
        snippets.push({
          type: 'code',
          title,
          content: block.code,
          // Fence language tag, else fall back to the file's own extension
          language: block.language || detectLanguage(path.extname(filePath)),
          breadcrumb,
          tokenCount: estimateTokens(block.code)
        });
      }
    }
  }

  return snippets;
}
```

@@ -135,43 +135,41 @@ For non-Markdown code files, use regex-based function/class boundary detection.

```typescript
const BOUNDARY_PATTERNS: Record<string, RegExp> = {
  typescript: /^(export\s+)?(async\s+)?(function|class|interface|type|const|let|var)\s+\w+/m,
  python: /^(async\s+)?(def|class)\s+\w+/m,
  go: /^(func|type|var|const)\s+\w+/m,
  rust: /^(pub\s+)?(fn|impl|struct|enum|trait)\s+\w+/m,
  java: /^(public|private|protected|static).*?(class|interface|enum|void|\w+)\s+\w+\s*[({]/m
};

function parseCodeFile(content: string, filePath: string, language: string): Snippet[] {
  const pattern = BOUNDARY_PATTERNS[language];
  const breadcrumb = filePath;
  const title = path.basename(filePath);

  if (!pattern) {
    // Fallback: sliding window
    return slidingWindowChunks(content, filePath, language);
  }

  const chunks = splitAtBoundaries(content, pattern);
  return chunks
    .filter((chunk) => chunk.trim().length >= 20)
    .flatMap((chunk) => {
      if (estimateTokens(chunk) <= MAX_TOKENS) {
        return [
          {
            type: 'code' as const,
            title,
            content: chunk,
            language,
            breadcrumb,
            tokenCount: estimateTokens(chunk)
          }
        ];
      }
      return slidingWindowChunks(chunk, filePath, language);
    });
}
```

@@ -188,27 +186,23 @@ const MIN_CONTENT_LENGTH = 20; // characters

 ### Sliding Window Chunker

 ```typescript
-function chunkText(
-  text: string,
-  maxTokens: number,
-  overlapTokens: number
-): string[] {
+function chunkText(text: string, maxTokens: number, overlapTokens: number): string[] {
   const words = text.split(/\s+/);
   const wordsPerToken = 0.75; // ~0.75 words per token
   const maxWords = Math.floor(maxTokens * wordsPerToken);
   const overlapWords = Math.floor(overlapTokens * wordsPerToken);

   const chunks: string[] = [];
   let start = 0;

   while (start < words.length) {
     const end = Math.min(start + maxWords, words.length);
     chunks.push(words.slice(start, end).join(' '));
     if (end === words.length) break;
     start = end - overlapWords;
   }

   return chunks;
 }
 ```
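As a quick sanity check on the sliding-window chunker touched above, the post-lint version runs standalone; the function body and the 0.75 words-per-token estimate are copied from the diff, while the sample input and window sizes below are made up for illustration.

```typescript
// Sliding-window chunker as it appears after linting.
function chunkText(text: string, maxTokens: number, overlapTokens: number): string[] {
  const words = text.split(/\s+/);
  const wordsPerToken = 0.75; // ~0.75 words per token
  const maxWords = Math.floor(maxTokens * wordsPerToken);
  const overlapWords = Math.floor(overlapTokens * wordsPerToken);

  const chunks: string[] = [];
  let start = 0;

  while (start < words.length) {
    const end = Math.min(start + maxWords, words.length);
    chunks.push(words.slice(start, end).join(' '));
    if (end === words.length) break;
    start = end - overlapWords;
  }

  return chunks;
}

// 10 words, a 6-word window (maxTokens 8), 1-word overlap (overlapTokens 2):
const chunks = chunkText('a b c d e f g h i j', 8, 2);
console.log(chunks); // [ 'a b c d e f', 'f g h i j' ] — the second chunk re-starts at the overlap word
```

Note that the overlap is expressed in tokens but applied in words after the same 0.75 conversion, so the effective overlap shrinks for small `overlapTokens` values.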
@@ -218,34 +212,42 @@ function chunkText(

 ```typescript
 const LANGUAGE_MAP: Record<string, string> = {
-  '.ts': 'typescript', '.tsx': 'typescript',
-  '.js': 'javascript', '.jsx': 'javascript',
+  '.ts': 'typescript',
+  '.tsx': 'typescript',
+  '.js': 'javascript',
+  '.jsx': 'javascript',
   '.py': 'python',
   '.rb': 'ruby',
   '.go': 'go',
   '.rs': 'rust',
   '.java': 'java',
   '.cs': 'csharp',
-  '.cpp': 'cpp', '.c': 'c', '.h': 'c',
+  '.cpp': 'cpp',
+  '.c': 'c',
+  '.h': 'c',
   '.swift': 'swift',
   '.kt': 'kotlin',
   '.php': 'php',
   '.scala': 'scala',
-  '.sh': 'bash', '.bash': 'bash', '.zsh': 'bash',
-  '.md': 'markdown', '.mdx': 'markdown',
+  '.sh': 'bash',
+  '.bash': 'bash',
+  '.zsh': 'bash',
+  '.md': 'markdown',
+  '.mdx': 'markdown',
   '.json': 'json',
-  '.yaml': 'yaml', '.yml': 'yaml',
+  '.yaml': 'yaml',
+  '.yml': 'yaml',
   '.toml': 'toml',
   '.html': 'html',
   '.css': 'css',
   '.svelte': 'svelte',
   '.vue': 'vue',
-  '.sql': 'sql',
+  '.sql': 'sql'
 };

 function detectLanguage(filePath: string): string {
   const ext = path.extname(filePath).toLowerCase();
   return LANGUAGE_MAP[ext] ?? 'text';
 }
 ```
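The `detectLanguage` helper in that hunk is unchanged by the lint pass; its behavior can be demonstrated standalone. The subset of map entries below is copied from the full table, and `node:path` supplies `extname` as in the original.

```typescript
import path from 'node:path';

// A few entries copied from the full LANGUAGE_MAP above.
const LANGUAGE_MAP: Record<string, string> = {
  '.ts': 'typescript',
  '.tsx': 'typescript',
  '.py': 'python',
  '.md': 'markdown'
};

function detectLanguage(filePath: string): string {
  const ext = path.extname(filePath).toLowerCase();
  return LANGUAGE_MAP[ext] ?? 'text';
}

console.log(detectLanguage('src/routes/+page.TSX')); // "typescript" — extension is lower-cased first
console.log(detectLanguage('LICENSE'));              // "text" — no extension falls through to the default
```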
@@ -255,32 +257,32 @@ function detectLanguage(filePath: string): string {

 ```typescript
 export interface ParseOptions {
   repositoryId: string;
   documentId: string;
   versionId?: string;
 }

-export function parseFile(
-  file: CrawledFile,
-  options: ParseOptions
-): NewSnippet[] {
+export function parseFile(file: CrawledFile, options: ParseOptions): NewSnippet[] {
   const language = detectLanguage(file.path);
-  let rawSnippets: Omit<NewSnippet, 'id' | 'repositoryId' | 'documentId' | 'versionId' | 'createdAt'>[];
+  let rawSnippets: Omit<
+    NewSnippet,
+    'id' | 'repositoryId' | 'documentId' | 'versionId' | 'createdAt'
+  >[];

   if (language === 'markdown') {
     rawSnippets = parseMarkdown(file.content, file.path);
   } else {
     rawSnippets = parseCodeFile(file.content, file.path, language);
   }

-  return rawSnippets.map(s => ({
+  return rawSnippets.map((s) => ({
     ...s,
     id: crypto.randomUUID(),
     repositoryId: options.repositoryId,
     documentId: options.documentId,
     versionId: options.versionId ?? null,
-    createdAt: new Date(),
+    createdAt: new Date()
   }));
 }
 ```
@@ -33,42 +33,37 @@ Implement the full-text search engine using SQLite's built-in FTS5 extension. Th
 // src/lib/server/search/search.service.ts

 export interface SnippetSearchOptions {
   repositoryId: string;
   versionId?: string;
   type?: 'code' | 'info';
   limit?: number; // default: 20
   offset?: number; // default: 0
 }

 export interface SnippetSearchResult {
   snippet: Snippet;
   score: number; // BM25 rank (negative, lower = better)
   repository: Pick<Repository, 'id' | 'title'>;
 }

 export interface LibrarySearchOptions {
   libraryName: string;
   query?: string; // semantic relevance hint
   limit?: number; // default: 10
 }

 export interface LibrarySearchResult {
   repository: Repository;
   versions: RepositoryVersion[];
   score: number; // composite relevance score
 }

 export class SearchService {
   constructor(private db: BetterSQLite3.Database) {}

-  searchSnippets(
-    query: string,
-    options: SnippetSearchOptions
-  ): SnippetSearchResult[]
+  searchSnippets(query: string, options: SnippetSearchOptions): SnippetSearchResult[];

-  searchRepositories(
-    options: LibrarySearchOptions
-  ): LibrarySearchResult[]
+  searchRepositories(options: LibrarySearchOptions): LibrarySearchResult[];
 }
 ```
@@ -101,21 +96,21 @@ The FTS5 MATCH query uses the porter stemmer and unicode61 tokenizer (configured

 ```typescript
 function preprocessQuery(raw: string): string {
   // 1. Trim and normalize whitespace
   let q = raw.trim().replace(/\s+/g, ' ');

   // 2. Escape FTS5 special characters that aren't intended as operators
   // Keep: * (prefix), " " (phrase), AND, OR, NOT
   q = q.replace(/[()]/g, ' ');

   // 3. Add prefix wildcard to last token for "typing as you go" feel
   const tokens = q.split(' ');
   const lastToken = tokens.at(-1) ?? '';
   if (lastToken.length >= 3 && !lastToken.endsWith('*')) {
     tokens[tokens.length - 1] = lastToken + '*';
   }

   return tokens.join(' ');
 }
 ```
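`preprocessQuery` is untouched by this lint pass, so its behavior can be checked directly against the hunk above; the function body is copied verbatim and only the sample queries are invented.

```typescript
// FTS5 query preprocessing, copied from the diff.
function preprocessQuery(raw: string): string {
  // 1. Trim and normalize whitespace
  let q = raw.trim().replace(/\s+/g, ' ');

  // 2. Strip parentheses so they are not parsed as FTS5 grouping operators
  q = q.replace(/[()]/g, ' ');

  // 3. Add prefix wildcard to the last token for "typing as you go" feel
  const tokens = q.split(' ');
  const lastToken = tokens.at(-1) ?? '';
  if (lastToken.length >= 3 && !lastToken.endsWith('*')) {
    tokens[tokens.length - 1] = lastToken + '*';
  }

  return tokens.join(' ');
}

console.log(preprocessQuery('  sqlite   fts')); // "sqlite fts*" — last token gets a prefix wildcard
console.log(preprocessQuery('db OR'));          // "db OR" — tokens shorter than 3 chars are left alone
```

One subtlety worth noting: the paren replacement in step 2 happens after whitespace normalization, so a query like `(react) hooks` can reintroduce double spaces before the split in step 3.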
@@ -174,56 +169,65 @@ searchRepositories(options: LibrarySearchOptions): LibrarySearchResult[] {
 The search results must be formatted for the REST API and MCP tool responses:

 ### Library search response (for `resolve-library-id`):

 ```typescript
 function formatLibraryResults(results: LibrarySearchResult[]): string {
   if (results.length === 0) {
     return 'No libraries found matching your search.';
   }

-  return results.map((r, i) => {
-    const repo = r.repository;
-    const versions = r.versions.map(v => v.tag).join(', ') || 'default branch';
-    return [
-      `${i + 1}. ${repo.title}`,
-      ` Library ID: ${repo.id}`,
-      ` Description: ${repo.description ?? 'No description'}`,
-      ` Snippets: ${repo.totalSnippets} | Trust Score: ${repo.trustScore.toFixed(1)}/10`,
-      ` Available Versions: ${versions}`,
-    ].join('\n');
-  }).join('\n\n');
+  return results
+    .map((r, i) => {
+      const repo = r.repository;
+      const versions = r.versions.map((v) => v.tag).join(', ') || 'default branch';
+      return [
+        `${i + 1}. ${repo.title}`,
+        ` Library ID: ${repo.id}`,
+        ` Description: ${repo.description ?? 'No description'}`,
+        ` Snippets: ${repo.totalSnippets} | Trust Score: ${repo.trustScore.toFixed(1)}/10`,
+        ` Available Versions: ${versions}`
+      ].join('\n');
+    })
+    .join('\n\n');
 }
 ```

 ### Snippet search response (for `query-docs`):

 ```typescript
-function formatSnippetResults(
-  results: SnippetSearchResult[],
-  rules?: string[]
-): string {
+function formatSnippetResults(results: SnippetSearchResult[], rules?: string[]): string {
   const parts: string[] = [];

   // Prepend repository rules if present
   if (rules?.length) {
-    parts.push('## Library Rules\n' + rules.map(r => `- ${r}`).join('\n'));
+    parts.push('## Library Rules\n' + rules.map((r) => `- ${r}`).join('\n'));
   }

   for (const { snippet } of results) {
     if (snippet.type === 'code') {
-      parts.push([
-        snippet.title ? `### ${snippet.title}` : '',
-        snippet.breadcrumb ? `*${snippet.breadcrumb}*` : '',
-        `\`\`\`${snippet.language ?? ''}\n${snippet.content}\n\`\`\``,
-      ].filter(Boolean).join('\n'));
+      parts.push(
+        [
+          snippet.title ? `### ${snippet.title}` : '',
+          snippet.breadcrumb ? `*${snippet.breadcrumb}*` : '',
+          `\`\`\`${snippet.language ?? ''}\n${snippet.content}\n\`\`\``
+        ]
+          .filter(Boolean)
+          .join('\n')
+      );
     } else {
-      parts.push([
-        snippet.title ? `### ${snippet.title}` : '',
-        snippet.breadcrumb ? `*${snippet.breadcrumb}*` : '',
-        snippet.content,
-      ].filter(Boolean).join('\n'));
+      parts.push(
+        [
+          snippet.title ? `### ${snippet.title}` : '',
+          snippet.breadcrumb ? `*${snippet.breadcrumb}*` : '',
+          snippet.content
+        ]
+          .filter(Boolean)
+          .join('\n')
+      );
     }
   }

   return parts.join('\n\n---\n\n');
 }
 ```
@@ -235,26 +239,26 @@ Compute `trustScore` (0–10) when a repository is first indexed:

 ```typescript
 function computeTrustScore(repo: Repository): number {
   let score = 0;

   // Stars (up to 4 points): log scale, 10k stars = 4 pts
   if (repo.stars) {
     score += Math.min(4, Math.log10(repo.stars + 1));
   }

   // Documentation coverage (up to 3 points)
   score += Math.min(3, repo.totalSnippets / 500);

   // Source type (1 point for GitHub, 0 for local)
   if (repo.source === 'github') score += 1;

   // Successful indexing (1 point)
   if (repo.state === 'indexed') score += 1;

   // Has description (1 point)
   if (repo.description) score += 1;

   return Math.min(10, parseFloat(score.toFixed(1)));
 }
 ```
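The trust-score heuristic in the hunk above is also untouched by the commit, and it is easy to exercise with concrete numbers. The `RepoLike` interface below is a made-up minimal subset of the real `Repository` type, assumed only for this demo; the scoring body itself is copied from the diff.

```typescript
// Minimal repository shape for the demo — the real Repository type
// has more fields; this subset is an assumption for illustration.
interface RepoLike {
  stars: number | null;
  totalSnippets: number;
  source: 'github' | 'local';
  state: string;
  description: string | null;
}

function computeTrustScore(repo: RepoLike): number {
  let score = 0;
  if (repo.stars) {
    score += Math.min(4, Math.log10(repo.stars + 1)); // stars, log scale, up to 4 pts
  }
  score += Math.min(3, repo.totalSnippets / 500); // documentation coverage, up to 3 pts
  if (repo.source === 'github') score += 1; // source type
  if (repo.state === 'indexed') score += 1; // successful indexing
  if (repo.description) score += 1; // has description
  return Math.min(10, parseFloat(score.toFixed(1)));
}

const score = computeTrustScore({
  stars: 999, // log10(1000) = 3 pts
  totalSnippets: 1500, // capped at 3 pts
  source: 'github', // +1
  state: 'indexed', // +1
  description: 'A well-documented library' // +1
});
console.log(score); // 9
```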
@@ -34,18 +34,18 @@ Implement a pluggable embedding generation system that produces vector represent
 // src/lib/server/embeddings/provider.ts

 export interface EmbeddingVector {
   values: Float32Array;
   dimensions: number;
   model: string;
 }

 export interface EmbeddingProvider {
   readonly name: string;
   readonly dimensions: number;
   readonly model: string;

   embed(texts: string[]): Promise<EmbeddingVector[]>;
   isAvailable(): Promise<boolean>;
 }
 ```
@@ -55,51 +55,51 @@ export interface EmbeddingProvider {

 ```typescript
 export interface OpenAIProviderConfig {
   baseUrl: string; // e.g. "https://api.openai.com/v1" or "http://localhost:11434/v1"
   apiKey: string;
   model: string; // e.g. "text-embedding-3-small", "nomic-embed-text"
   dimensions?: number; // override for models that support it (e.g. text-embedding-3-small)
   maxBatchSize?: number; // default: 100
 }

 export class OpenAIEmbeddingProvider implements EmbeddingProvider {
   constructor(private config: OpenAIProviderConfig) {}

   async embed(texts: string[]): Promise<EmbeddingVector[]> {
     // Batch into groups of maxBatchSize
     const batches = chunk(texts, this.config.maxBatchSize ?? 100);
     const allEmbeddings: EmbeddingVector[] = [];

     for (const batch of batches) {
       const response = await fetch(`${this.config.baseUrl}/embeddings`, {
         method: 'POST',
         headers: {
-          'Authorization': `Bearer ${this.config.apiKey}`,
-          'Content-Type': 'application/json',
+          Authorization: `Bearer ${this.config.apiKey}`,
+          'Content-Type': 'application/json'
         },
         body: JSON.stringify({
           model: this.config.model,
           input: batch,
-          dimensions: this.config.dimensions,
-        }),
+          dimensions: this.config.dimensions
+        })
       });

       if (!response.ok) {
         throw new EmbeddingError(`API error: ${response.status}`);
       }

       const data = await response.json();
       for (const item of data.data) {
         allEmbeddings.push({
           values: new Float32Array(item.embedding),
           dimensions: item.embedding.length,
-          model: this.config.model,
+          model: this.config.model
         });
       }
     }

     return allEmbeddings;
   }
 }
 ```
@@ -110,41 +110,41 @@ export class OpenAIEmbeddingProvider implements EmbeddingProvider {
 ```typescript
 // Uses @xenova/transformers — only loaded if installed
 export class LocalEmbeddingProvider implements EmbeddingProvider {
   private pipeline: unknown = null;

   readonly name = 'local';
   readonly model = 'Xenova/all-MiniLM-L6-v2'; // 384-dim, fast, small
   readonly dimensions = 384;

   async embed(texts: string[]): Promise<EmbeddingVector[]> {
     if (!this.pipeline) {
       const { pipeline } = await import('@xenova/transformers');
       this.pipeline = await pipeline('feature-extraction', this.model);
     }

     const results: EmbeddingVector[] = [];
     for (const text of texts) {
       const output = await (this.pipeline as Function)(text, {
         pooling: 'mean',
-        normalize: true,
+        normalize: true
       });
       results.push({
         values: new Float32Array(output.data),
         dimensions: this.dimensions,
-        model: this.model,
+        model: this.model
       });
     }
     return results;
   }

   async isAvailable(): Promise<boolean> {
     try {
       await import('@xenova/transformers');
       return true;
     } catch {
       return false;
     }
   }
 }
 ```
@@ -154,53 +154,55 @@ export class LocalEmbeddingProvider implements EmbeddingProvider {

 ```typescript
 export class EmbeddingService {
   constructor(
     private db: BetterSQLite3.Database,
     private provider: EmbeddingProvider
   ) {}

   async embedSnippets(
     snippetIds: string[],
     onProgress?: (done: number, total: number) => void
   ): Promise<void> {
-    const snippets = this.db.prepare(
-      `SELECT id, content, type FROM snippets WHERE id IN (${snippetIds.map(() => '?').join(',')})`
-    ).all(...snippetIds) as Snippet[];
+    const snippets = this.db
+      .prepare(
+        `SELECT id, content, type FROM snippets WHERE id IN (${snippetIds.map(() => '?').join(',')})`
+      )
+      .all(...snippetIds) as Snippet[];

     // Prepare text for embedding: combine title + content
-    const texts = snippets.map(s =>
+    const texts = snippets.map((s) =>
       [s.title, s.breadcrumb, s.content].filter(Boolean).join('\n').slice(0, 2048)
     );

     const BATCH_SIZE = 50;
     const insert = this.db.prepare(`
       INSERT OR REPLACE INTO snippet_embeddings (snippet_id, model, dimensions, embedding, created_at)
       VALUES (?, ?, ?, ?, unixepoch())
     `);

     for (let i = 0; i < snippets.length; i += BATCH_SIZE) {
       const batch = snippets.slice(i, i + BATCH_SIZE);
       const batchTexts = texts.slice(i, i + BATCH_SIZE);

       const embeddings = await this.provider.embed(batchTexts);

       const insertMany = this.db.transaction(() => {
         for (let j = 0; j < batch.length; j++) {
           const snippet = batch[j];
           const embedding = embeddings[j];
           insert.run(
             snippet.id,
             embedding.model,
             embedding.dimensions,
             Buffer.from(embedding.values.buffer)
           );
         }
       });
       insertMany();

       onProgress?.(Math.min(i + BATCH_SIZE, snippets.length), snippets.length);
     }
   }
 }
 ```
@@ -212,13 +214,13 @@ Stored in the `settings` table as JSON:

 ```typescript
 export interface EmbeddingConfig {
   provider: 'openai' | 'local' | 'none';
   openai?: {
     baseUrl: string;
     apiKey: string;
     model: string;
     dimensions?: number;
   };
 }

 // Settings key: 'embedding_config'
@@ -227,14 +229,15 @@ export interface EmbeddingConfig {

 ### API Endpoints

 `GET /api/v1/settings/embedding`

 ```json
 {
   "provider": "openai",
   "openai": {
     "baseUrl": "https://api.openai.com/v1",
     "model": "text-embedding-3-small",
     "dimensions": 1536
   }
 }
 ```
@@ -251,11 +254,7 @@ Embeddings are stored as raw `Float32Array` binary blobs:
 const buffer = Buffer.from(float32Array.buffer);

 // Retrieve
-const float32Array = new Float32Array(
-  buffer.buffer,
-  buffer.byteOffset,
-  buffer.byteLength / 4
-);
+const float32Array = new Float32Array(buffer.buffer, buffer.byteOffset, buffer.byteLength / 4);
 ```

 ---
@@ -102,21 +102,21 @@ async vectorSearch(

 ```typescript
 function reciprocalRankFusion(
   ...rankings: Array<Array<{ id: string; score: number }>>
 ): Array<{ id: string; rrfScore: number }> {
   const K = 60; // RRF constant (standard value)
   const scores = new Map<string, number>();

   for (const ranking of rankings) {
     ranking.forEach(({ id }, rank) => {
       const current = scores.get(id) ?? 0;
       scores.set(id, current + 1 / (K + rank + 1));
     });
   }

   return Array.from(scores.entries())
     .map(([id, rrfScore]) => ({ id, rrfScore }))
     .sort((a, b) => b.rrfScore - a.rrfScore);
 }
 ```
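The reciprocal rank fusion helper above is unchanged by the lint pass, and its ordering behavior is easiest to see on toy rankings. The function body is copied from the diff; the `fts`/`vec` ids below are hypothetical.

```typescript
// Reciprocal rank fusion, copied from the hunk above.
function reciprocalRankFusion(
  ...rankings: Array<Array<{ id: string; score: number }>>
): Array<{ id: string; rrfScore: number }> {
  const K = 60; // RRF constant (standard value)
  const scores = new Map<string, number>();

  for (const ranking of rankings) {
    ranking.forEach(({ id }, rank) => {
      const current = scores.get(id) ?? 0;
      scores.set(id, current + 1 / (K + rank + 1));
    });
  }

  return Array.from(scores.entries())
    .map(([id, rrfScore]) => ({ id, rrfScore }))
    .sort((a, b) => b.rrfScore - a.rrfScore);
}

// Toy rankings: 'x' tops both lists, 'z' appears in both, 'y' in only one.
const fts = [{ id: 'x', score: 0 }, { id: 'y', score: 1 }, { id: 'z', score: 2 }];
const vec = [{ id: 'x', score: 0 }, { id: 'z', score: 1 }];
console.log(reciprocalRankFusion(fts, vec).map((r) => r.id)); // [ 'x', 'z', 'y' ]
```

Appearing in both rankings outweighs a single high placement: 'z' (ranks 3 and 2) beats 'y' (rank 2 in one list), which is exactly why RRF is a good fit for fusing FTS5 and vector results.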
@@ -126,65 +126,62 @@ function reciprocalRankFusion(

 ```typescript
 export interface HybridSearchOptions {
   repositoryId: string;
   versionId?: string;
   type?: 'code' | 'info';
   limit?: number;
   alpha?: number; // 0.0 = FTS5 only, 1.0 = vector only, 0.5 = balanced
 }

 export class HybridSearchService {
   constructor(
     private db: BetterSQLite3.Database,
     private searchService: SearchService,
-    private embeddingProvider: EmbeddingProvider | null,
+    private embeddingProvider: EmbeddingProvider | null
   ) {}

-  async search(
-    query: string,
-    options: HybridSearchOptions
-  ): Promise<SnippetSearchResult[]> {
+  async search(query: string, options: HybridSearchOptions): Promise<SnippetSearchResult[]> {
     const limit = options.limit ?? 20;
     const alpha = options.alpha ?? 0.5;

     // Always run FTS5 search
     const ftsResults = this.searchService.searchSnippets(query, {
       repositoryId: options.repositoryId,
       versionId: options.versionId,
       type: options.type,
-      limit: limit * 3, // get more candidates for fusion
+      limit: limit * 3 // get more candidates for fusion
     });

     // If no embedding provider or alpha = 0, return FTS5 results directly
     if (!this.embeddingProvider || alpha === 0) {
       return ftsResults.slice(0, limit);
     }

     // Embed the query and run vector search
     const [queryEmbedding] = await this.embeddingProvider.embed([query]);
     const vectorResults = await this.vectorSearch(
       queryEmbedding.values,
       options.repositoryId,
       limit * 3
     );

     // Normalize result lists for RRF
     const ftsRanked = ftsResults.map((r, i) => ({
       id: r.snippet.id,
-      score: i,
+      score: i
     }));
     const vecRanked = vectorResults.map((r, i) => ({
       id: r.snippetId,
-      score: i,
+      score: i
     }));

     // Apply RRF
     const fused = reciprocalRankFusion(ftsRanked, vecRanked);

     // Fetch full snippet data for top results
-    const topIds = fused.slice(0, limit).map(r => r.id);
+    const topIds = fused.slice(0, limit).map((r) => r.id);
     return this.fetchSnippetsByIds(topIds, options.repositoryId);
   }
 }
 ```
@@ -197,9 +194,9 @@ The hybrid search alpha value can be set per-request or globally via settings:
 ```typescript
 // Default config stored in settings table under key 'search_config'
 export interface SearchConfig {
   alpha: number; // 0.5 default
   maxResults: number; // 20 default
   enableHybrid: boolean; // true if embedding provider is configured
 }
 ```
@@ -56,75 +56,83 @@ Implement the end-to-end indexing pipeline that orchestrates crawling, parsing,

// src/lib/server/pipeline/job-queue.ts

export class JobQueue {
  private isRunning = false;

  constructor(private db: BetterSQLite3.Database) {}

  enqueue(repositoryId: string, versionId?: string): IndexingJob {
    const job: NewIndexingJob = {
      id: crypto.randomUUID(),
      repositoryId,
      versionId: versionId ?? null,
      status: 'queued',
      progress: 0,
      totalFiles: 0,
      processedFiles: 0,
      error: null,
      startedAt: null,
      completedAt: null,
      createdAt: new Date()
    };

    this.db
      .prepare(
        `
      INSERT INTO indexing_jobs VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
    `
      )
      .run(...Object.values(job));

    // Kick off processing if not already running
    if (!this.isRunning) {
      setImmediate(() => this.processNext());
    }

    return job;
  }

  private async processNext(): Promise<void> {
    if (this.isRunning) return;

    const job = this.db
      .prepare(
        `
      SELECT * FROM indexing_jobs
      WHERE status = 'queued'
      ORDER BY created_at ASC
      LIMIT 1
    `
      )
      .get() as IndexingJob | undefined;

    if (!job) return;

    this.isRunning = true;
    try {
      await this.pipeline.run(job);
    } finally {
      this.isRunning = false;
      // Check for next queued job
      const nextJob = this.db
        .prepare(`SELECT id FROM indexing_jobs WHERE status = 'queued' LIMIT 1`)
        .get();
      if (nextJob) setImmediate(() => this.processNext());
    }
  }

  getJob(id: string): IndexingJob | null {
    return this.db
      .prepare(`SELECT * FROM indexing_jobs WHERE id = ?`)
      .get(id) as IndexingJob | null;
  }

  listJobs(repositoryId?: string, limit = 20): IndexingJob[] {
    const query = repositoryId
      ? `SELECT * FROM indexing_jobs WHERE repository_id = ? ORDER BY created_at DESC LIMIT ?`
      : `SELECT * FROM indexing_jobs ORDER BY created_at DESC LIMIT ?`;
    const params = repositoryId ? [repositoryId, limit] : [limit];
    return this.db.prepare(query).all(...params) as IndexingJob[];
  }
}
```

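The `isRunning` guard plus the re-check in `finally` is what makes this a strict one-job-at-a-time FIFO queue. A self-contained toy version of that pattern, synchronous for clarity (illustrative only, no database or async work):

```typescript
// enqueue() only starts the drain when no job is running; each finished job
// re-checks the queue, so jobs always run one at a time, in FIFO order.
class TinyQueue {
  private isRunning = false;
  private pending: string[] = [];
  readonly order: string[] = [];

  enqueue(name: string): void {
    this.pending.push(name);
    if (!this.isRunning) this.processNext();
  }

  private processNext(): void {
    if (this.isRunning) return;
    const job = this.pending.shift();
    if (!job) return;
    this.isRunning = true;
    try {
      this.order.push(job); // "run" the job
    } finally {
      this.isRunning = false;
      if (this.pending.length > 0) this.processNext();
    }
  }
}
```

The real queue replaces the `order.push` line with `await this.pipeline.run(job)` and defers the drain with `setImmediate` so `enqueue` returns promptly.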
@@ -136,94 +144,96 @@ export class JobQueue {

// src/lib/server/pipeline/indexing.pipeline.ts

export class IndexingPipeline {
  constructor(
    private db: BetterSQLite3.Database,
    private githubCrawler: GitHubCrawler,
    private localCrawler: LocalCrawler,
    private embeddingService: EmbeddingService | null
  ) {}

  async run(job: IndexingJob): Promise<void> {
    this.updateJob(job.id, { status: 'running', startedAt: new Date() });

    try {
      const repo = this.getRepository(job.repositoryId);
      if (!repo) throw new Error(`Repository ${job.repositoryId} not found`);

      // Update repo state
      this.updateRepo(repo.id, { state: 'indexing' });

      // Step 1: Crawl
      const crawlResult = await this.crawl(repo, job);

      // Step 2: Parse and diff
      const { newSnippets, changedDocIds, newDocuments } = await this.parseAndDiff(
        crawlResult,
        repo,
        job
      );

      // Step 3: Atomic replacement
      this.replaceSnippets(repo.id, changedDocIds, newDocuments, newSnippets);

      // Step 4: Embeddings (async, non-blocking for job completion)
      if (this.embeddingService && newSnippets.length > 0) {
        await this.embeddingService.embedSnippets(
          newSnippets.map((s) => s.id),
          (done, total) => {
            // Update job progress for embedding phase
          }
        );
      }

      // Step 5: Update repo stats
      const stats = this.computeStats(repo.id);
      this.updateRepo(repo.id, {
        state: 'indexed',
        totalSnippets: stats.totalSnippets,
        totalTokens: stats.totalTokens,
        trustScore: computeTrustScore({ ...repo, ...stats }),
        lastIndexedAt: new Date()
      });

      this.updateJob(job.id, {
        status: 'done',
        progress: 100,
        completedAt: new Date()
      });
    } catch (error) {
      this.updateJob(job.id, {
        status: 'failed',
        error: (error as Error).message,
        completedAt: new Date()
      });
      this.updateRepo(job.repositoryId, { state: 'error' });
      throw error;
    }
  }

  private replaceSnippets(
    repositoryId: string,
    changedDocIds: string[],
    newDocuments: NewDocument[],
    newSnippets: NewSnippet[]
  ): void {
    // Single transaction: delete old → insert new
    this.db.transaction(() => {
      if (changedDocIds.length > 0) {
        // Cascade deletes snippets via FK constraint
        this.db
          .prepare(`DELETE FROM documents WHERE id IN (${changedDocIds.map(() => '?').join(',')})`)
          .run(...changedDocIds);
      }

      for (const doc of newDocuments) {
        this.insertDocument(doc);
      }

      for (const snippet of newSnippets) {
        this.insertSnippet(snippet);
      }
    })();
  }
}
```

@@ -233,26 +243,24 @@ export class IndexingPipeline {

```typescript
function calculateProgress(
  processedFiles: number,
  totalFiles: number,
  embeddingsDone: number,
  embeddingsTotal: number,
  hasEmbeddings: boolean
): number {
  if (totalFiles === 0) return 0;

  if (!hasEmbeddings) {
    // Crawl + parse = 100%
    return Math.round((processedFiles / totalFiles) * 100);
  }

  // Crawl+parse = 80%, embeddings = 20%
  const parseProgress = (processedFiles / totalFiles) * 80;
  const embedProgress = embeddingsTotal > 0 ? (embeddingsDone / embeddingsTotal) * 20 : 0;

  return Math.round(parseProgress + embedProgress);
}
```

@@ -263,20 +271,21 @@ function calculateProgress(

### `GET /api/v1/jobs/:id`

Response `200`:

```json
{
  "job": {
    "id": "uuid",
    "repositoryId": "/facebook/react",
    "status": "running",
    "progress": 47,
    "totalFiles": 342,
    "processedFiles": 162,
    "error": null,
    "startedAt": "2026-03-22T10:00:00Z",
    "completedAt": null,
    "createdAt": "2026-03-22T09:59:55Z"
  }
}
```

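A client consuming this endpoint typically polls until the job reaches a terminal state. A hedged sketch of such a client; the `waitForJob` helper, `baseUrl` parameter, and polling interval are illustrative assumptions, not part of TrueRef's API:

```typescript
interface JobStatus {
  status: string;
  progress: number;
}

// A job is settled once it reports 'done' or 'failed'.
function isTerminal(status: string): boolean {
  return status === 'done' || status === 'failed';
}

// Poll GET /api/v1/jobs/:id until the job settles (illustrative client code).
async function waitForJob(baseUrl: string, jobId: string, intervalMs = 1000): Promise<JobStatus> {
  for (;;) {
    const res = await fetch(`${baseUrl}/api/v1/jobs/${jobId}`);
    if (!res.ok) throw new Error(`Job lookup failed: ${res.status}`);
    const { job } = (await res.json()) as { job: JobStatus };
    if (isTerminal(job.status)) return job;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```
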
@@ -285,6 +294,7 @@ Response `200`:

Query params: `repositoryId` (optional), `status` (optional), `limit` (default 20).

Response `200`:

```json
{
  "jobs": [...],
@@ -300,20 +310,24 @@ On application start, mark any jobs in `running` state as `failed` (they were in

```typescript
function recoverStaleJobs(db: BetterSQLite3.Database): void {
  db.prepare(
    `
    UPDATE indexing_jobs
    SET status = 'failed',
        error = 'Server restarted while job was running',
        completed_at = unixepoch()
    WHERE status = 'running'
  `
  ).run();

  // Also reset any repositories stuck in 'indexing' state
  db.prepare(
    `
    UPDATE repositories
    SET state = 'error'
    WHERE state = 'indexing'
  `
  ).run();
}
```

|||||||
@@ -32,33 +32,33 @@ Implement the public-facing REST API endpoints that replicate context7's `/api/v
|
|||||||
|
|
||||||
### Query Parameters
|
### Query Parameters
|
||||||
|
|
||||||
| Parameter | Type | Required | Description |
|
| Parameter | Type | Required | Description |
|
||||||
|-----------|------|----------|-------------|
|
| ------------- | ------- | -------- | ------------------------------------- |
|
||||||
| `libraryName` | string | Yes | Library name to search for |
|
| `libraryName` | string | Yes | Library name to search for |
|
||||||
| `query` | string | No | User's question for relevance ranking |
|
| `query` | string | No | User's question for relevance ranking |
|
||||||
| `limit` | integer | No | Max results (default: 10, max: 50) |
|
| `limit` | integer | No | Max results (default: 10, max: 50) |
|
||||||
|
|
||||||
### Response `200` (`type=json`, default):
|
### Response `200` (`type=json`, default):
|
||||||
|
|
||||||
```json
|
```json
|
||||||
{
|
{
|
||||||
"results": [
|
"results": [
|
||||||
{
|
{
|
||||||
"id": "/facebook/react",
|
"id": "/facebook/react",
|
||||||
"title": "React",
|
"title": "React",
|
||||||
"description": "A JavaScript library for building user interfaces",
|
"description": "A JavaScript library for building user interfaces",
|
||||||
"branch": "main",
|
"branch": "main",
|
||||||
"lastUpdateDate": "2026-03-22T10:00:00Z",
|
"lastUpdateDate": "2026-03-22T10:00:00Z",
|
||||||
"state": "finalized",
|
"state": "finalized",
|
||||||
"totalTokens": 142000,
|
"totalTokens": 142000,
|
||||||
"totalSnippets": 1247,
|
"totalSnippets": 1247,
|
||||||
"stars": 228000,
|
"stars": 228000,
|
||||||
"trustScore": 9.2,
|
"trustScore": 9.2,
|
||||||
"benchmarkScore": 87,
|
"benchmarkScore": 87,
|
||||||
"versions": ["v18.3.0", "v17.0.2"],
|
"versions": ["v18.3.0", "v17.0.2"],
|
||||||
"source": "https://github.com/facebook/react"
|
"source": "https://github.com/facebook/react"
|
||||||
}
|
}
|
||||||
]
|
]
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
@@ -67,11 +67,11 @@ Note: `state: "finalized"` maps from TrueRef's `state: "indexed"` for compatibil

### State Mapping

| TrueRef state | context7 state |
| ------------- | -------------- |
| `pending`     | `initial`      |
| `indexing`    | `initial`      |
| `indexed`     | `finalized`    |
| `error`       | `error`        |

---

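The mapping is small enough to express directly in code. A sketch of what such a helper could look like; the function name and type aliases are hypothetical, not necessarily what the codebase uses:

```typescript
type TrueRefState = 'pending' | 'indexing' | 'indexed' | 'error';
type Context7State = 'initial' | 'finalized' | 'error';

// Collapse TrueRef's four states onto context7's three, per the table above.
function toContext7State(state: TrueRefState): Context7State {
  switch (state) {
    case 'pending':
    case 'indexing':
      return 'initial';
    case 'indexed':
      return 'finalized';
    case 'error':
      return 'error';
  }
}
```
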
@@ -81,43 +81,43 @@ Note: `state: "finalized"` maps from TrueRef's `state: "indexed"` for compatibil

### Query Parameters

| Parameter   | Type    | Required | Description                                                     |
| ----------- | ------- | -------- | --------------------------------------------------------------- |
| `libraryId` | string  | Yes      | Library ID, e.g. `/facebook/react` or `/facebook/react/v18.3.0` |
| `query`     | string  | Yes      | Specific question about the library                             |
| `type`      | string  | No       | `json` (default) or `txt` (plain text for LLM injection)        |
| `tokens`    | integer | No       | Approximate max token count for response (default: 10000)       |

### Response `200` (`type=json`):

```json
{
  "snippets": [
    {
      "type": "code",
      "title": "Basic Component",
      "description": "Getting Started > Components",
      "language": "tsx",
      "codeList": [
        {
          "language": "tsx",
          "code": "function MyComponent() {\n  return <div>Hello</div>;\n}"
        }
      ],
      "id": "uuid",
      "tokenCount": 45,
      "pageTitle": "Getting Started"
    },
    {
      "type": "info",
      "text": "React components let you split the UI into independent...",
      "breadcrumb": "Core Concepts > Components",
      "pageId": "uuid",
      "tokenCount": 120
    }
  ],
  "rules": ["Always use functional components", "..."],
  "totalTokens": 2840
}
```

@@ -125,7 +125,7 @@ Note: `state: "finalized"` maps from TrueRef's `state: "indexed"` for compatibil

Plain text formatted for direct LLM context injection:

````
## Library Rules
- Always use functional components
- Use hooks for state management
@@ -139,15 +139,17 @@ Plain text formatted for direct LLM context injection:

function MyComponent() {
  return <div>Hello</div>;
}
```

---

### React components let you split the UI...

_Core Concepts > Components_

React components let you split the UI into independent, reusable pieces...
````

---

@@ -167,7 +169,7 @@ function parseLibraryId(libraryId: string): {

    version: match[3],
  };
}
````

---

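Only the tail of `parseLibraryId` appears in this hunk; the regex itself sits outside the excerpt. A hedged reconstruction of what such a parser might look like — the pattern and the `parseLibraryIdSketch` name below are assumptions for illustration, not the project's code:

```typescript
// Accepts /org/repo or /org/repo/version; returns null when the ID doesn't match.
function parseLibraryIdSketch(
  libraryId: string
): { org: string; repo: string; version?: string } | null {
  const match = libraryId.match(/^\/([^/]+)\/([^/]+)(?:\/([^/]+))?$/);
  if (!match) return null;
  return { org: match[1], repo: match[2], version: match[3] };
}
```
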
@@ -176,20 +178,17 @@ function parseLibraryId(libraryId: string): {

The `tokens` parameter limits the total response size. Snippets are added greedily until the budget is exhausted:

```typescript
function selectSnippetsWithinBudget(snippets: Snippet[], maxTokens: number): Snippet[] {
  const selected: Snippet[] = [];
  let usedTokens = 0;

  for (const snippet of snippets) {
    if (usedTokens + (snippet.tokenCount ?? 0) > maxTokens) break;
    selected.push(snippet);
    usedTokens += snippet.tokenCount ?? 0;
  }

  return selected;
}
```

@@ -215,6 +214,7 @@ Default token budget: 10,000 tokens (~7,500 words) — enough for ~20 medium sni

## CORS Configuration

All API routes include:

```
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: GET, POST, PATCH, DELETE, OPTIONS
@@ -32,8 +32,8 @@ Implement a Model Context Protocol (MCP) server that exposes `resolve-library-id

```json
{
  "@modelcontextprotocol/sdk": "^1.25.1",
  "zod": "^4.3.4"
}
```

@@ -46,189 +46,190 @@ Implement a Model Context Protocol (MCP) server that exposes `resolve-library-id

import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import { CallToolRequestSchema, ListToolsRequestSchema } from '@modelcontextprotocol/sdk/types.js';
import { z } from 'zod';

const API_BASE = process.env.TRUEREF_API_URL ?? 'http://localhost:5173';

const server = new Server(
  {
    name: 'io.github.trueref/trueref',
    version: '1.0.0'
  },
  {
    capabilities: { tools: {} }
  }
);

// Tool schemas — identical to context7 for drop-in compatibility
const ResolveLibraryIdSchema = z.object({
  libraryName: z
    .string()
    .describe('Library name to search for and resolve to a TrueRef library ID'),
  query: z.string().describe("The user's question or context to help rank results")
});

const QueryDocsSchema = z.object({
  libraryId: z
    .string()
    .describe('The TrueRef library ID obtained from resolve-library-id, e.g. /facebook/react'),
  query: z
    .string()
    .describe('Specific question about the library to retrieve relevant documentation'),
  tokens: z.number().optional().describe('Maximum token budget for the response (default: 10000)')
});

server.setRequestHandler(ListToolsRequestSchema, async () => ({
  tools: [
    {
      name: 'resolve-library-id',
      description: [
        'Searches TrueRef to find a library matching the given name.',
        'Returns a list of matching libraries with their IDs.',
        'ALWAYS call this tool before query-docs to get the correct library ID.',
        'Call at most 3 times per user question.'
      ].join(' '),
      inputSchema: {
        type: 'object',
        properties: {
          libraryName: {
            type: 'string',
            description: 'Library name to search for'
          },
          query: {
            type: 'string',
            description: "User's question for relevance ranking"
          }
        },
        required: ['libraryName', 'query']
      }
    },
    {
      name: 'query-docs',
      description: [
        'Fetches documentation and code examples from TrueRef for a specific library.',
        'Requires a library ID obtained from resolve-library-id.',
        'Returns relevant snippets formatted for LLM consumption.',
        'Call at most 3 times per user question.'
      ].join(' '),
      inputSchema: {
        type: 'object',
        properties: {
          libraryId: {
            type: 'string',
            description: 'TrueRef library ID, e.g. /facebook/react'
          },
          query: {
            type: 'string',
            description: 'Specific question about the library'
          },
          tokens: {
            type: 'number',
            description: 'Max token budget (default: 10000)'
          }
        },
        required: ['libraryId', 'query']
      }
    }
  ]
}));

server.setRequestHandler(CallToolRequestSchema, async (request) => {
  const { name, arguments: args } = request.params;

  if (name === 'resolve-library-id') {
    const { libraryName, query } = ResolveLibraryIdSchema.parse(args);

    const url = new URL(`${API_BASE}/api/v1/libs/search`);
    url.searchParams.set('libraryName', libraryName);
    url.searchParams.set('query', query);
    url.searchParams.set('type', 'txt');

    const response = await fetch(url.toString());
    if (!response.ok) {
      return {
        content: [
          {
            type: 'text',
            text: `Error searching libraries: ${response.status} ${response.statusText}`
          }
        ],
        isError: true
      };
    }

    const text = await response.text();
    return {
      content: [{ type: 'text', text }]
    };
  }

  if (name === 'query-docs') {
    const { libraryId, query, tokens } = QueryDocsSchema.parse(args);

    const url = new URL(`${API_BASE}/api/v1/context`);
    url.searchParams.set('libraryId', libraryId);
    url.searchParams.set('query', query);
    url.searchParams.set('type', 'txt');
    if (tokens) url.searchParams.set('tokens', String(tokens));

    const response = await fetch(url.toString());
    if (!response.ok) {
      const status = response.status;
      if (status === 404) {
        return {
          content: [
            {
              type: 'text',
              text: `Library "${libraryId}" not found. Please run resolve-library-id first.`
            }
          ],
          isError: true
        };
      }
      if (status === 503) {
        return {
          content: [
            {
              type: 'text',
              text: `Library "${libraryId}" is currently being indexed. Please try again in a moment.`
            }
          ],
          isError: true
        };
      }
      return {
        content: [
          {
            type: 'text',
            text: `Error fetching documentation: ${response.status} ${response.statusText}`
          }
        ],
        isError: true
      };
    }

    const text = await response.text();
    return {
      content: [{ type: 'text', text }]
    };
  }

  return {
    content: [{ type: 'text', text: `Unknown tool: ${name}` }],
    isError: true
  };
});

async function main() {
  const transport = new StdioServerTransport();
  await server.connect(transport);
  // Server runs until process exits
}

main().catch((err) => {
  process.stderr.write(`MCP server error: ${err.message}\n`);
  process.exit(1);
});
```

@@ -238,18 +239,19 @@ main().catch((err) => {

```json
{
  "scripts": {
    "mcp:start": "node --experimental-vm-modules dist/mcp/index.js"
  }
}
```

Or with `tsx` for TypeScript-direct execution:

```json
{
  "scripts": {
    "mcp:start": "tsx src/mcp/index.ts"
  }
}
```

@@ -261,30 +263,31 @@ Users add to `.mcp.json`:

```json
{
  "mcpServers": {
    "trueref": {
      "command": "node",
      "args": ["/path/to/trueref/dist/mcp/index.js"],
      "env": {
        "TRUEREF_API_URL": "http://localhost:5173"
      }
    }
  }
}
```

Or with `tsx` for development:

```json
{
  "mcpServers": {
    "trueref": {
      "command": "npx",
      "args": ["tsx", "/path/to/trueref/src/mcp/index.ts"],
      "env": {
        "TRUEREF_API_URL": "http://localhost:5173"
      }
    }
  }
}
```

@@ -295,13 +298,15 @@ Or with tsx for development:

The MCP server can expose an optional `resources` list item, or the library responses themselves can prepend rules. Additionally, users should add a Claude rule file:

```markdown
<!-- .claude/rules/trueref.md -->
---
description: Use TrueRef to retrieve documentation for indexed libraries
alwaysApply: true
---

When answering questions about indexed libraries, always use the TrueRef MCP tools:

1. Call `resolve-library-id` with the library name and the user's question to get the library ID
2. Call `query-docs` with the library ID and question to retrieve relevant documentation
3. Use the returned documentation to answer the question accurately
```

@@ -50,64 +50,64 @@ import { StreamableHTTPServerTransport } from '@modelcontextprotocol/sdk/server/

import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';

const { values: args } = parseArgs({
  options: {
    transport: { type: 'string', default: 'stdio' },
    port: { type: 'string', default: process.env.PORT ?? '3001' }
  }
});

async function startHttp(server: Server, port: number): Promise<void> {
  const httpServer = createServer(async (req, res) => {
    const url = new URL(req.url!, `http://localhost:${port}`);

    // Health check
    if (url.pathname === '/ping') {
      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ ok: true }));
      return;
    }

    // MCP endpoint
    if (url.pathname === '/mcp') {
      // CORS preflight
      res.setHeader('Access-Control-Allow-Origin', '*');
      res.setHeader('Access-Control-Allow-Methods', 'POST, GET, OPTIONS');
      res.setHeader('Access-Control-Allow-Headers', 'Content-Type, Accept');

      if (req.method === 'OPTIONS') {
        res.writeHead(204);
        res.end();
        return;
      }

      const transport = new StreamableHTTPServerTransport({
        sessionIdGenerator: () => crypto.randomUUID()
      });

      await server.connect(transport);
      await transport.handleRequest(req, res);
      return;
    }

    res.writeHead(404);
    res.end('Not Found');
  });

  httpServer.listen(port, () => {
    process.stderr.write(`TrueRef MCP server listening on http://localhost:${port}/mcp\n`);
  });
}

async function main() {
  const mcpServer = createMcpServer(); // shared server creation

  if (args.transport === 'http') {
    const port = parseInt(args.port!, 10);
    await startHttp(mcpServer, port);
  } else {
    const transport = new StdioServerTransport();
    await mcpServer.connect(transport);
  }
}
```

@@ -117,10 +117,10 @@ async function main() {

```json
{
  "scripts": {
    "mcp:start": "tsx src/mcp/index.ts",
    "mcp:http": "tsx src/mcp/index.ts --transport http --port 3001"
  }
}
```

@@ -132,12 +132,12 @@ For HTTP transport, users configure Claude Code with the remote URL:

```json
{
  "mcpServers": {
    "trueref": {
      "type": "http",
      "url": "http://localhost:3001/mcp"
    }
  }
}
```

@@ -32,53 +32,53 @@ Support `trueref.json` configuration files placed in the root of a repository. T

// src/lib/server/config/trueref-config.schema.ts

export interface TrueRefConfig {
  /**
   * Override the display name for this library.
   * 1–100 characters.
   */
  projectTitle?: string;

  /**
   * Description of the library for search ranking.
   * 10–500 characters.
   */
  description?: string;

  /**
   * Folders to include in indexing (allowlist).
   * Each entry is a path prefix or regex string.
   * If empty/absent, all folders are included.
   * Examples: ["src/", "docs/", "^packages/core"]
   */
  folders?: string[];

  /**
   * Folders to exclude from indexing.
   * Applied after `folders` allowlist.
   * Examples: ["test/", "fixtures/", "__mocks__"]
   */
  excludeFolders?: string[];

  /**
   * Exact filenames to exclude (no path, no regex).
   * Examples: ["README.md", "CHANGELOG.md", "jest.config.ts"]
   */
  excludeFiles?: string[];

  /**
   * Best practices / rules to inject at the top of every query-docs response.
   * Each rule: 5–500 characters.
   * Maximum 20 rules.
   */
  rules?: string[];

  /**
   * Previously released versions to make available for versioned queries.
   */
  previousVersions?: Array<{
    tag: string; // git tag (e.g. "v1.2.3")
    title: string; // human-readable (e.g. "Version 1.2.3")
  }>;
}
```

@@ -88,14 +88,14 @@ export interface TrueRefConfig {

```typescript
const CONFIG_CONSTRAINTS = {
  projectTitle: { minLength: 1, maxLength: 100 },
  description: { minLength: 10, maxLength: 500 },
  folders: { maxItems: 50, maxLength: 200 }, // per entry
  excludeFolders: { maxItems: 50, maxLength: 200 },
  excludeFiles: { maxItems: 100, maxLength: 200 },
  rules: { maxItems: 20, minLength: 5, maxLength: 500 },
  previousVersions: { maxItems: 50 },
  versionTag: { pattern: /^v?\d+\.\d+(\.\d+)?(-.*)?$/ }
};
```

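As a sanity check, the `versionTag` pattern can be exercised directly. `isValidVersionTag` below is a hypothetical helper written for illustration, not part of the codebase:

```typescript
// Mirrors CONFIG_CONSTRAINTS.versionTag: optional "v", major.minor,
// optional patch, optional "-" pre-release suffix.
const VERSION_TAG = /^v?\d+\.\d+(\.\d+)?(-.*)?$/;

function isValidVersionTag(tag: string): boolean {
  return VERSION_TAG.test(tag);
}
```

This accepts tags like `v1.2.3`, `1.2`, and `v2.0.0-beta.1`, while rejecting free-form tags such as `release-1`.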
@@ -107,94 +107,96 @@ const CONFIG_CONSTRAINTS = {

// src/lib/server/config/config-parser.ts

export interface ParsedConfig {
  config: TrueRefConfig;
  source: 'trueref.json' | 'context7.json';
  warnings: string[];
}

export function parseConfigFile(content: string, filename: string): ParsedConfig {
  let raw: unknown;

  try {
    raw = JSON.parse(content);
  } catch (e) {
    throw new ConfigParseError(`${filename} is not valid JSON: ${(e as Error).message}`);
  }

  if (typeof raw !== 'object' || raw === null) {
    throw new ConfigParseError(`${filename} must be a JSON object`);
  }

  const config = raw as Record<string, unknown>;
  const validated: TrueRefConfig = {};
  const warnings: string[] = [];

  // projectTitle
  if (config.projectTitle !== undefined) {
    if (typeof config.projectTitle !== 'string') {
      warnings.push('projectTitle must be a string, ignoring');
    } else if (config.projectTitle.length > 100) {
      validated.projectTitle = config.projectTitle.slice(0, 100);
      warnings.push('projectTitle truncated to 100 characters');
    } else {
      validated.projectTitle = config.projectTitle;
    }
  }

  // description
  if (config.description !== undefined) {
    if (typeof config.description === 'string') {
      validated.description = config.description.slice(0, 500);
    }
  }

  // folders / excludeFolders / excludeFiles — validated as string arrays
  for (const field of ['folders', 'excludeFolders', 'excludeFiles'] as const) {
    if (config[field] !== undefined) {
      if (!Array.isArray(config[field])) {
        warnings.push(`${field} must be an array, ignoring`);
      } else {
        validated[field] = (config[field] as unknown[])
          .filter((item): item is string => {
            if (typeof item !== 'string') {
              warnings.push(`${field} entry must be a string, skipping: ${item}`);
              return false;
            }
            return true;
          })
          .slice(0, field === 'excludeFiles' ? 100 : 50);
      }
    }
  }

  // rules
  if (config.rules !== undefined) {
    if (Array.isArray(config.rules)) {
      validated.rules = (config.rules as unknown[])
        .filter((r): r is string => typeof r === 'string' && r.length >= 5)
        .map((r) => r.slice(0, 500))
        .slice(0, 20);
    }
  }

  // previousVersions
  if (config.previousVersions !== undefined) {
    if (Array.isArray(config.previousVersions)) {
      validated.previousVersions = (config.previousVersions as unknown[])
        .filter(
          (v): v is { tag: string; title: string } =>
            typeof v === 'object' &&
            v !== null &&
            typeof (v as Record<string, unknown>).tag === 'string' &&
            typeof (v as Record<string, unknown>).title === 'string'
        )
        .slice(0, 50);
    }
  }

  return {
    config: validated,
    source: filename.startsWith('trueref') ? 'trueref.json' : 'context7.json',
    warnings
  };
}
```

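The lenient-with-warnings validation style can be seen in isolation. `validateProjectTitle` below is a hypothetical extraction of the `projectTitle` branch, written only to illustrate the truncate-or-warn behavior:

```typescript
// Non-strings are dropped with a warning; overlong titles are
// truncated to 100 characters with a warning; valid titles pass through.
function validateProjectTitle(value: unknown, warnings: string[]): string | undefined {
  if (typeof value !== 'string') {
    warnings.push('projectTitle must be a string, ignoring');
    return undefined;
  }
  if (value.length > 100) {
    warnings.push('projectTitle truncated to 100 characters');
    return value.slice(0, 100);
  }
  return value;
}
```

The design choice throughout the parser is the same: only malformed JSON is fatal; every field-level problem degrades to a warning so a bad config never blocks indexing.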
@@ -219,21 +221,15 @@ When `query-docs` returns results, `rules` from `repository_configs` are prepend

```typescript
// In formatters.ts
function buildContextResponse(snippets: Snippet[], config: RepositoryConfig | null): string {
  const parts: string[] = [];

  if (config?.rules?.length) {
    parts.push('## Library Best Practices\n' + config.rules.map((r) => `- ${r}`).join('\n'));
  }

  // ... append snippet content
  return parts.join('\n\n---\n\n');
}
```

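The prepending behavior can be illustrated standalone. `prependRules` below is a hypothetical reduction of `buildContextResponse` to just the rules-header step, with the snippet content collapsed to a single string:

```typescript
// Prepend a "Library Best Practices" header built from config rules,
// using the same "\n\n---\n\n" part separator as the full formatter.
function prependRules(body: string, rules: string[]): string {
  if (rules.length === 0) return body;
  const header = '## Library Best Practices\n' + rules.map((r) => `- ${r}`).join('\n');
  return [header, body].join('\n\n---\n\n');
}
```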
@@ -45,34 +45,37 @@ Examples:

### `GET /api/v1/libs/:id/versions`

Response `200`:

```json
{
  "versions": [
    {
      "id": "/facebook/react/v18.3.0",
      "repositoryId": "/facebook/react",
      "tag": "v18.3.0",
      "title": "React v18.3.0",
      "state": "indexed",
      "totalSnippets": 892,
      "indexedAt": "2026-03-22T10:00:00Z"
    }
  ]
}
```

### `POST /api/v1/libs/:id/versions`

Request body:

```json
{
  "tag": "v18.3.0",
  "title": "React v18.3.0",
  "autoIndex": true
}
```

Response `201`:

```json
{
  "version": { ...RepositoryVersion },
```

@@ -96,23 +99,22 @@ Response `202` with job details.

```typescript
async function listGitHubTags(
  owner: string,
  repo: string,
  token?: string
): Promise<Array<{ name: string; commit: { sha: string } }>> {
  const headers: Record<string, string> = {
    Accept: 'application/vnd.github.v3+json',
    'User-Agent': 'TrueRef/1.0'
  };
  if (token) headers['Authorization'] = `Bearer ${token}`;

  const response = await fetch(`https://api.github.com/repos/${owner}/${repo}/tags?per_page=100`, {
    headers
  });

  if (!response.ok) throw new GitHubApiError(response.status);
  return response.json();
}
```

@@ -124,28 +126,26 @@ In the search/context endpoints, the `libraryId` is parsed to extract the option

```typescript
function resolveSearchTarget(libraryId: string): {
  repositoryId: string;
  versionId?: string;
} {
  const { repositoryId, version } = parseLibraryId(libraryId);

  if (!version) {
    // Query default branch: versionId = NULL
    return { repositoryId };
  }

  // Look up versionId from tag
  const versionRecord = db
    .prepare(`SELECT id FROM repository_versions WHERE repository_id = ? AND tag = ?`)
    .get(repositoryId, version) as { id: string } | undefined;

  if (!versionRecord) {
    throw new NotFoundError(`Version "${version}" not found for library "${repositoryId}"`);
  }

  return { repositoryId, versionId: versionRecord.id };
}
```

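`parseLibraryId` itself is defined elsewhere; a minimal sketch is below, assuming IDs take the form `/owner/repo` with an optional trailing tag segment, as in the `/facebook/react/v18.3.0` examples above. The splitting rule here is an assumption for illustration:

```typescript
// Split '/owner/repo[/tag]' into repositoryId plus optional version.
// A trailing segment counts as a version only if it looks like a tag.
function parseLibraryId(libraryId: string): { repositoryId: string; version?: string } {
  const parts = libraryId.replace(/^\//, '').split('/');
  const last = parts[parts.length - 1];
  if (parts.length > 2 && /^v?\d+\.\d+(\.\d+)?(-.*)?$/.test(last)) {
    return { repositoryId: '/' + parts.slice(0, -1).join('/'), version: last };
  }
  return { repositoryId: libraryId };
}
```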
@@ -157,20 +157,20 @@ Snippets with `version_id IS NULL` belong to the default branch; snippets with a

```typescript
export class VersionService {
  constructor(private db: BetterSQLite3.Database) {}

  list(repositoryId: string): RepositoryVersion[];

  add(repositoryId: string, tag: string, title?: string): RepositoryVersion;

  remove(repositoryId: string, tag: string): void;

  getByTag(repositoryId: string, tag: string): RepositoryVersion | null;

  registerFromConfig(
    repositoryId: string,
    previousVersions: { tag: string; title: string }[]
  ): RepositoryVersion[];
}
```

@@ -49,79 +49,79 @@ Implement the main web interface for managing repositories. Built with SvelteKit

```svelte
<!-- src/lib/components/RepositoryCard.svelte -->
<script lang="ts">
  import type { Repository } from '$lib/types';

  let { repo, onReindex, onDelete } = $props<{
    repo: Repository;
    onReindex: (id: string) => void;
    onDelete: (id: string) => void;
  }>();

  const stateColors = {
    pending: 'bg-gray-100 text-gray-600',
    indexing: 'bg-blue-100 text-blue-700',
    indexed: 'bg-green-100 text-green-700',
    error: 'bg-red-100 text-red-700'
  };

  const stateLabels = {
    pending: 'Pending',
    indexing: 'Indexing...',
    indexed: 'Indexed',
    error: 'Error'
  };
</script>

<div class="rounded-xl border border-gray-200 bg-white p-5 shadow-sm">
  <div class="flex items-start justify-between">
    <div>
      <h3 class="font-semibold text-gray-900">{repo.title}</h3>
      <p class="mt-0.5 font-mono text-sm text-gray-500">{repo.id}</p>
    </div>
    <span class="rounded-full px-2.5 py-0.5 text-xs font-medium {stateColors[repo.state]}">
      {stateLabels[repo.state]}
    </span>
  </div>

  {#if repo.description}
    <p class="mt-2 line-clamp-2 text-sm text-gray-600">{repo.description}</p>
  {/if}

  <div class="mt-4 flex gap-4 text-sm text-gray-500">
    <span>{repo.totalSnippets.toLocaleString()} snippets</span>
    <span>·</span>
    <span>Trust: {repo.trustScore.toFixed(1)}/10</span>
    {#if repo.stars}
      <span>·</span>
      <span>★ {repo.stars.toLocaleString()}</span>
    {/if}
  </div>

  {#if repo.state === 'error'}
    <p class="mt-2 text-xs text-red-600">Indexing failed. Check jobs for details.</p>
  {/if}

  <div class="mt-4 flex gap-2">
    <button
      onclick={() => onReindex(repo.id)}
      class="rounded-lg bg-blue-600 px-3 py-1.5 text-sm text-white hover:bg-blue-700"
      disabled={repo.state === 'indexing'}
    >
      {repo.state === 'indexing' ? 'Indexing...' : 'Re-index'}
    </button>
    <a
      href="/repos/{encodeURIComponent(repo.id)}"
      class="rounded-lg border border-gray-200 px-3 py-1.5 text-sm text-gray-700 hover:bg-gray-50"
    >
      Details
    </a>
    <button
      onclick={() => onDelete(repo.id)}
      class="ml-auto rounded-lg px-3 py-1.5 text-sm text-red-600 hover:bg-red-50"
    >
      Delete
    </button>
  </div>
</div>
```

@@ -132,98 +132,104 @@ Implement the main web interface for managing repositories. Built with SvelteKit
```svelte
<!-- src/lib/components/AddRepositoryModal.svelte -->
<script lang="ts">
  let { onClose, onAdded } = $props<{
    onClose: () => void;
    onAdded: () => void;
  }>();

  let source = $state<'github' | 'local'>('github');
  let sourceUrl = $state('');
  let githubToken = $state('');
  let loading = $state(false);
  let error = $state<string | null>(null);

  async function handleSubmit() {
    loading = true;
    error = null;
    try {
      const res = await fetch('/api/v1/libs', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ source, sourceUrl, githubToken: githubToken || undefined })
      });
      if (!res.ok) {
        const data = await res.json();
        throw new Error(data.error ?? 'Failed to add repository');
      }
      onAdded();
      onClose();
    } catch (e) {
      error = (e as Error).message;
    } finally {
      loading = false;
    }
  }
</script>

<dialog class="modal" open>
  <div class="modal-box max-w-md">
    <h2 class="mb-4 text-lg font-semibold">Add Repository</h2>

    <div class="mb-4 flex gap-2">
      <button
        class="flex-1 rounded-lg py-2 text-sm {source === 'github'
          ? 'bg-blue-600 text-white'
          : 'border border-gray-200 text-gray-700'}"
        onclick={() => (source = 'github')}>GitHub</button
      >
      <button
        class="flex-1 rounded-lg py-2 text-sm {source === 'local'
          ? 'bg-blue-600 text-white'
          : 'border border-gray-200 text-gray-700'}"
        onclick={() => (source = 'local')}>Local Path</button
      >
    </div>

    <label class="block">
      <span class="text-sm font-medium text-gray-700">
        {source === 'github' ? 'GitHub URL' : 'Absolute Path'}
      </span>
      <input
        type="text"
        bind:value={sourceUrl}
        placeholder={source === 'github'
          ? 'https://github.com/facebook/react'
          : '/home/user/projects/my-sdk'}
        class="mt-1 w-full rounded-lg border border-gray-300 px-3 py-2 text-sm"
      />
    </label>

    {#if source === 'github'}
      <label class="mt-3 block">
        <span class="text-sm font-medium text-gray-700"
          >GitHub Token (optional, for private repos)</span
        >
        <input
          type="password"
          bind:value={githubToken}
          placeholder="ghp_..."
          class="mt-1 w-full rounded-lg border border-gray-300 px-3 py-2 text-sm"
        />
      </label>
    {/if}

    {#if error}
      <p class="mt-3 text-sm text-red-600">{error}</p>
    {/if}

    <div class="mt-6 flex justify-end gap-3">
      <button onclick={onClose} class="rounded-lg border border-gray-200 px-4 py-2 text-sm">
        Cancel
      </button>
      <button
        onclick={handleSubmit}
        disabled={loading || !sourceUrl}
        class="rounded-lg bg-blue-600 px-4 py-2 text-sm text-white disabled:opacity-50"
      >
        {loading ? 'Adding...' : 'Add & Index'}
      </button>
    </div>
  </div>
</dialog>
```

@@ -234,48 +240,48 @@ Implement the main web interface for managing repositories. Built with SvelteKit
```svelte
<!-- src/lib/components/IndexingProgress.svelte -->
<script lang="ts">
  import { onMount, onDestroy } from 'svelte';
  import type { IndexingJob } from '$lib/types';

  let { jobId } = $props<{ jobId: string }>();
  let job = $state<IndexingJob | null>(null);
  let interval: ReturnType<typeof setInterval>;

  async function pollJob() {
    const res = await fetch(`/api/v1/jobs/${jobId}`);
    if (res.ok) {
      const data = await res.json();
      job = data.job;
      if (job?.status === 'done' || job?.status === 'failed') {
        clearInterval(interval);
      }
    }
  }

  onMount(() => {
    pollJob();
    interval = setInterval(pollJob, 2000);
  });

  onDestroy(() => clearInterval(interval));
</script>

{#if job}
  <div class="mt-2">
    <div class="flex justify-between text-xs text-gray-500">
      <span>{job.processedFiles} / {job.totalFiles} files</span>
      <span>{job.progress}%</span>
    </div>
    <div class="mt-1 h-1.5 w-full rounded-full bg-gray-200">
      <div
        class="h-1.5 rounded-full bg-blue-600 transition-all"
        style="width: {job.progress}%"
      ></div>
    </div>
    {#if job.status === 'failed'}
      <p class="mt-1 text-xs text-red-600">{job.error}</p>
    {/if}
  </div>
{/if}
```

@@ -288,9 +294,9 @@ Implement the main web interface for managing repositories. Built with SvelteKit
import type { PageServerLoad } from './$types';

export const load: PageServerLoad = async ({ fetch }) => {
  const res = await fetch('/api/v1/libs');
  const data = await res.json();
  return { repositories: data.libraries };
};
```

@@ -57,25 +57,31 @@ An interactive search interface within the web UI that lets users test the docum
```svelte
<!-- src/lib/components/search/LibraryResult.svelte -->
<script lang="ts">
  let { result, onSelect } = $props<{
    result: {
      id: string;
      title: string;
      description: string;
      totalSnippets: number;
      trustScore: number;
    };
    onSelect: (id: string) => void;
  }>();
</script>

<button
  onclick={() => onSelect(result.id)}
  class="w-full rounded-xl border border-gray-200 bg-white p-4 text-left shadow-sm transition-all hover:border-blue-300 hover:shadow-md"
>
  <div class="flex items-center justify-between">
    <span class="font-semibold text-gray-900">{result.title}</span>
    <span class="text-xs text-gray-400">Trust {result.trustScore.toFixed(1)}/10</span>
  </div>
  <p class="font-mono text-xs text-gray-400">{result.id}</p>
  {#if result.description}
    <p class="mt-1.5 line-clamp-2 text-sm text-gray-600">{result.description}</p>
  {/if}
  <p class="mt-2 text-xs text-gray-400">{result.totalSnippets.toLocaleString()} snippets</p>
</button>
```

@@ -86,37 +92,39 @@ An interactive search interface within the web UI that lets users test the docum
```svelte
<!-- src/lib/components/search/SnippetCard.svelte -->
<script lang="ts">
  import type { Snippet } from '$lib/types';

  let { snippet } = $props<{ snippet: Snippet }>();
</script>

<div class="overflow-hidden rounded-xl border border-gray-200 bg-white">
  <div class="flex items-center justify-between border-b border-gray-100 px-4 py-2.5">
    <div class="flex items-center gap-2">
      {#if snippet.type === 'code'}
        <span class="rounded bg-purple-100 px-1.5 py-0.5 text-xs text-purple-700">code</span>
      {:else}
        <span class="rounded bg-blue-100 px-1.5 py-0.5 text-xs text-blue-700">info</span>
      {/if}
      {#if snippet.title}
        <span class="text-sm font-medium text-gray-800">{snippet.title}</span>
      {/if}
    </div>
    <span class="text-xs text-gray-400">{snippet.tokenCount} tokens</span>
  </div>

  {#if snippet.breadcrumb}
    <p class="bg-gray-50 px-4 py-1.5 text-xs text-gray-500 italic">{snippet.breadcrumb}</p>
  {/if}

  <div class="p-4">
    {#if snippet.type === 'code'}
      <pre class="overflow-x-auto rounded bg-gray-950 p-4 text-sm text-gray-100"><code
          >{snippet.content}</code
        ></pre>
    {:else}
      <div class="prose prose-sm max-w-none text-gray-700">{snippet.content}</div>
    {/if}
  </div>
</div>
```

@@ -127,44 +135,44 @@ An interactive search interface within the web UI that lets users test the docum
```svelte
<!-- src/routes/search/+page.svelte -->
<script lang="ts">
  import { page } from '$app/stores';
  import { goto } from '$app/navigation';

  let libraryName = $state('');
  let selectedLibraryId = $state<string | null>(null);
  let query = $state('');
  let libraryResults = $state<LibrarySearchResult[]>([]);
  let snippets = $state<Snippet[]>([]);
  let loadingLibraries = $state(false);
  let loadingSnippets = $state(false);

  async function searchLibraries() {
    loadingLibraries = true;
    const res = await fetch(
      `/api/v1/libs/search?libraryName=${encodeURIComponent(libraryName)}&query=${encodeURIComponent(query)}`
    );
    const data = await res.json();
    libraryResults = data.results;
    loadingLibraries = false;
  }

  async function searchDocs() {
    if (!selectedLibraryId) return;
    loadingSnippets = true;
    const url = new URL('/api/v1/context', window.location.origin);
    url.searchParams.set('libraryId', selectedLibraryId);
    url.searchParams.set('query', query);
    const res = await fetch(url);
    const data = await res.json();
    snippets = data.snippets;
    loadingSnippets = false;

    // Update URL
    goto(`/search?lib=${encodeURIComponent(selectedLibraryId)}&q=${encodeURIComponent(query)}`, {
      replaceState: true,
      keepFocus: true
    });
  }
</script>
```

@@ -177,9 +185,9 @@ Use a minimal, zero-dependency approach for v1 — wrap code blocks in `<pre><co
```typescript
// Optional: lazy-load highlight.js only when code snippets are present
async function highlightCode(code: string, language: string): Promise<string> {
  // Dynamic import returns a module namespace; hljs is its default export
  const { default: hljs } = await import('highlight.js/lib/core');
  // Register only needed languages
  return hljs.highlight(code, { language }).value;
}
```

@@ -30,39 +30,39 @@ Optimize re-indexing by skipping files that haven't changed since the last index

```typescript
interface FileDiff {
  added: CrawledFile[]; // new files not in DB
  modified: CrawledFile[]; // files with changed checksum
  deleted: string[]; // file paths in DB but not in crawl
  unchanged: string[]; // file paths with matching checksum
}

function computeDiff(
  crawledFiles: CrawledFile[],
  existingDocs: Document[] // documents currently in DB for this repo
): FileDiff {
  const existingMap = new Map(existingDocs.map((d) => [d.filePath, d]));
  const crawledMap = new Map(crawledFiles.map((f) => [f.path, f]));

  const added: CrawledFile[] = [];
  const modified: CrawledFile[] = [];
  const unchanged: string[] = [];

  for (const file of crawledFiles) {
    const existing = existingMap.get(file.path);
    if (!existing) {
      added.push(file);
    } else if (existing.checksum !== file.sha) {
      modified.push(file);
    } else {
      unchanged.push(file.path);
    }
  }

  const deleted = existingDocs
    .filter((doc) => !crawledMap.has(doc.filePath))
    .map((doc) => doc.filePath);

  return { added, modified, deleted, unchanged };
}
```

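The diff logic can be exercised in isolation. Below is a minimal sketch with toy data: the `CrawledFile` and `Document` shapes are trimmed down to only the fields the function reads, and the function body mirrors the `computeDiff` shown in the spec; the file names are made up for illustration.

```typescript
// Trimmed-down shapes: only the fields the diff logic touches.
interface CrawledFile {
  path: string;
  sha: string;
}
interface Doc {
  filePath: string;
  checksum: string;
}

// Same diff logic as in the spec, inlined so the sketch is self-contained.
function computeDiff(crawledFiles: CrawledFile[], existingDocs: Doc[]) {
  const existingMap = new Map(existingDocs.map((d) => [d.filePath, d]));
  const crawledMap = new Map(crawledFiles.map((f) => [f.path, f]));

  const added: CrawledFile[] = [];
  const modified: CrawledFile[] = [];
  const unchanged: string[] = [];

  for (const file of crawledFiles) {
    const existing = existingMap.get(file.path);
    if (!existing) added.push(file);
    else if (existing.checksum !== file.sha) modified.push(file);
    else unchanged.push(file.path);
  }

  const deleted = existingDocs
    .filter((doc) => !crawledMap.has(doc.filePath))
    .map((doc) => doc.filePath);

  return { added, modified, deleted, unchanged };
}

const diff = computeDiff(
  [
    { path: 'docs/intro.md', sha: 'aaa' }, // checksum matches -> unchanged
    { path: 'docs/api.md', sha: 'bbb2' }, // checksum differs -> modified
    { path: 'docs/new.md', sha: 'ccc' } // not in DB -> added
  ],
  [
    { filePath: 'docs/intro.md', checksum: 'aaa' },
    { filePath: 'docs/api.md', checksum: 'bbb1' },
    { filePath: 'docs/gone.md', checksum: 'ddd' } // not crawled -> deleted
  ]
);

console.log(diff.added.map((f) => f.path)); // ['docs/new.md']
console.log(diff.modified.map((f) => f.path)); // ['docs/api.md']
console.log(diff.deleted); // ['docs/gone.md']
console.log(diff.unchanged); // ['docs/intro.md']
```

Each crawled file lands in exactly one of the first three buckets, and `deleted` is computed separately from the DB side, so the four arrays partition the union of both file sets.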
@@ -78,7 +78,7 @@ const diff = computeDiff(crawledResult.files, existingDocs);

// Log diff summary
this.updateJob(job.id, {
  totalFiles: crawledResult.files.length
});

// Process only changed/new files
@@ -89,29 +89,29 @@ const docIdsToDelete: string[] = [];

// Map modified files to their existing document IDs for deletion
for (const file of diff.modified) {
  const existing = existingDocs.find((d) => d.filePath === file.path);
  if (existing) docIdsToDelete.push(existing.id);
}

// Map deleted file paths to document IDs
for (const filePath of diff.deleted) {
  const existing = existingDocs.find((d) => d.filePath === filePath);
  if (existing) docIdsToDelete.push(existing.id);
}

// Parse new/modified files
for (const [i, file] of filesToProcess.entries()) {
  const docId = crypto.randomUUID();
  newDocuments.push({ id: docId, ...buildDocument(file, repo.id, job.versionId) });
  newSnippets.push(...parseFile(file, { repositoryId: repo.id, documentId: docId }));

  // Count ALL files (including skipped) in progress
  const totalProcessed = diff.unchanged.length + i + 1;
  const progress = Math.round((totalProcessed / crawledResult.files.length) * 80);
  this.updateJob(job.id, {
    processedFiles: totalProcessed,
    progress
  });
}

// Atomic replacement of only changed documents
@@ -123,6 +123,7 @@ this.replaceSnippets(repo.id, docIdsToDelete, newDocuments, newSnippets);
## Performance Impact

For a typical repository with 1,000 files where 50 changed:

- **Without incremental**: 1,000 files parsed + 1,000 embed batches
- **With incremental**: 50 files parsed + 50 embed batches
- Estimated speedup: ~20x for re-indexing

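The ~20x estimate is straight arithmetic, under the assumption that parse and embedding cost scale linearly with the number of files processed:

```typescript
// Cost model: work is proportional to files processed, so the re-index
// speedup is simply (total files) / (changed files).
const totalFiles = 1000;
const changedFiles = 50;
const speedup = totalFiles / changedFiles;
console.log(`~${speedup}x faster re-index`); // prints "~20x faster re-index"
```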
@@ -32,24 +32,24 @@ A settings page within the web UI that allows users to configure the embedding p

```typescript
const PROVIDER_PRESETS = [
  {
    name: 'OpenAI',
    baseUrl: 'https://api.openai.com/v1',
    model: 'text-embedding-3-small',
    dimensions: 1536
  },
  {
    name: 'Ollama (local)',
    baseUrl: 'http://localhost:11434/v1',
    model: 'nomic-embed-text',
    dimensions: 768
  },
  {
    name: 'Azure OpenAI',
    baseUrl: 'https://{resource}.openai.azure.com/openai/deployments/{deployment}/v1',
    model: 'text-embedding-3-small',
    dimensions: 1536
  }
];
```

@@ -60,133 +60,157 @@ const PROVIDER_PRESETS = [
```svelte
<!-- src/routes/settings/+page.svelte -->
<script lang="ts">
  let provider = $state<'none' | 'openai' | 'local'>('none');
  let baseUrl = $state('https://api.openai.com/v1');
  let apiKey = $state('');
  let model = $state('text-embedding-3-small');
  let dimensions = $state<number | undefined>(1536);
  let testStatus = $state<'idle' | 'testing' | 'ok' | 'error'>('idle');
  let testError = $state<string | null>(null);
  let saving = $state(false);

  async function testConnection() {
    testStatus = 'testing';
    testError = null;
    try {
      const res = await fetch('/api/v1/settings/embedding/test', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ provider, openai: { baseUrl, apiKey, model, dimensions } })
      });
      if (res.ok) {
        testStatus = 'ok';
      } else {
        const data = await res.json();
        testStatus = 'error';
        testError = data.error;
      }
    } catch (e) {
      testStatus = 'error';
      testError = (e as Error).message;
    }
  }

  async function save() {
    saving = true;
    await fetch('/api/v1/settings/embedding', {
      method: 'PUT',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ provider, openai: { baseUrl, apiKey, model, dimensions } })
    });
    saving = false;
  }
</script>

<div class="mx-auto max-w-2xl py-8">
  <h1 class="mb-6 text-2xl font-bold text-gray-900">Settings</h1>

  <section class="rounded-xl border border-gray-200 bg-white p-6">
    <h2 class="mb-1 text-lg font-semibold">Embedding Provider</h2>
    <p class="mb-4 text-sm text-gray-500">
      Embeddings enable semantic search. Without them, only keyword search (FTS5) is used.
    </p>

    <div class="mb-4 flex gap-2">
      {#each ['none', 'openai', 'local'] as p}
        <button
          onclick={() => (provider = p)}
          class="rounded-lg px-4 py-2 text-sm {provider === p
            ? 'bg-blue-600 text-white'
            : 'border border-gray-200 text-gray-700 hover:bg-gray-50'}"
        >
          {p === 'none' ? 'None (FTS5 only)' : p === 'openai' ? 'OpenAI-compatible' : 'Local Model'}
        </button>
      {/each}
    </div>

    {#if provider === 'none'}
      <div class="rounded-lg border border-amber-200 bg-amber-50 p-3 text-sm text-amber-700">
        Search will use keyword matching only. Results may be less relevant for complex questions.
      </div>
    {/if}

    {#if provider === 'openai'}
      <div class="space-y-3">
        <!-- Preset buttons -->
        <div class="flex flex-wrap gap-2">
          {#each PROVIDER_PRESETS as preset}
            <button
              onclick={() => {
                baseUrl = preset.baseUrl;
                model = preset.model;
                dimensions = preset.dimensions;
              }}
              class="rounded border border-gray-200 px-2.5 py-1 text-xs text-gray-600 hover:bg-gray-50"
            >
              {preset.name}
            </button>
          {/each}
        </div>

        <label class="block">
          <span class="text-sm font-medium">Base URL</span>
          <input
            type="text"
            bind:value={baseUrl}
            class="mt-1 w-full rounded-lg border px-3 py-2 text-sm"
          />
        </label>

        <label class="block">
          <span class="text-sm font-medium">API Key</span>
          <input
            type="password"
            bind:value={apiKey}
            class="mt-1 w-full rounded-lg border px-3 py-2 text-sm"
            placeholder="sk-..."
          />
        </label>

        <label class="block">
          <span class="text-sm font-medium">Model</span>
          <input
            type="text"
            bind:value={model}
            class="mt-1 w-full rounded-lg border px-3 py-2 text-sm"
          />
        </label>

        <label class="block">
          <span class="text-sm font-medium">Dimensions (optional override)</span>
          <input
            type="number"
            bind:value={dimensions}
            class="mt-1 w-full rounded-lg border px-3 py-2 text-sm"
          />
        </label>

        <div class="flex items-center gap-3">
          <button
            onclick={testConnection}
            class="rounded-lg border border-gray-300 px-3 py-1.5 text-sm"
          >
            {testStatus === 'testing' ? 'Testing...' : 'Test Connection'}
          </button>
          {#if testStatus === 'ok'}
            <span class="text-sm text-green-600">✓ Connection successful</span>
|
{testStatus === 'testing' ? 'Testing...' : 'Test Connection'}
|
||||||
{:else if testStatus === 'error'}
|
</button>
|
||||||
<span class="text-sm text-red-600">✗ {testError}</span>
|
{#if testStatus === 'ok'}
|
||||||
{/if}
|
<span class="text-sm text-green-600">✓ Connection successful</span>
|
||||||
</div>
|
{:else if testStatus === 'error'}
|
||||||
</div>
|
<span class="text-sm text-red-600">✗ {testError}</span>
|
||||||
{/if}
|
{/if}
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
{/if}
|
||||||
|
|
||||||
<div class="mt-6 flex justify-end">
|
<div class="mt-6 flex justify-end">
|
||||||
<button
|
<button
|
||||||
onclick={save}
|
onclick={save}
|
||||||
disabled={saving}
|
disabled={saving}
|
||||||
class="rounded-lg bg-blue-600 px-4 py-2 text-sm text-white disabled:opacity-50"
|
class="rounded-lg bg-blue-600 px-4 py-2 text-sm text-white disabled:opacity-50"
|
||||||
>
|
>
|
||||||
{saving ? 'Saving...' : 'Save Settings'}
|
{saving ? 'Saving...' : 'Save Settings'}
|
||||||
</button>
|
</button>
|
||||||
</div>
|
</div>
|
||||||
</section>
|
</section>
|
||||||
</div>
|
</div>
|
||||||
```
|
```
|
||||||
|
|
||||||
|
|||||||
@@ -80,6 +80,7 @@ git -C /path/to/repo archive <commit-hash> | tar -x -C /tmp/trueref-idx/<repo>-<
 ```

 Advantages over `git checkout` or worktrees:

 - Working directory is completely untouched
 - No `.git` directory in the output (cleaner for parsing)
 - Temp directory deleted after indexing with no git state to clean up
@@ -102,26 +103,26 @@ Allow commit hashes to be pinned explicitly per version, overriding tag resoluti

 ```json
 {
 "previousVersions": [
 {
 "tag": "v2.0.0",
 "title": "Version 2.0.0",
 "commitHash": "a3f9c12abc..."
 }
 ]
 }
 ```

 ### Edge Cases

 | Case | Handling |
-|------|----------|
+| ---------------------------- | ---------------------------------------------------------------------------------- |
 | Annotated tags | `rev-parse <tag>^{commit}` peels to commit automatically |
 | Mutable tags (e.g. `latest`) | Re-resolve on re-index; warn in UI if hash has changed |
 | Branch as version | `rev-parse origin/<branch>^{commit}` gives tip; re-resolves on re-index |
 | Shallow clone | Run `git fetch --unshallow` before `git archive` if commit is unavailable |
 | Submodules | `git archive --recurse-submodules` or document as a known limitation |
 | Git LFS | `git lfs pull` required after archive if LFS-tracked files are needed for indexing |

 ### Acceptance Criteria
@@ -169,12 +170,12 @@ fi

 Username conventions by server type:

 | Server | HTTPS username | Password |
-|--------|---------------|----------|
+| ------------------------------ | --------------------- | --------------------- |
 | Bitbucket Server / Data Center | `x-token-auth` | HTTP access token |
 | Bitbucket Cloud | account username | App password |
 | GitLab (self-hosted or cloud) | `oauth2` | Personal access token |
 | GitLab deploy token | `gitlab-deploy-token` | Deploy token secret |

 SSH authentication is also supported and preferred for long-lived deployments. The host SSH configuration (`~/.ssh/config`) handles per-host key selection and travels into the container via volume mount.

@@ -220,7 +221,7 @@ services:
 web:
 build: .
 ports:
-- "3000:3000"
+- '3000:3000'
 volumes:
 - trueref-data:/data
 - ${USERPROFILE}/.ssh:/root/.ssh:ro
@@ -228,20 +229,20 @@ services:
 - ${CORP_CA_CERT}:/certs/corp-ca.crt:ro
 environment:
 DATABASE_URL: /data/trueref.db
-GIT_TOKEN_BITBUCKET: "${BITBUCKET_TOKEN}"
-GIT_TOKEN_GITLAB: "${GITLAB_TOKEN}"
-BITBUCKET_HOST: "${BITBUCKET_HOST}"
-GITLAB_HOST: "${GITLAB_HOST}"
+GIT_TOKEN_BITBUCKET: '${BITBUCKET_TOKEN}'
+GIT_TOKEN_GITLAB: '${GITLAB_TOKEN}'
+BITBUCKET_HOST: '${BITBUCKET_HOST}'
+GITLAB_HOST: '${GITLAB_HOST}'
 restart: unless-stopped

 mcp:
 build: .
 command: mcp
 ports:
-- "3001:3001"
+- '3001:3001'
 environment:
 TRUEREF_API_URL: http://web:3000
-MCP_PORT: "3001"
+MCP_PORT: '3001'
 depends_on:
 - web
 restart: unless-stopped
@@ -107,16 +107,16 @@ An embedding profile is persisted configuration selecting one provider adapter p

 ```typescript
 interface EmbeddingProfile {
 id: string;
 providerKind: string;
 title: string;
 enabled: boolean;
 isDefault: boolean;
 config: Record<string, unknown>;
 model: string;
 dimensions: number;
 createdAt: number;
 updatedAt: number;
 }
 ```

@@ -9,7 +9,11 @@
 import { initializeDatabase } from '$lib/server/db/index.js';
 import { getClient } from '$lib/server/db/client.js';
 import { initializePipeline } from '$lib/server/pipeline/startup.js';
-import { EMBEDDING_CONFIG_KEY, createProviderFromConfig, defaultEmbeddingConfig } from '$lib/server/embeddings/factory.js';
+import {
+EMBEDDING_CONFIG_KEY,
+createProviderFromConfig,
+defaultEmbeddingConfig
+} from '$lib/server/embeddings/factory.js';
 import { EmbeddingService } from '$lib/server/embeddings/embedding.service.js';
 import type { EmbeddingConfig } from '$lib/server/embeddings/factory.js';
 import type { Handle } from '@sveltejs/kit';
@@ -115,7 +115,12 @@
 />
 {:else}
 <div class="mt-1">
-<FolderPicker bind:value={sourceUrl} onselect={(p) => { if (!title) title = p.split('/').at(-1) ?? ''; }} />
+<FolderPicker
+bind:value={sourceUrl}
+onselect={(p) => {
+if (!title) title = p.split('/').at(-1) ?? '';
+}}
+/>
 </div>
 {/if}
 </div>
@@ -133,7 +138,8 @@
 {#if source === 'github'}
 <label class="block">
 <span class="text-sm font-medium text-gray-700"
->GitHub Token <span class="font-normal text-gray-500">(optional, for private repos)</span
+>GitHub Token <span class="font-normal text-gray-500"
+>(optional, for private repos)</span
 ></span
 >
 <input
@@ -78,9 +78,7 @@
 title="Browse folders"
 >
 <svg class="h-4 w-4" viewBox="0 0 20 20" fill="currentColor">
-<path
-d="M2 6a2 2 0 012-2h5l2 2h5a2 2 0 012 2v6a2 2 0 01-2 2H4a2 2 0 01-2-2V6z"
-/>
+<path d="M2 6a2 2 0 012-2h5l2 2h5a2 2 0 012 2v6a2 2 0 01-2 2H4a2 2 0 01-2-2V6z" />
 </svg>
 Browse
 </button>
@@ -94,7 +92,10 @@
 class="fixed inset-0 z-[60] flex items-center justify-center bg-black/50 p-4"
 onclick={handleBackdropClick}
 >
-<div class="flex w-full max-w-lg flex-col rounded-xl bg-white shadow-xl" style="max-height: 70vh">
+<div
+class="flex w-full max-w-lg flex-col rounded-xl bg-white shadow-xl"
+style="max-height: 70vh"
+>
 <!-- Header -->
 <div class="flex items-center justify-between border-b border-gray-200 px-4 py-3">
 <h3 class="text-sm font-semibold text-gray-900">Select Folder</h3>
@@ -133,11 +134,7 @@
 {browsePath}
 </span>
 {#if loading}
-<svg
-class="h-4 w-4 animate-spin text-gray-400"
-fill="none"
-viewBox="0 0 24 24"
->
+<svg class="h-4 w-4 animate-spin text-gray-400" fill="none" viewBox="0 0 24 24">
 <circle class="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" stroke-width="4"
 ></circle>
 <path
@@ -166,7 +163,9 @@
 title="Click to navigate, double-click to select"
 >
 <svg
-class="h-4 w-4 shrink-0 {entry.isGitRepo ? 'text-orange-400' : 'text-yellow-400'}"
+class="h-4 w-4 shrink-0 {entry.isGitRepo
+? 'text-orange-400'
+: 'text-yellow-400'}"
 viewBox="0 0 20 20"
 fill="currentColor"
 >
@@ -176,7 +175,9 @@
 </svg>
 <span class="flex-1 truncate text-gray-800">{entry.name}</span>
 {#if entry.isGitRepo}
-<span class="shrink-0 rounded bg-orange-100 px-1.5 py-0.5 text-xs text-orange-700">
+<span
+class="shrink-0 rounded bg-orange-100 px-1.5 py-0.5 text-xs text-orange-700"
+>
 git
 </span>
 {/if}
@@ -36,8 +36,9 @@
 <p class="mt-0.5 truncate font-mono text-sm text-gray-500">{repo.id}</p>
 </div>
 <span
-class="ml-3 shrink-0 rounded-full px-2.5 py-0.5 text-xs font-medium {stateColors[repo.state] ??
-'bg-gray-100 text-gray-600'}"
+class="ml-3 shrink-0 rounded-full px-2.5 py-0.5 text-xs font-medium {stateColors[
+repo.state
+] ?? 'bg-gray-100 text-gray-600'}"
 >
 {stateLabels[repo.state] ?? repo.state}
 </span>
@@ -17,7 +17,10 @@
 };
 </script>

-<div class="flex flex-col items-center rounded-lg p-3 {variantClasses[variant] ?? variantClasses.default}">
+<div
+class="flex flex-col items-center rounded-lg p-3 {variantClasses[variant] ??
+variantClasses.default}"
+>
 <span class="text-lg font-bold">{value}</span>
 <span class="mt-0.5 text-xs">{label}</span>
 </div>
@@ -17,6 +17,8 @@
 const config = $derived(statusConfig[status]);
 </script>

-<span class="inline-flex items-center rounded-full px-2.5 py-0.5 text-xs font-medium {config.bg} {config.text}">
+<span
+class="inline-flex items-center rounded-full px-2.5 py-0.5 text-xs font-medium {config.bg} {config.text}"
+>
 {config.label}
 </span>
@@ -25,7 +25,7 @@
 {placeholder}
 onkeydown={handleKeydown}
 disabled={loading}
-class="flex-1 rounded-lg border border-gray-200 bg-white px-4 py-2.5 text-sm text-gray-900 placeholder-gray-400 shadow-sm outline-none transition-all focus:border-blue-400 focus:ring-2 focus:ring-blue-100 disabled:cursor-not-allowed disabled:opacity-60"
+class="flex-1 rounded-lg border border-gray-200 bg-white px-4 py-2.5 text-sm text-gray-900 placeholder-gray-400 shadow-sm transition-all outline-none focus:border-blue-400 focus:ring-2 focus:ring-blue-100 disabled:cursor-not-allowed disabled:opacity-60"
 />
 <button
 onclick={onsubmit}
@@ -5,21 +5,15 @@

 const isCode = $derived(snippet.type === 'code');

-const title = $derived(
-snippet.type === 'code' ? snippet.title : null
-);
+const title = $derived(snippet.type === 'code' ? snippet.title : null);

-const breadcrumb = $derived(
-snippet.type === 'code' ? snippet.description : snippet.breadcrumb
-);
+const breadcrumb = $derived(snippet.type === 'code' ? snippet.description : snippet.breadcrumb);

 const content = $derived(
-snippet.type === 'code' ? snippet.codeList[0]?.code ?? '' : snippet.text
+snippet.type === 'code' ? (snippet.codeList[0]?.code ?? '') : snippet.text
 );

-const language = $derived(
-snippet.type === 'code' ? (snippet.codeList[0]?.language ?? '') : null
-);
+const language = $derived(snippet.type === 'code' ? (snippet.codeList[0]?.language ?? '') : null);

 const tokenCount = $derived(snippet.tokenCount ?? 0);
 </script>
@@ -40,14 +34,14 @@
 </div>

 {#if breadcrumb}
-<p class="bg-gray-50 px-4 py-1.5 text-xs italic text-gray-500">{breadcrumb}</p>
+<p class="bg-gray-50 px-4 py-1.5 text-xs text-gray-500 italic">{breadcrumb}</p>
 {/if}

 <div class="p-4">
 {#if isCode}
-<pre
-class="overflow-x-auto rounded bg-gray-950 p-4 text-sm text-gray-100"
-><code>{content}</code></pre>
+<pre class="overflow-x-auto rounded bg-gray-950 p-4 text-sm text-gray-100"><code
+>{content}</code
+></pre>
 {:else}
 <div class="prose prose-sm max-w-none whitespace-pre-wrap text-gray-700">{content}</div>
 {/if}
@@ -83,9 +83,7 @@ function makeSnippetResult(snippet: Snippet): SnippetSearchResult {
 });
 }

-function makeMetadata(
-overrides: Partial<ContextResponseMetadata> = {}
-): ContextResponseMetadata {
+function makeMetadata(overrides: Partial<ContextResponseMetadata> = {}): ContextResponseMetadata {
 return {
 localSource: false,
 resultCount: 1,
@@ -160,7 +158,11 @@ describe('formatLibrarySearchJson', () => {

 it('maps non-indexed state to initial', () => {
 const results: LibrarySearchResult[] = [
-new LibrarySearchResult({ repository: makeRepo({ state: 'pending' }), versions: [], score: 0 })
+new LibrarySearchResult({
+repository: makeRepo({ state: 'pending' }),
+versions: [],
+score: 0
+})
 ];
 const response = formatLibrarySearchJson(results);
 expect(response.results[0].state).toBe('initial');
@@ -168,7 +170,11 @@ describe('formatLibrarySearchJson', () => {

 it('handles null lastIndexedAt', () => {
 const results: LibrarySearchResult[] = [
-new LibrarySearchResult({ repository: makeRepo({ lastIndexedAt: null }), versions: [], score: 0 })
+new LibrarySearchResult({
+repository: makeRepo({ lastIndexedAt: null }),
+versions: [],
+score: 0
+})
 ];
 const response = formatLibrarySearchJson(results);
 expect(response.results[0].lastUpdateDate).toBeNull();
@@ -66,7 +66,9 @@ export const CORS_HEADERS = {
 /**
 * Convert internal LibrarySearchResult[] to the context7-compatible JSON body.
 */
-export function formatLibrarySearchJson(results: LibrarySearchResult[]): LibrarySearchJsonResponseDto {
+export function formatLibrarySearchJson(
+results: LibrarySearchResult[]
+): LibrarySearchJsonResponseDto {
 return ContextResponseMapper.toLibrarySearchJson(results);
 }

@@ -80,7 +82,7 @@ export function formatContextJson(
 snippets: SnippetSearchResult[],
 rules: string[],
 metadata?: ContextResponseMetadata
 ): ContextJsonResponseDto {
 return ContextResponseMapper.toContextJson(snippets, rules, metadata);
 }

@@ -94,7 +96,10 @@ export function formatContextJson(
 * @param snippets - Ranked snippet search results (already token-budget trimmed).
 * @param rules - Rules from `trueref.json` / `repository_configs`.
 */
-function formatOriginLine(result: SnippetSearchResult, metadata?: ContextResponseMetadata): string | null {
+function formatOriginLine(
+result: SnippetSearchResult,
+metadata?: ContextResponseMetadata
+): string | null {
 if (!metadata?.repository) return null;

 const parts = [
@@ -115,10 +115,7 @@ describe('parseConfigFile — description', () => {

 describe('parseConfigFile — array path fields', () => {
 it('accepts valid folders', () => {
-const result = parseConfigFile(
-JSON.stringify({ folders: ['src/', 'docs/'] }),
-'trueref.json'
-);
+const result = parseConfigFile(JSON.stringify({ folders: ['src/', 'docs/'] }), 'trueref.json');
 expect(result.config.folders).toEqual(['src/', 'docs/']);
 expect(result.warnings).toHaveLength(0);
 });
@@ -130,10 +127,7 @@ describe('parseConfigFile — array path fields', () => {
 });

 it('skips non-string entries in folders with a warning', () => {
-const result = parseConfigFile(
-JSON.stringify({ folders: ['src/', 42, true] }),
-'trueref.json'
-);
+const result = parseConfigFile(JSON.stringify({ folders: ['src/', 42, true] }), 'trueref.json');
 expect(result.config.folders).toEqual(['src/']);
 expect(result.warnings.length).toBeGreaterThan(0);
 });
@@ -174,7 +168,9 @@ describe('parseConfigFile — array path fields', () => {
 describe('parseConfigFile — rules', () => {
 it('accepts valid rules', () => {
 const result = parseConfigFile(
-JSON.stringify({ rules: ['Always use named imports.', 'Prefer async/await over callbacks.'] }),
+JSON.stringify({
+rules: ['Always use named imports.', 'Prefer async/await over callbacks.']
+}),
 'trueref.json'
 );
 expect(result.config.rules).toHaveLength(2);
@@ -204,10 +200,7 @@ describe('parseConfigFile — rules', () => {
 });

 it('ignores non-array rules with a warning', () => {
-const result = parseConfigFile(
-JSON.stringify({ rules: 'use named imports' }),
-'trueref.json'
-);
+const result = parseConfigFile(JSON.stringify({ rules: 'use named imports' }), 'trueref.json');
 expect(result.config.rules).toBeUndefined();
 expect(result.warnings.some((w) => /rules must be an array/.test(w))).toBe(true);
 });
@@ -243,10 +236,7 @@ describe('parseConfigFile — previousVersions', () => {
 it('skips entries missing tag', () => {
 const result = parseConfigFile(
 JSON.stringify({
-previousVersions: [
-{ title: 'No tag here' },
-{ tag: 'v1.0.0', title: 'Valid' }
-]
+previousVersions: [{ title: 'No tag here' }, { tag: 'v1.0.0', title: 'Valid' }]
 }),
 'trueref.json'
 );
@@ -275,10 +265,7 @@ describe('parseConfigFile — previousVersions', () => {
 });

 it('ignores non-array previousVersions with a warning', () => {
-const result = parseConfigFile(
-JSON.stringify({ previousVersions: 'v1.0.0' }),
-'trueref.json'
-);
+const result = parseConfigFile(JSON.stringify({ previousVersions: 'v1.0.0' }), 'trueref.json');
 expect(result.config.previousVersions).toBeUndefined();
 expect(result.warnings.some((w) => /previousVersions must be an array/.test(w))).toBe(true);
 });
@@ -294,9 +281,7 @@ describe('resolveConfig', () => {
 });

 it('returns null when no matching filenames', () => {
-expect(
-resolveConfig([{ filename: 'package.json', content: '{"name":"x"}' }])
-).toBeNull();
+expect(resolveConfig([{ filename: 'package.json', content: '{"name":"x"}' }])).toBeNull();
 });

 it('prefers trueref.json over context7.json', () => {
@@ -65,7 +65,9 @@ export function parseConfigFile(content: string, filename: string): ParsedConfig

 // ---- 2. Root must be an object ------------------------------------------
 if (typeof raw !== 'object' || raw === null || Array.isArray(raw)) {
-throw new ConfigParseError(`$(unknown) must be a JSON object, got ${Array.isArray(raw) ? 'array' : typeof raw}`);
+throw new ConfigParseError(
+`$(unknown) must be a JSON object, got ${Array.isArray(raw) ? 'array' : typeof raw}`
+);
 }

 const input = raw as Record<string, unknown>;
@@ -131,7 +133,9 @@ export function parseConfigFile(content: string, filename: string): ParsedConfig
 })
 .map((item) => {
 if (item.length > maxLength) {
-warnings.push(`${field} entry truncated to ${maxLength} characters: "${item.slice(0, 40)}..."`);
+warnings.push(
+`${field} entry truncated to ${maxLength} characters: "${item.slice(0, 40)}..."`
+);
 return item.slice(0, maxLength);
 }
 return item;
@@ -160,9 +164,7 @@ export function parseConfigFile(content: string, filename: string): ParsedConfig
 return false;
 }
 if (r.length < minLength) {
-warnings.push(
-`rules entry too short (< ${minLength} chars) — skipping: "${r}"`
-);
+warnings.push(`rules entry too short (< ${minLength} chars) — skipping: "${r}"`);
 return false;
 }
 return true;
@@ -1,85 +1,85 @@
 {
   "$schema": "http://json-schema.org/draft-07/schema#",
   "$id": "https://trueref.dev/schema/trueref-config.json",
   "title": "TrueRef Repository Configuration",
   "description": "Configuration file for controlling how a repository is indexed and presented by TrueRef. Place as trueref.json (or context7.json for backward compatibility) at the root of your repository.",
   "type": "object",
   "additionalProperties": false,
   "properties": {
     "projectTitle": {
       "type": "string",
       "minLength": 1,
       "maxLength": 100,
       "description": "Override the display name for this library. When set, this replaces the repository name in search results and UI."
     },
     "description": {
       "type": "string",
       "minLength": 10,
       "maxLength": 500,
       "description": "A short description of the library used for search ranking and display. Should accurately describe the library's purpose."
     },
     "folders": {
       "type": "array",
       "maxItems": 50,
       "description": "Allowlist of folder path prefixes or regex strings to include in indexing. If empty or absent, all folders are included. Examples: [\"src/\", \"docs/\", \"^packages/core\"]",
       "items": {
         "type": "string",
         "maxLength": 200,
         "description": "A path prefix or regex string. Paths are matched against the full relative file path within the repository."
       }
     },
     "excludeFolders": {
       "type": "array",
       "maxItems": 50,
       "description": "Folders to exclude from indexing. Applied after the 'folders' allowlist. Examples: [\"test/\", \"fixtures/\", \"__mocks__\"]",
       "items": {
         "type": "string",
         "maxLength": 200,
         "description": "A path prefix or regex string for folders to exclude."
       }
     },
     "excludeFiles": {
       "type": "array",
       "maxItems": 100,
       "description": "Exact filenames to exclude (no path, no regex). Examples: [\"README.md\", \"CHANGELOG.md\", \"jest.config.ts\"]",
       "items": {
         "type": "string",
         "maxLength": 200,
         "description": "An exact filename (not a path). Must not contain path separators."
       }
     },
     "rules": {
       "type": "array",
       "maxItems": 20,
       "description": "Best practices and rules to inject at the top of every query-docs response. These are shown to AI coding assistants to guide correct library usage.",
       "items": {
         "type": "string",
         "minLength": 5,
         "maxLength": 500,
         "description": "A single best-practice rule or guideline for using this library."
       }
     },
     "previousVersions": {
       "type": "array",
       "maxItems": 50,
       "description": "Previously released versions to make available for versioned documentation queries.",
       "items": {
         "type": "object",
         "required": ["tag", "title"],
         "additionalProperties": false,
         "properties": {
           "tag": {
             "type": "string",
             "pattern": "^v?\\d+\\.\\d+(\\.\\d+)?(-.*)?$",
             "description": "Git tag name for this version (e.g. \"v1.2.3\", \"2.0.0-beta.1\")."
           },
           "title": {
             "type": "string",
             "minLength": 1,
             "description": "Human-readable version label (e.g. \"Version 1.2.3\", \"v2 Legacy\")."
           }
         }
       }
     }
   }
 }

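For illustration, a minimal config that would validate against the schema above might look like this (every value below is invented for the example, not taken from a real repository):

```json
{
  "projectTitle": "Example Library",
  "description": "A short example description used for search ranking and display.",
  "folders": ["src/", "docs/"],
  "excludeFolders": ["test/"],
  "excludeFiles": ["CHANGELOG.md"],
  "rules": ["Prefer the documented public API over internal modules."],
  "previousVersions": [{ "tag": "v1.2.3", "title": "Version 1.2.3" }]
}
```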
@@ -74,27 +74,46 @@ export const MAX_FILE_SIZE_BYTES = 500_000;
  */
 export const IGNORED_DIR_NAMES = new Set([
   // ── Version control ────────────────────────────────────────────────────
-  '.git', '.hg', '.svn',
+  '.git',
+  '.hg',
+  '.svn',

   // ── JavaScript / TypeScript ─────────────────────────────────────────────
   'node_modules',
-  '.npm', '.yarn', '.pnpm-store', '.pnp',
+  '.npm',
+  '.yarn',
+  '.pnpm-store',
+  '.pnp',
   // Build outputs and framework caches
-  'dist', 'build', 'out',
-  '.next', '.nuxt', '.svelte-kit', '.vite',
-  '.turbo', '.parcel-cache', '.webpack',
+  'dist',
+  'build',
+  'out',
+  '.next',
+  '.nuxt',
+  '.svelte-kit',
+  '.vite',
+  '.turbo',
+  '.parcel-cache',
+  '.webpack',

   // ── Python ──────────────────────────────────────────────────────────────
   '__pycache__',
-  '.venv', 'venv', 'env',
-  'site-packages', '.eggs',
-  '.pytest_cache', '.mypy_cache', '.ruff_cache',
-  '.tox', '.nox',
+  '.venv',
+  'venv',
+  'env',
+  'site-packages',
+  '.eggs',
+  '.pytest_cache',
+  '.mypy_cache',
+  '.ruff_cache',
+  '.tox',
+  '.nox',
   'htmlcov',

   // ── Java / Kotlin / Scala ───────────────────────────────────────────────
   'target', // Maven + sbt
-  '.gradle', '.mvn',
+  '.gradle',
+  '.mvn',

   // ── Ruby ────────────────────────────────────────────────────────────────
   '.bundle',
@@ -103,19 +122,24 @@ export const IGNORED_DIR_NAMES = new Set([
   // 'vendor' below covers PHP Composer

   // ── .NET ────────────────────────────────────────────────────────────────
-  'bin', 'obj', 'packages',
+  'bin',
+  'obj',
+  'packages',

   // ── Haskell ─────────────────────────────────────────────────────────────
-  '.stack-work', 'dist-newstyle',
+  '.stack-work',
+  'dist-newstyle',

   // ── Dart / Flutter ──────────────────────────────────────────────────────
   '.dart_tool',

   // ── Swift / iOS ─────────────────────────────────────────────────────────
-  'Pods', 'DerivedData',
+  'Pods',
+  'DerivedData',

   // ── Elixir / Erlang ─────────────────────────────────────────────────────
-  '_build', 'deps',
+  '_build',
+  'deps',

   // ── Clojure ─────────────────────────────────────────────────────────────
   '.cpcache',
@@ -125,16 +149,25 @@ export const IGNORED_DIR_NAMES = new Set([
   'vendor',

   // ── Generic caches / temp ───────────────────────────────────────────────
-  '.cache', '.tmp', 'tmp', 'temp', '.temp', '.sass-cache',
+  '.cache',
+  '.tmp',
+  'tmp',
+  'temp',
+  '.temp',
+  '.sass-cache',

   // ── Test coverage ───────────────────────────────────────────────────────
-  'coverage', '.nyc_output',
+  'coverage',
+  '.nyc_output',

   // ── IDE / editor artefacts ──────────────────────────────────────────────
-  '.idea', '.vs',
+  '.idea',
+  '.vs',

   // ── Generated code ──────────────────────────────────────────────────────
-  'generated', '__generated__', '_generated',
+  'generated',
+  '__generated__',
+  '_generated',

   // ── Logs ────────────────────────────────────────────────────────────────
   'logs'
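A sketch of how a directory-name denylist like `IGNORED_DIR_NAMES` is typically consulted while walking a tree. The set contents here are a small subset, and `isIgnoredDir` is illustrative, not TrueRef's actual crawler logic:

```typescript
// Small subset of the ignored names shown in the diff above.
const IGNORED = new Set(['.git', 'node_modules', 'dist', '__pycache__']);

// Check every path segment, so `packages/app/node_modules/x` is skipped
// even though the ignored name is not the first segment.
function isIgnoredDir(relPath: string): boolean {
  return relPath.split('/').some((segment) => IGNORED.has(segment));
}
```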
@@ -264,11 +297,7 @@ export function detectLanguage(filePath: string): string {
  * 7. Must not be under a config.excludeFolders path / regex.
  * 8. Must be under a config.folders allowlist path / regex (if specified).
  */
-export function shouldIndexFile(
-  filePath: string,
-  fileSize: number,
-  config?: RepoConfig
-): boolean {
+export function shouldIndexFile(filePath: string, fileSize: number, config?: RepoConfig): boolean {
   const ext = extname(filePath).toLowerCase();
   const base = basename(filePath);

@@ -35,10 +35,9 @@ export async function listGitHubTags(
   };
   if (token) headers['Authorization'] = `Bearer ${token}`;

-  const response = await fetch(
-    `https://api.github.com/repos/${owner}/${repo}/tags?per_page=100`,
-    { headers }
-  );
+  const response = await fetch(`https://api.github.com/repos/${owner}/${repo}/tags?per_page=100`, {
+    headers
+  });

   if (!response.ok) throw new GitHubApiError(response.status);
   return response.json() as Promise<GitHubTag[]>;

@@ -8,13 +8,14 @@
 import { describe, it, expect, vi, beforeEach, afterEach } from 'vitest';

 import { crawl } from './github.crawler.js';
-import { shouldIndexFile, detectLanguage, INDEXABLE_EXTENSIONS, MAX_FILE_SIZE_BYTES } from './file-filter.js';
-import { GitHubRateLimiter, Semaphore, withRetry } from './rate-limiter.js';
-import {
-  AuthenticationError,
-  PermissionError,
-  RepositoryNotFoundError
-} from './types.js';
+import {
+  shouldIndexFile,
+  detectLanguage,
+  INDEXABLE_EXTENSIONS,
+  MAX_FILE_SIZE_BYTES
+} from './file-filter.js';
+import { GitHubRateLimiter, Semaphore, withRetry } from './rate-limiter.js';
+import { AuthenticationError, PermissionError, RepositoryNotFoundError } from './types.js';

 // ---------------------------------------------------------------------------
 // Mock fetch helpers
@@ -112,7 +113,9 @@ describe('shouldIndexFile()', () => {
   });

   it('respects config.excludeFolders prefix', () => {
-    expect(shouldIndexFile('internal/config.ts', 100, { excludeFolders: ['internal/'] })).toBe(false);
+    expect(shouldIndexFile('internal/config.ts', 100, { excludeFolders: ['internal/'] })).toBe(
+      false
+    );
   });

   it('allows files outside of config.excludeFolders', () => {
@@ -169,8 +172,10 @@ describe('detectLanguage()', () => {
   it('detects markdown', () => expect(detectLanguage('README.md')).toBe('markdown'));
   it('detects svelte', () => expect(detectLanguage('App.svelte')).toBe('svelte'));
   it('detects yaml', () => expect(detectLanguage('config.yaml')).toBe('yaml'));
-  it('returns empty string for unknown extension', () => expect(detectLanguage('file.xyz')).toBe(''));
-  it('is case-insensitive for extensions', () => expect(detectLanguage('FILE.TS')).toBe('typescript'));
+  it('returns empty string for unknown extension', () =>
+    expect(detectLanguage('file.xyz')).toBe(''));
+  it('is case-insensitive for extensions', () =>
+    expect(detectLanguage('FILE.TS')).toBe('typescript'));
 });

 // ---------------------------------------------------------------------------
@@ -267,9 +272,9 @@ describe('withRetry()', () => {
   });

   it('throws after exhausting all attempts', async () => {
-    await expect(
-      withRetry(() => Promise.reject(new Error('always fails')), 3)
-    ).rejects.toThrow('always fails');
+    await expect(withRetry(() => Promise.reject(new Error('always fails')), 3)).rejects.toThrow(
+      'always fails'
+    );
   });
 });

@@ -425,14 +430,15 @@ describe('crawl()', () => {
   });

   it('throws AuthenticationError on 401', async () => {
-    stubFetch(() =>
-      new Response('Unauthorized', {
-        status: 401,
-        headers: {
-          'X-RateLimit-Remaining': '0',
-          'X-RateLimit-Reset': String(Math.floor(Date.now() / 1000) + 3600)
-        }
-      })
+    stubFetch(
+      () =>
+        new Response('Unauthorized', {
+          status: 401,
+          headers: {
+            'X-RateLimit-Remaining': '0',
+            'X-RateLimit-Reset': String(Math.floor(Date.now() / 1000) + 3600)
+          }
+        })
     );

     await expect(crawl({ owner: 'owner', repo: 'repo', token: 'bad-token' })).rejects.toThrow(
@@ -441,14 +447,15 @@ describe('crawl()', () => {
   });

   it('throws PermissionError on 403 without rate-limit exhaustion', async () => {
-    stubFetch(() =>
-      new Response('Forbidden', {
-        status: 403,
-        headers: {
-          'X-RateLimit-Remaining': '100',
-          'X-RateLimit-Reset': String(Math.floor(Date.now() / 1000) + 3600)
-        }
-      })
+    stubFetch(
+      () =>
+        new Response('Forbidden', {
+          status: 403,
+          headers: {
+            'X-RateLimit-Remaining': '100',
+            'X-RateLimit-Reset': String(Math.floor(Date.now() / 1000) + 3600)
+          }
+        })
     );

     await expect(crawl({ owner: 'owner', repo: 'repo' })).rejects.toThrow(PermissionError);

@@ -106,9 +106,7 @@ async function throwForStatus(response: Response, rateLimiter: GitHubRateLimiter
       );
     }
     case 404:
-      throw new RepositoryNotFoundError(
-        `Repository not found or not accessible: ${response.url}`
-      );
+      throw new RepositoryNotFoundError(`Repository not found or not accessible: ${response.url}`);
     default: {
       const body = await response.text().catch(() => '');
       throw new Error(`GitHub API error ${response.status}: ${body}`);
@@ -129,18 +127,22 @@ async function fetchRepoInfo(
   token: string | undefined,
   rateLimiter: GitHubRateLimiter
 ): Promise<GitHubRepoResponse> {
-  return withRetry(async () => {
-    await rateLimiter.waitIfNeeded();
+  return withRetry(
+    async () => {
+      await rateLimiter.waitIfNeeded();

     const response = await fetch(`${GITHUB_API}/repos/${owner}/${repo}`, {
       headers: buildHeaders(token)
     });

     rateLimiter.updateFromHeaders(response.headers);
     await throwForStatus(response, rateLimiter);

     return (await response.json()) as GitHubRepoResponse;
-  }, 3, isRetryable);
+    },
+    3,
+    isRetryable
+  );
 }

 /**
@@ -155,21 +157,25 @@ async function fetchTree(
   token: string | undefined,
   rateLimiter: GitHubRateLimiter
 ): Promise<GitHubTreeResponse | null> {
-  return withRetry(async () => {
-    await rateLimiter.waitIfNeeded();
+  return withRetry(
+    async () => {
+      await rateLimiter.waitIfNeeded();

     const url = `${GITHUB_API}/repos/${owner}/${repo}/git/trees/${ref}?recursive=1`;
     const response = await fetch(url, { headers: buildHeaders(token) });

     rateLimiter.updateFromHeaders(response.headers);

     // 422 means the tree is too large for a single recursive call.
     if (response.status === 422) return null;

     await throwForStatus(response, rateLimiter);

     return (await response.json()) as GitHubTreeResponse;
-  }, 3, isRetryable);
+    },
+    3,
+    isRetryable
+  );
 }

 /**
@@ -184,17 +190,21 @@ async function fetchSubTree(
   token: string | undefined,
   rateLimiter: GitHubRateLimiter
 ): Promise<GitHubTreeResponse> {
-  return withRetry(async () => {
-    await rateLimiter.waitIfNeeded();
+  return withRetry(
+    async () => {
+      await rateLimiter.waitIfNeeded();

     const url = `${GITHUB_API}/repos/${owner}/${repo}/git/trees/${treeSha}`;
     const response = await fetch(url, { headers: buildHeaders(token) });

     rateLimiter.updateFromHeaders(response.headers);
     await throwForStatus(response, rateLimiter);

     return (await response.json()) as GitHubTreeResponse;
-  }, 3, isRetryable);
+    },
+    3,
+    isRetryable
+  );
 }

 /**
@@ -208,21 +218,25 @@ async function fetchCommitSha(
   token: string | undefined,
   rateLimiter: GitHubRateLimiter
 ): Promise<string> {
-  return withRetry(async () => {
-    await rateLimiter.waitIfNeeded();
+  return withRetry(
+    async () => {
+      await rateLimiter.waitIfNeeded();

     const url = `${GITHUB_API}/repos/${owner}/${repo}/commits/${ref}`;
     const response = await fetch(url, {
       headers: { ...buildHeaders(token), Accept: 'application/vnd.github.sha' }
     });

     rateLimiter.updateFromHeaders(response.headers);
     await throwForStatus(response, rateLimiter);

     // When Accept is 'application/vnd.github.sha', the response body is the
     // bare SHA string.
     return (await response.text()).trim();
-  }, 3, isRetryable);
+    },
+    3,
+    isRetryable
+  );
 }

 /**
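The hunks above repeatedly re-wrap calls of the shape `withRetry(fn, attempts, shouldRetry)`. A minimal stand-in with that signature is sketched below; the real rate-limiter module may differ in backoff strategy, error handling, and defaults, so treat this only as an illustration of the call shape:

```typescript
// Minimal stand-in for withRetry(fn, attempts, shouldRetry). Assumption: the
// real implementation likely adds backoff; this sketch retries immediately.
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts: number,
  shouldRetry: (err: unknown) => boolean = () => true
): Promise<T> {
  let lastError: unknown = new Error('withRetry: no attempts made');
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (!shouldRetry(err)) break; // non-retryable errors fail fast
    }
  }
  throw lastError;
}
```

Passing the predicate as a trailing argument is what forces Prettier to break the call across lines once the async callback no longer fits on one line, which is exactly the reformatting seen in the diff.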
@@ -347,14 +361,7 @@ async function fetchRepoConfig(

   const content =
     (await downloadRawFile(owner, repo, ref, configItem.path, token)) ??
-    (await downloadViaContentsApi(
-      owner,
-      repo,
-      ref,
-      configItem.path,
-      token,
-      rateLimiter
-    ));
+    (await downloadViaContentsApi(owner, repo, ref, configItem.path, token, rateLimiter));

   if (!content) return undefined;

@@ -435,14 +442,7 @@ export async function crawl(options: CrawlOptions): Promise<CrawlResult> {
       // Prefer raw download (cheaper on rate limit); fall back to API.
       const content =
         (await downloadRawFile(owner, repo, ref!, item.path, token)) ??
-        (await downloadViaContentsApi(
-          owner,
-          repo,
-          ref!,
-          item.path,
-          token,
-          rateLimiter
-        ));
+        (await downloadViaContentsApi(owner, repo, ref!, item.path, token, rateLimiter));

       if (content === null) {
         console.warn(`[GitHubCrawler] Could not download: ${item.path} — skipping.`);

@@ -52,7 +52,9 @@ async function cleanupTempRepo(root: string): Promise<void> {
   let root: string = '';
   const crawler = new LocalCrawler();

-  async function crawlRoot(opts: Partial<LocalCrawlOptions> = {}): Promise<ReturnType<LocalCrawler['crawl']>> {
+  async function crawlRoot(
+    opts: Partial<LocalCrawlOptions> = {}
+  ): Promise<ReturnType<LocalCrawler['crawl']>> {
     return crawler.crawl({ rootPath: root, ...opts });
   }

@@ -141,7 +141,12 @@ export class LocalCrawler {
       });

       // Crawl the worktree and stamp the result with the git-resolved metadata.
-      const result = await this.crawlDirectory(worktreePath, options.config, options.onProgress, ref);
+      const result = await this.crawlDirectory(
+        worktreePath,
+        options.config,
+        options.onProgress,
+        ref
+      );

       return { ...result, commitSha };
     } finally {
File diff suppressed because it is too large
File diff suppressed because it is too large
File diff suppressed because it is too large
@@ -1,27 +1,27 @@
 {
   "version": "7",
   "dialect": "sqlite",
   "entries": [
     {
       "idx": 0,
       "version": "6",
       "when": 1774196053634,
       "tag": "0000_large_master_chief",
       "breakpoints": true
     },
     {
       "idx": 1,
       "version": "6",
       "when": 1774448049161,
       "tag": "0001_quick_nighthawk",
       "breakpoints": true
     },
     {
       "idx": 2,
       "version": "6",
       "when": 1774461897742,
       "tag": "0002_silky_stellaris",
       "breakpoints": true
     }
   ]
 }
@@ -381,7 +381,12 @@ describe('EmbeddingService', () => {
       .all(snippetId, 'local-default');
     expect(rows).toHaveLength(1);

-    const row = rows[0] as { model: string; dimensions: number; embedding: Buffer; profile_id: string };
+    const row = rows[0] as {
+      model: string;
+      dimensions: number;
+      embedding: Buffer;
+      profile_id: string;
+    };
     expect(row.model).toBe('test-model');
     expect(row.dimensions).toBe(4);
     expect(row.profile_id).toBe('local-default');
@@ -494,9 +499,7 @@ describe('createProviderFromConfig', () => {
   });

   it('throws when openai provider is selected without config', () => {
-    expect(() =>
-      createProviderFromConfig({ provider: 'openai' } as EmbeddingConfig)
-    ).toThrow();
+    expect(() => createProviderFromConfig({ provider: 'openai' } as EmbeddingConfig)).toThrow();
   });

   it('defaultEmbeddingConfig returns provider=none', () => {

@@ -41,18 +41,16 @@ export class EmbeddingService {

     const placeholders = snippetIds.map(() => '?').join(',');
     const snippets = this.db
-      .prepare<string[], SnippetRow>(
-        `SELECT id, title, breadcrumb, content FROM snippets WHERE id IN (${placeholders})`
-      )
+      .prepare<
+        string[],
+        SnippetRow
+      >(`SELECT id, title, breadcrumb, content FROM snippets WHERE id IN (${placeholders})`)
       .all(...snippetIds);

     if (snippets.length === 0) return;

     const texts = snippets.map((s) =>
-      [s.title, s.breadcrumb, s.content]
-        .filter(Boolean)
-        .join('\n')
-        .slice(0, TEXT_MAX_CHARS)
+      [s.title, s.breadcrumb, s.content].filter(Boolean).join('\n').slice(0, TEXT_MAX_CHARS)
     );

     const insert = this.db.prepare<[string, string, string, number, Buffer]>(`
@@ -94,9 +92,10 @@ export class EmbeddingService {
   */
  getEmbedding(snippetId: string, profileId: string = 'local-default'): Float32Array | null {
    const row = this.db
-      .prepare<[string, string], { embedding: Buffer; dimensions: number }>(
-        `SELECT embedding, dimensions FROM snippet_embeddings WHERE snippet_id = ? AND profile_id = ?`
-      )
+      .prepare<
+        [string, string],
+        { embedding: Buffer; dimensions: number }
+      >(`SELECT embedding, dimensions FROM snippet_embeddings WHERE snippet_id = ? AND profile_id = ?`)
      .get(snippetId, profileId);

    if (!row) return null;

@@ -12,7 +12,11 @@ import { OpenAIEmbeddingProvider } from './openai.provider.js';
 import { LocalEmbeddingProvider } from './local.provider.js';

 // Re-export registry functions for new callers
-export { createProviderFromProfile, getDefaultLocalProfile, getRegisteredProviderKinds } from './registry.js';
+export {
+  createProviderFromProfile,
+  getDefaultLocalProfile,
+  getRegisteredProviderKinds
+} from './registry.js';

 export interface EmbeddingConfig {
   provider: 'openai' | 'local' | 'none';

@@ -43,7 +43,12 @@ export class ContextResponseMapper {
       lastUpdateDate: repository.lastIndexedAt
         ? repository.lastIndexedAt.toISOString()
         : null,
-      state: repository.state === 'indexed' ? 'finalized' : repository.state === 'error' ? 'error' : 'initial',
+      state:
+        repository.state === 'indexed'
+          ? 'finalized'
+          : repository.state === 'error'
+            ? 'error'
+            : 'initial',
       totalTokens: repository.totalTokens ?? null,
       totalSnippets: repository.totalSnippets ?? null,
       stars: repository.stars ?? null,
@@ -64,14 +69,16 @@ export class ContextResponseMapper {
|
|||||||
const mapped: SnippetJsonDto[] = snippets.map(({ snippet }) => {
|
const mapped: SnippetJsonDto[] = snippets.map(({ snippet }) => {
|
||||||
const origin = metadata?.repository
|
const origin = metadata?.repository
|
||||||
? new SnippetOriginJsonDto({
|
? new SnippetOriginJsonDto({
|
||||||
repositoryId: snippet.repositoryId,
|
repositoryId: snippet.repositoryId,
|
||||||
repositoryTitle: metadata.repository.title,
|
repositoryTitle: metadata.repository.title,
|
||||||
source: metadata.repository.source,
|
source: metadata.repository.source,
|
||||||
sourceUrl: metadata.repository.sourceUrl,
|
sourceUrl: metadata.repository.sourceUrl,
|
||||||
version: snippet.versionId ? metadata.snippetVersions[snippet.versionId] ?? null : null,
|
version: snippet.versionId
|
||||||
versionId: snippet.versionId,
|
? (metadata.snippetVersions[snippet.versionId] ?? null)
|
||||||
isLocal: metadata.localSource
|
: null,
|
||||||
})
|
versionId: snippet.versionId,
|
||||||
|
isLocal: metadata.localSource
|
||||||
|
})
|
||||||
: null;
|
: null;
|
||||||
|
|
||||||
if (snippet.type === 'code') {
|
if (snippet.type === 'code') {
|
||||||
@@ -108,20 +115,20 @@ export class ContextResponseMapper {
|
|||||||
localSource: metadata?.localSource ?? false,
|
localSource: metadata?.localSource ?? false,
|
||||||
repository: metadata?.repository
|
repository: metadata?.repository
|
||||||
? new ContextRepositoryJsonDto({
|
? new ContextRepositoryJsonDto({
|
||||||
id: metadata.repository.id,
|
id: metadata.repository.id,
|
||||||
title: metadata.repository.title,
|
title: metadata.repository.title,
|
||||||
source: metadata.repository.source,
|
source: metadata.repository.source,
|
||||||
sourceUrl: metadata.repository.sourceUrl,
|
sourceUrl: metadata.repository.sourceUrl,
|
||||||
branch: metadata.repository.branch,
|
branch: metadata.repository.branch,
|
||||||
isLocal: metadata.localSource
|
isLocal: metadata.localSource
|
||||||
})
|
})
|
||||||
: null,
|
: null,
|
||||||
version: metadata?.version
|
version: metadata?.version
|
||||||
? new ContextVersionJsonDto({
|
? new ContextVersionJsonDto({
|
||||||
requested: metadata.version.requested,
|
requested: metadata.version.requested,
|
||||||
resolved: metadata.version.resolved,
|
resolved: metadata.version.resolved,
|
||||||
id: metadata.version.id
|
id: metadata.version.id
|
||||||
})
|
})
|
||||||
: null,
|
: null,
|
||||||
resultCount: metadata?.resultCount ?? snippets.length
|
resultCount: metadata?.resultCount ?? snippets.length
|
||||||
});
|
});
|
||||||
|
@@ -12,8 +12,7 @@ export class IndexingJobMapper {
       processedFiles: entity.processed_files,
       error: entity.error,
       startedAt: entity.started_at != null ? new Date(entity.started_at * 1000) : null,
-      completedAt:
-        entity.completed_at != null ? new Date(entity.completed_at * 1000) : null,
+      completedAt: entity.completed_at != null ? new Date(entity.completed_at * 1000) : null,
       createdAt: new Date(entity.created_at * 1000)
     });
   }
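The mapper above multiplies `started_at`, `completed_at`, and `created_at` by 1000 because the SQLite columns hold Unix epoch seconds while the JavaScript `Date` constructor expects milliseconds. The guarded conversion in isolation (the helper name and sample value are illustrative, not TrueRef code):

```typescript
// Convert a nullable epoch-seconds column value into a Date, as the mapper does.
function toDate(epochSeconds: number | null): Date | null {
  return epochSeconds != null ? new Date(epochSeconds * 1000) : null;
}

console.log(toDate(1700000000)?.toISOString()); // 2023-11-14T22:13:20.000Z
console.log(toDate(null)); // null
```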
@@ -1,4 +1,8 @@
-import { LibrarySearchResult, SnippetRepositoryRef, SnippetSearchResult } from '$lib/server/models/search-result.js';
+import {
+  LibrarySearchResult,
+  SnippetRepositoryRef,
+  SnippetSearchResult
+} from '$lib/server/models/search-result.js';
 import { RepositoryMapper } from '$lib/server/mappers/repository.mapper.js';
 import { RepositoryVersionMapper } from '$lib/server/mappers/repository-version.mapper.js';
 import { SnippetMapper } from '$lib/server/mappers/snippet.mapper.js';
@@ -26,9 +30,7 @@ export class SearchResultMapper {
   ): LibrarySearchResult {
     return new LibrarySearchResult({
       repository: RepositoryMapper.fromEntity(repositoryEntity),
-      versions: versionEntities.map((version) =>
-        RepositoryVersionMapper.fromEntity(version)
-      ),
+      versions: versionEntities.map((version) => RepositoryVersionMapper.fromEntity(version)),
       score
     });
   }
@@ -71,7 +71,7 @@ export class SnippetOriginJsonDto {
     this.versionId = props.versionId;
     this.isLocal = props.isLocal;
   }
 }

 export class LibrarySearchJsonResultDto {
   id: string;
@@ -286,7 +286,8 @@ This is the second paragraph that also has enough content to be included here.
   });

   it('skips paragraphs shorter than 20 characters', () => {
-    const content = 'Short.\n\nThis is a much longer paragraph that definitely passes the minimum length filter.';
+    const content =
+      'Short.\n\nThis is a much longer paragraph that definitely passes the minimum length filter.';
     const snippets = parseCodeFile(content, 'notes.txt', 'text');
     expect(snippets.length).toBe(1);
   });
@@ -331,7 +332,10 @@ export function realFunction(): string {

 describe('parseCodeFile — token count', () => {
   it('all snippets have tokenCount within MAX_TOKENS', () => {
-    const lines = Array.from({ length: 300 }, (_, i) => `// comment line number ${i} here\nconst x${i} = ${i};`);
+    const lines = Array.from(
+      { length: 300 },
+      (_, i) => `// comment line number ${i} here\nconst x${i} = ${i};`
+    );
     const content = lines.join('\n');

     const snippets = parseCodeFile(content, 'large.ts', 'typescript');
@@ -26,15 +26,19 @@ import {
  * The regex is tested line-by-line (multiline flag not needed).
  */
 export const BOUNDARY_PATTERNS: Record<string, RegExp> = {
-  typescript: /^(export\s+)?(declare\s+)?(async\s+)?(function|class|interface|type|enum|const|let|var)\s+\w+/,
+  typescript:
+    /^(export\s+)?(declare\s+)?(async\s+)?(function|class|interface|type|enum|const|let|var)\s+\w+/,
   javascript: /^(export\s+)?(async\s+)?(function|class|const|let|var)\s+\w+/,
   python: /^(async\s+)?(def|class)\s+\w+/,
   go: /^(func|type|var|const)\s+\w+/,
   rust: /^(pub(\s*\(crate\))?\s+)?(async\s+)?(fn|impl|struct|enum|trait|type|const|static)\s+\w+/,
   java: /^(\s*(public|private|protected|static|final|abstract|synchronized)\s+)+[\w<>\[\]]+\s+\w+\s*[({]/,
-  csharp: /^(\s*(public|private|protected|internal|static|override|virtual|abstract|sealed)\s+)+[\w<>\[\]]+\s+\w+\s*[({]/,
-  kotlin: /^(\s*(public|private|protected|internal|override|suspend|inline|open|abstract|sealed)\s+)*(fun|class|object|interface|data class|sealed class|enum class)\s+\w+/,
-  swift: /^(\s*(public|private|internal|fileprivate|open|override|static|final|class)\s+)*(func|class|struct|enum|protocol|extension)\s+\w+/,
+  csharp:
+    /^(\s*(public|private|protected|internal|static|override|virtual|abstract|sealed)\s+)+[\w<>\[\]]+\s+\w+\s*[({]/,
+  kotlin:
+    /^(\s*(public|private|protected|internal|override|suspend|inline|open|abstract|sealed)\s+)*(fun|class|object|interface|data class|sealed class|enum class)\s+\w+/,
+  swift:
+    /^(\s*(public|private|internal|fileprivate|open|override|static|final|class)\s+)*(func|class|struct|enum|protocol|extension)\s+\w+/,
   ruby: /^(def|class|module)\s+\w+/
 };

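Each pattern in `BOUNDARY_PATTERNS` is tested against individual lines to decide where one declaration ends and the next begins. A small sketch of how such a table can drive boundary detection; the two regexes are copied from the diff, while `findBoundaries` and the sample input are illustrative, not TrueRef's actual chunking code:

```typescript
const BOUNDARY_PATTERNS: Record<string, RegExp> = {
  typescript:
    /^(export\s+)?(declare\s+)?(async\s+)?(function|class|interface|type|enum|const|let|var)\s+\w+/,
  python: /^(async\s+)?(def|class)\s+\w+/
};

// Return the indices of lines that start a new top-level declaration.
function findBoundaries(source: string, language: string): number[] {
  const pattern = BOUNDARY_PATTERNS[language];
  if (!pattern) return [];
  return source
    .split('\n')
    .map((line, i) => (pattern.test(line) ? i : -1))
    .filter((i) => i !== -1);
}

const ts = ['export function a() {', '  return 1;', '}', 'class B {}'].join('\n');
console.log(findBoundaries(ts, 'typescript')); // [0, 3]
```

Because the regexes anchor at `^` and are applied per line, no multiline flag is needed, which matches the comment in the diff.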
@@ -42,7 +46,10 @@ export const BOUNDARY_PATTERNS: Record<string, RegExp> = {
 // Internal types
 // ---------------------------------------------------------------------------

-type RawSnippet = Omit<NewSnippet, 'id' | 'repositoryId' | 'documentId' | 'versionId' | 'createdAt'>;
+type RawSnippet = Omit<
+  NewSnippet,
+  'id' | 'repositoryId' | 'documentId' | 'versionId' | 'createdAt'
+>;

 // ---------------------------------------------------------------------------
 // Helpers
@@ -161,7 +168,10 @@ function parseHtmlLikeFile(content: string, filePath: string, language: string):

   while ((match = scriptPattern.exec(content)) !== null) {
     // Strip the outer tags, keep just the code
-    const inner = match[0].replace(/^<script[^>]*>/, '').replace(/<\/script>$/, '').trim();
+    const inner = match[0]
+      .replace(/^<script[^>]*>/, '')
+      .replace(/<\/script>$/, '')
+      .trim();
     if (inner.length >= MIN_CONTENT_LENGTH) {
       scriptBlocks.push(inner);
     }
@@ -48,6 +48,13 @@ export function parseFile(file: CrawledFile, options: ParseOptions): NewSnippet[

 // Re-export helpers for consumers that need them individually
 export { detectLanguage } from './language.js';
-export { estimateTokens, chunkText, chunkLines, MAX_TOKENS, OVERLAP_TOKENS, MIN_CONTENT_LENGTH } from './chunker.js';
+export {
+  estimateTokens,
+  chunkText,
+  chunkLines,
+  MAX_TOKENS,
+  OVERLAP_TOKENS,
+  MIN_CONTENT_LENGTH
+} from './chunker.js';
 export { parseMarkdown } from './markdown.parser.js';
 export { parseCodeFile, BOUNDARY_PATTERNS } from './code.parser.js';
@@ -99,7 +99,10 @@ describe('parseMarkdown — section splitting', () => {

 describe('parseMarkdown — code block extraction', () => {
   it('extracts a fenced code block as a code snippet', () => {
-    const codeBlock = fence('typescript', 'function hello(name: string): string {\n return `Hello, ${name}!`;\n}');
+    const codeBlock = fence(
+      'typescript',
+      'function hello(name: string): string {\n return `Hello, ${name}!`;\n}'
+    );
     const source = [
       '# Example',
       '',
@@ -232,7 +235,10 @@ describe('parseMarkdown — large content chunking', () => {
 describe('parseMarkdown — real-world sample', () => {
   it('correctly parses a realistic README excerpt', () => {
     const bashInstall = fence('bash', 'npm install my-library');
-    const tsUsage = fence('typescript', "import { doTheThing } from 'my-library';\n\ndoTheThing({ verbose: true });");
+    const tsUsage = fence(
+      'typescript',
+      "import { doTheThing } from 'my-library';\n\ndoTheThing({ verbose: true });"
+    );

     const source = [
       '# My Library',
@@ -7,7 +7,13 @@

 import { basename } from 'node:path';
 import type { NewSnippet } from '$lib/server/db/schema.js';
-import { estimateTokens, chunkText, MAX_TOKENS, OVERLAP_TOKENS, MIN_CONTENT_LENGTH } from './chunker.js';
+import {
+  estimateTokens,
+  chunkText,
+  MAX_TOKENS,
+  OVERLAP_TOKENS,
+  MIN_CONTENT_LENGTH
+} from './chunker.js';

 // ---------------------------------------------------------------------------
 // Internal types
@@ -121,7 +127,10 @@ function splitIntoSections(source: string): MarkdownSection[] {
 // Public parser
 // ---------------------------------------------------------------------------

-type RawSnippet = Omit<NewSnippet, 'id' | 'repositoryId' | 'documentId' | 'versionId' | 'createdAt'>;
+type RawSnippet = Omit<
+  NewSnippet,
+  'id' | 'repositoryId' | 'documentId' | 'versionId' | 'createdAt'
+>;

 /**
  * Parse a Markdown/MDX file into raw snippets (before IDs and DB fields are
@@ -86,16 +86,16 @@ describe('computeDiff', () => {

   it('handles a mixed scenario: added, modified, deleted, and unchanged', () => {
     const crawledFiles = [
       makeCrawledFile('unchanged.md', 'sha-same'), // unchanged
       makeCrawledFile('modified.md', 'sha-new'), // modified (different sha)
       makeCrawledFile('added.md', 'sha-added') // added (not in DB)
       // 'deleted.md' is absent from crawl → deleted
     ];

     const existingDocs = [
       makeDocument('unchanged.md', 'sha-same'), // unchanged
       makeDocument('modified.md', 'sha-old'), // modified
       makeDocument('deleted.md', 'sha-deleted') // deleted
     ];

     const diff = computeDiff(crawledFiles, existingDocs);
@@ -114,9 +114,9 @@ describe('computeDiff', () => {
     ];

     const existingDocs = [
       makeDocument('a.md', 'sha-a'), // unchanged
       makeDocument('b.md', 'sha-b-old'), // modified
       makeDocument('d.md', 'sha-d') // deleted
       // 'c.md' is not in DB → added
     ];

@@ -22,10 +22,7 @@ function createTestDb(): Database.Database {
   client.pragma('foreign_keys = ON');

   const migrationsFolder = join(import.meta.dirname, '../db/migrations');
-  const migrationSql = readFileSync(
-    join(migrationsFolder, '0000_large_master_chief.sql'),
-    'utf-8'
-  );
+  const migrationSql = readFileSync(join(migrationsFolder, '0000_large_master_chief.sql'), 'utf-8');

   const statements = migrationSql
     .split('--> statement-breakpoint')
@@ -45,10 +42,7 @@ function createTestDb(): Database.Database {

 const now = Math.floor(Date.now() / 1000);

-function insertRepo(
-  db: Database.Database,
-  overrides: Partial<Record<string, unknown>> = {}
-): void {
+function insertRepo(db: Database.Database, overrides: Partial<Record<string, unknown>> = {}): void {
   db.prepare(
     `INSERT INTO repositories
      (id, title, source, source_url, branch, state,
@@ -62,7 +56,15 @@ function insertRepo(
     overrides.source_url ?? '/tmp/test-repo',
     overrides.branch ?? 'main',
     overrides.state ?? 'pending',
-    0, 0, 0, 0, null, null, null, now, now
+    0,
+    0,
+    0,
+    0,
+    null,
+    null,
+    null,
+    now,
+    now
   );
 }

@@ -108,9 +110,10 @@ describe('recoverStaleJobs', () => {
     insertJob(db, { status: 'running' });
     recoverStaleJobs(db);

-    const row = db
-      .prepare(`SELECT status, error FROM indexing_jobs LIMIT 1`)
-      .get() as { status: string; error: string };
+    const row = db.prepare(`SELECT status, error FROM indexing_jobs LIMIT 1`).get() as {
+      status: string;
+      error: string;
+    };
     expect(row.status).toBe('failed');
     expect(row.error).toMatch(/restarted/i);
   });
@@ -119,9 +122,9 @@ describe('recoverStaleJobs', () => {
     db.prepare(`UPDATE repositories SET state = 'indexing' WHERE id = '/test/repo'`).run();
     recoverStaleJobs(db);

-    const row = db
-      .prepare(`SELECT state FROM repositories WHERE id = '/test/repo'`)
-      .get() as { state: string };
+    const row = db.prepare(`SELECT state FROM repositories WHERE id = '/test/repo'`).get() as {
+      state: string;
+    };
     expect(row.state).toBe('error');
   });

@@ -164,9 +167,7 @@ describe('JobQueue', () => {
     const job2 = queue.enqueue('/test/repo');
     expect(job1.id).toBe(job2.id);

-    const count = (
-      db.prepare(`SELECT COUNT(*) as n FROM indexing_jobs`).get() as { n: number }
-    ).n;
+    const count = (db.prepare(`SELECT COUNT(*) as n FROM indexing_jobs`).get() as { n: number }).n;
     expect(count).toBe(1);
   });

@@ -255,19 +256,19 @@ describe('IndexingPipeline', () => {
       })
     };

-    return new IndexingPipeline(
-      db,
-      mockGithubCrawl as never,
-      mockLocalCrawler as never,
-      null
-    );
+    return new IndexingPipeline(db, mockGithubCrawl as never, mockLocalCrawler as never, null);
   }

   function makeJob(repositoryId = '/test/repo') {
     const jobId = insertJob(db, { repository_id: repositoryId, status: 'queued' });
-    return db
-      .prepare(`SELECT * FROM indexing_jobs WHERE id = ?`)
-      .get(jobId) as { id: string; repositoryId?: string; repository_id?: string; status: string; versionId?: string; version_id?: string };
+    return db.prepare(`SELECT * FROM indexing_jobs WHERE id = ?`).get(jobId) as {
+      id: string;
+      repositoryId?: string;
+      repository_id?: string;
+      status: string;
+      versionId?: string;
+      version_id?: string;
+    };
   }

   it('marks job as done when there are no files to index', async () => {
@@ -289,9 +290,9 @@ describe('IndexingPipeline', () => {

     await pipeline.run(job as never);

-    const updated = db
-      .prepare(`SELECT status FROM indexing_jobs WHERE id = ?`)
-      .get(job.id) as { status: string };
+    const updated = db.prepare(`SELECT status FROM indexing_jobs WHERE id = ?`).get(job.id) as {
+      status: string;
+    };
     // The job should end in 'done' — the running→done transition is covered
     // by the pipeline's internal updateJob calls.
     expect(updated.status).toBe('done');
@@ -363,27 +364,24 @@ describe('IndexingPipeline', () => {
     const job1 = makeJob();
     await pipeline.run(job1 as never);

-    const firstDocCount = (
-      db.prepare(`SELECT COUNT(*) as n FROM documents`).get() as { n: number }
-    ).n;
-    const firstSnippetIds = (
-      db.prepare(`SELECT id FROM snippets`).all() as { id: string }[]
-    ).map((r) => r.id);
+    const firstDocCount = (db.prepare(`SELECT COUNT(*) as n FROM documents`).get() as { n: number })
+      .n;
+    const firstSnippetIds = (db.prepare(`SELECT id FROM snippets`).all() as { id: string }[]).map(
+      (r) => r.id
+    );

     // Second run with identical files.
     const job2Id = insertJob(db, { repository_id: '/test/repo', status: 'queued' });
-    const job2 = db
-      .prepare(`SELECT * FROM indexing_jobs WHERE id = ?`)
-      .get(job2Id) as never;
+    const job2 = db.prepare(`SELECT * FROM indexing_jobs WHERE id = ?`).get(job2Id) as never;

     await pipeline.run(job2);

     const secondDocCount = (
       db.prepare(`SELECT COUNT(*) as n FROM documents`).get() as { n: number }
     ).n;
-    const secondSnippetIds = (
-      db.prepare(`SELECT id FROM snippets`).all() as { id: string }[]
-    ).map((r) => r.id);
+    const secondSnippetIds = (db.prepare(`SELECT id FROM snippets`).all() as { id: string }[]).map(
+      (r) => r.id
+    );

     // Document count stays the same and snippet IDs are unchanged.
     expect(secondDocCount).toBe(firstDocCount);
@@ -395,7 +393,8 @@ describe('IndexingPipeline', () => {
       files: [
         {
           path: 'README.md',
-          content: '# Original\n\nThis is the original version of the documentation with sufficient content.',
+          content:
+            '# Original\n\nThis is the original version of the documentation with sufficient content.',
           sha: 'sha-v1',
           language: 'markdown'
         }
@@ -415,7 +414,8 @@ describe('IndexingPipeline', () => {
       files: [
         {
           path: 'README.md',
-          content: '# Updated\n\nThis is a completely different version of the documentation with new content.',
+          content:
+            '# Updated\n\nThis is a completely different version of the documentation with new content.',
           sha: 'sha-v2',
           language: 'markdown'
         }
@@ -423,14 +423,11 @@ describe('IndexingPipeline', () => {
       totalFiles: 1
     });
     const job2Id = insertJob(db, { repository_id: '/test/repo', status: 'queued' });
-    const job2 = db
-      .prepare(`SELECT * FROM indexing_jobs WHERE id = ?`)
-      .get(job2Id) as never;
+    const job2 = db.prepare(`SELECT * FROM indexing_jobs WHERE id = ?`).get(job2Id) as never;
     await pipeline2.run(job2);

-    const finalDocCount = (
-      db.prepare(`SELECT COUNT(*) as n FROM documents`).get() as { n: number }
-    ).n;
+    const finalDocCount = (db.prepare(`SELECT COUNT(*) as n FROM documents`).get() as { n: number })
+      .n;
     // Only one document should exist (the updated one).
     expect(finalDocCount).toBe(1);

@@ -452,9 +449,9 @@ describe('IndexingPipeline', () => {
     const job = makeJob();
     await pipeline.run(job as never);

-    const updated = db
-      .prepare(`SELECT progress FROM indexing_jobs WHERE id = ?`)
-      .get(job.id) as { progress: number };
+    const updated = db.prepare(`SELECT progress FROM indexing_jobs WHERE id = ?`).get(job.id) as {
+      progress: number;
+    };
     expect(updated.progress).toBe(100);
   });

@@ -467,12 +464,7 @@ describe('IndexingPipeline', () => {
       commitSha: 'abc'
     });

-    const pipeline = new IndexingPipeline(
-      db,
-      vi.fn() as never,
-      { crawl } as never,
-      null
-    );
+    const pipeline = new IndexingPipeline(db, vi.fn() as never, { crawl } as never, null);

     const job = makeJob();
     await pipeline.run(job as never);
@@ -511,7 +503,10 @@ describe('IndexingPipeline', () => {
     await pipeline1.run(job1 as never);

     const afterFirstRun = {
-      docs: db.prepare(`SELECT file_path, checksum FROM documents ORDER BY file_path`).all() as { file_path: string; checksum: string }[],
+      docs: db.prepare(`SELECT file_path, checksum FROM documents ORDER BY file_path`).all() as {
+        file_path: string;
+        checksum: string;
+      }[],
       snippetCount: (db.prepare(`SELECT COUNT(*) as n FROM snippets`).get() as { n: number }).n
     };
     expect(afterFirstRun.docs).toHaveLength(3);
@@ -250,7 +250,10 @@ export class IndexingPipeline {
   private async crawl(
     repo: Repository,
     job: IndexingJob
-  ): Promise<{ files: Array<{ path: string; content: string; sha: string; size: number; language: string }>; totalFiles: number }> {
+  ): Promise<{
+    files: Array<{ path: string; content: string; sha: string; size: number; language: string }>;
+    totalFiles: number;
+  }> {
     if (repo.source === 'github') {
       // Parse owner/repo from the canonical ID: "/owner/repo"
       const parts = repo.id.replace(/^\//, '').split('/');
|||||||
@@ -133,9 +133,7 @@ export class JobQueue {
|
|||||||
|
|
||||||
// Check whether another job was queued while this one ran.
|
// Check whether another job was queued while this one ran.
|
||||||
const next = this.db
|
const next = this.db
|
||||||
.prepare<[], { id: string }>(
|
.prepare<[], { id: string }>(`SELECT id FROM indexing_jobs WHERE status = 'queued' LIMIT 1`)
|
||||||
`SELECT id FROM indexing_jobs WHERE status = 'queued' LIMIT 1`
|
|
||||||
)
|
|
||||||
.get();
|
.get();
|
||||||
if (next) {
|
if (next) {
|
||||||
setImmediate(() => this.processNext());
|
setImmediate(() => this.processNext());
|
||||||
@@ -147,9 +145,7 @@ export class JobQueue {
|
|||||||
* Retrieve a single job by ID.
|
* Retrieve a single job by ID.
|
||||||
*/
|
*/
|
||||||
getJob(id: string): IndexingJob | null {
|
getJob(id: string): IndexingJob | null {
|
||||||
const raw = this.db
|
const raw = this.db.prepare<[string], IndexingJobEntity>(`${JOB_SELECT} WHERE id = ?`).get(id);
|
||||||
.prepare<[string], IndexingJobEntity>(`${JOB_SELECT} WHERE id = ?`)
|
|
||||||
.get(id);
|
|
||||||
return raw ? IndexingJobMapper.fromEntity(new IndexingJobEntity(raw)) : null;
|
return raw ? IndexingJobMapper.fromEntity(new IndexingJobEntity(raw)) : null;
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -178,9 +174,9 @@ export class JobQueue {
|
|||||||
const sql = `${JOB_SELECT} ${where} ORDER BY created_at DESC LIMIT ?`;
|
const sql = `${JOB_SELECT} ${where} ORDER BY created_at DESC LIMIT ?`;
|
||||||
params.push(limit);
|
params.push(limit);
|
||||||
|
|
||||||
return (this.db.prepare<unknown[], IndexingJobEntity>(sql).all(...params) as IndexingJobEntity[]).map(
|
return (
|
||||||
(row) => IndexingJobMapper.fromEntity(new IndexingJobEntity(row))
|
this.db.prepare<unknown[], IndexingJobEntity>(sql).all(...params) as IndexingJobEntity[]
|
||||||
);
|
).map((row) => IndexingJobMapper.fromEntity(new IndexingJobEntity(row)));
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
@@ -228,9 +224,7 @@ export class JobQueue {
       return false;
     }
 
-    this.db
-      .prepare(`UPDATE indexing_jobs SET status = 'paused' WHERE id = ?`)
-      .run(id);
+    this.db.prepare(`UPDATE indexing_jobs SET status = 'paused' WHERE id = ?`).run(id);
 
     return true;
   }
@@ -249,9 +243,7 @@ export class JobQueue {
       return false;
     }
 
-    this.db
-      .prepare(`UPDATE indexing_jobs SET status = 'queued' WHERE id = ?`)
-      .run(id);
+    this.db.prepare(`UPDATE indexing_jobs SET status = 'queued' WHERE id = ?`).run(id);
 
     // Trigger queue processing in case the queue was idle
     this.drainQueued();
@@ -27,7 +27,11 @@ function createTestDb(): Database.Database {
   const migrationsFolder = join(import.meta.dirname, '../db/migrations');
 
   // Run all migrations in order
-  const migrations = ['0000_large_master_chief.sql', '0001_quick_nighthawk.sql', '0002_silky_stellaris.sql'];
+  const migrations = [
+    '0000_large_master_chief.sql',
+    '0001_quick_nighthawk.sql',
+    '0002_silky_stellaris.sql'
+  ];
   for (const migrationFile of migrations) {
     const migrationSql = readFileSync(join(migrationsFolder, migrationFile), 'utf-8');
     const statements = migrationSql
@@ -123,9 +127,7 @@ function seedEmbedding(
 // Mock EmbeddingProvider
 // ---------------------------------------------------------------------------
 
-function makeMockProvider(
-  returnValues: number[][] = [[1, 0, 0, 0]]
-): EmbeddingProvider {
+function makeMockProvider(returnValues: number[][] = [[1, 0, 0, 0]]): EmbeddingProvider {
   return {
     name: 'mock',
     dimensions: returnValues[0]?.length ?? 4,
@@ -254,9 +256,18 @@ describe('reciprocalRankFusion', () => {
   });
 
   it('handles three lists correctly', () => {
-    const r1 = [{ id: 'a', score: 1 }, { id: 'b', score: 0 }];
-    const r2 = [{ id: 'b', score: 1 }, { id: 'c', score: 0 }];
-    const r3 = [{ id: 'a', score: 1 }, { id: 'c', score: 0 }];
+    const r1 = [
+      { id: 'a', score: 1 },
+      { id: 'b', score: 0 }
+    ];
+    const r2 = [
+      { id: 'b', score: 1 },
+      { id: 'c', score: 0 }
+    ];
+    const r3 = [
+      { id: 'a', score: 1 },
+      { id: 'c', score: 0 }
+    ];
     const result = reciprocalRankFusion(r1, r2, r3);
     // 'a' appears first in r1 and r3 → higher combined score than 'b' or 'c'.
     expect(result[0].id).toBe('a');
@@ -103,10 +103,7 @@ export class HybridSearchService {
    * @param options - Search parameters including repositoryId and alpha blend.
    * @returns Ranked array of SnippetSearchResult, deduplicated by snippet ID.
    */
-  async search(
-    query: string,
-    options: HybridSearchOptions
-  ): Promise<SnippetSearchResult[]> {
+  async search(query: string, options: HybridSearchOptions): Promise<SnippetSearchResult[]> {
     const limit = options.limit ?? 20;
     const mode = options.searchMode ?? 'auto';
 
@@ -20,7 +20,7 @@
  */
 export function preprocessQuery(raw: string): string {
   // 1. Trim and collapse whitespace.
-  let q = raw.trim().replace(/\s+/g, ' ');
+  const q = raw.trim().replace(/\s+/g, ' ');
 
   if (!q) return q;
 
@@ -91,7 +91,10 @@ export function preprocessQuery(raw: string): string {
   }
 
   // Remove trailing operators
-  while (finalTokens.length > 0 && ['AND', 'OR', 'NOT'].includes(finalTokens[finalTokens.length - 1])) {
+  while (
+    finalTokens.length > 0 &&
+    ['AND', 'OR', 'NOT'].includes(finalTokens[finalTokens.length - 1])
+  ) {
     finalTokens.pop();
   }
 
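The `preprocessQuery` loop reflowed in the hunk above guards the FTS engine against queries ending in a dangling boolean operator. Extracted as a hypothetical standalone helper (the name `stripTrailingOperators` is illustrative, not from the source), the logic is:

```typescript
// Pop trailing AND/OR/NOT tokens so the FTS engine never receives
// a query that ends in a dangling boolean operator.
function stripTrailingOperators(tokens: string[]): string[] {
  const finalTokens = [...tokens];
  while (
    finalTokens.length > 0 &&
    ['AND', 'OR', 'NOT'].includes(finalTokens[finalTokens.length - 1])
  ) {
    finalTokens.pop();
  }
  return finalTokens;
}
```

Note that operators are stripped repeatedly, so a query like `react AND OR` loses both trailing tokens, not just the last one.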
@@ -32,9 +32,7 @@ export interface FusedItem {
  * descending relevance (index 0 = most relevant).
  * @returns Fused array sorted by descending rrfScore, deduplicated by id.
 */
-export function reciprocalRankFusion(
-  ...rankings: Array<Array<RankedItem>>
-): Array<FusedItem> {
+export function reciprocalRankFusion(...rankings: Array<Array<RankedItem>>): Array<FusedItem> {
   const K = 60; // Standard RRF constant.
   const scores = new Map<string, number>();
 
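For context on the signature touched above: reciprocal rank fusion scores each item as the sum of `1 / (K + rank)` over the lists it appears in, with `K = 60` as in the source. A minimal sketch, assuming a 1-based rank offset and insertion-order tie-breaking (neither is visible in the diff):

```typescript
interface RankedItem {
  id: string;
  score: number;
}

interface FusedItem {
  id: string;
  rrfScore: number;
}

// Each list contributes 1 / (K + rank) per item it contains; items that
// appear in more lists, at better ranks, accumulate a higher total.
function reciprocalRankFusion(...rankings: Array<Array<RankedItem>>): Array<FusedItem> {
  const K = 60; // Standard RRF constant.
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((item, rank) => {
      scores.set(item.id, (scores.get(item.id) ?? 0) + 1 / (K + rank + 1));
    });
  }
  return [...scores.entries()]
    .map(([id, rrfScore]) => ({ id, rrfScore }))
    .sort((a, b) => b.rrfScore - a.rrfScore);
}
```

With the three two-item lists from the test hunk, `'a'` (rank 0 in two lists) beats `'b'` and `'c'`, which matches the expectation asserted there.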
@@ -674,7 +674,9 @@ describe('formatLibraryResults', () => {
       id: '/facebook/react/v18',
       repositoryId: '/facebook/react',
       tag: 'v18',
-      title: 'React 18', commitHash: null, state: 'indexed',
+      title: 'React 18',
+      commitHash: null,
+      state: 'indexed',
       totalSnippets: 1000,
       indexedAt: null,
       createdAt: now
@@ -731,7 +733,9 @@ describe('formatLibraryResults', () => {
 describe('formatSnippetResults', () => {
   const now = new Date();
 
-  function makeSnippetResult(overrides: Partial<Parameters<typeof formatSnippetResults>[0][number]> = {}): Parameters<typeof formatSnippetResults>[0][number] {
+  function makeSnippetResult(
+    overrides: Partial<Parameters<typeof formatSnippetResults>[0][number]> = {}
+  ): Parameters<typeof formatSnippetResults>[0][number] {
     return {
       snippet: {
         id: crypto.randomUUID(),
@@ -87,10 +87,7 @@ export class SearchService {
     if (!processedQuery) return [];
 
     // Build the WHERE clause dynamically based on optional filters.
-    const conditions: string[] = [
-      'snippets_fts MATCH ?',
-      's.repository_id = ?'
-    ];
+    const conditions: string[] = ['snippets_fts MATCH ?', 's.repository_id = ?'];
     const params: unknown[] = [processedQuery, repositoryId];
 
     if (versionId !== undefined) {
@@ -132,10 +129,14 @@ export class SearchService {
     const rows = this.db.prepare(sql).all(...params) as RawSnippetRow[];
 
     return rows.map((row) =>
-      SearchResultMapper.snippetFromEntity(new SnippetEntity(row), {
-        id: row.repo_id,
-        title: row.repo_title
-      }, row.score)
+      SearchResultMapper.snippetFromEntity(
+        new SnippetEntity(row),
+        {
+          id: row.repo_id,
+          title: row.repo_title
+        },
+        row.score
+      )
     );
   }
 
@@ -188,7 +189,11 @@ export class SearchService {
 
     return rows.map((row) => {
       const compositeScore =
-        row.exact_match + row.prefix_match + row.desc_match + row.snippet_score + row.trust_component;
+        row.exact_match +
+        row.prefix_match +
+        row.desc_match +
+        row.snippet_score +
+        row.trust_component;
       return SearchResultMapper.libraryFromEntity(
         new RepositoryEntity(row),
         this.getVersionEntities(row.id),
@@ -203,9 +208,7 @@ export class SearchService {
 
   private getVersionEntities(repositoryId: string): RepositoryVersionEntity[] {
     return this.db
-      .prepare(
-        `SELECT * FROM repository_versions WHERE repository_id = ? ORDER BY created_at DESC`
-      )
+      .prepare(`SELECT * FROM repository_versions WHERE repository_id = ? ORDER BY created_at DESC`)
       .all(repositoryId) as RawVersionRow[];
   }
 }
@@ -46,9 +46,7 @@ interface RawEmbeddingRow {
  */
 export function cosineSimilarity(a: Float32Array, b: Float32Array): number {
   if (a.length !== b.length) {
-    throw new Error(
-      `Embedding dimension mismatch: ${a.length} vs ${b.length}`
-    );
+    throw new Error(`Embedding dimension mismatch: ${a.length} vs ${b.length}`);
   }
 
   let dot = 0;
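The `cosineSimilarity` signature and dimension check above imply the usual dot-product-over-norms formula. A self-contained sketch under that assumption (the zero-vector guard is added here and may not match the source body):

```typescript
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) {
    throw new Error(`Embedding dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  // Guard against zero vectors rather than returning NaN.
  return denom === 0 ? 0 : dot / denom;
}
```

Identical vectors score 1, orthogonal vectors 0, which is what makes the value usable as a semantic-retrieval ranking signal.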
@@ -27,10 +27,7 @@ function createTestDb(): Database.Database {
   client.pragma('foreign_keys = ON');
 
   const migrationsFolder = join(import.meta.dirname, '../db/migrations');
-  const migrationSql = readFileSync(
-    join(migrationsFolder, '0000_large_master_chief.sql'),
-    'utf-8'
-  );
+  const migrationSql = readFileSync(join(migrationsFolder, '0000_large_master_chief.sql'), 'utf-8');
 
   // Drizzle migration files use `--> statement-breakpoint` as separator.
   const statements = migrationSql
@@ -261,9 +258,7 @@ describe('RepositoryService.add()', () => {
   });
 
   it('throws InvalidInputError when sourceUrl is empty', () => {
-    expect(() =>
-      service.add({ source: 'github', sourceUrl: '' })
-    ).toThrow(InvalidInputError);
+    expect(() => service.add({ source: 'github', sourceUrl: '' })).toThrow(InvalidInputError);
   });
 
   it('stores description and branch when provided', () => {
@@ -321,9 +316,7 @@ describe('RepositoryService.update()', () => {
   });
 
   it('throws NotFoundError for a non-existent repository', () => {
-    expect(() =>
-      service.update('/not/found', { title: 'New Title' })
-    ).toThrow(NotFoundError);
+    expect(() => service.update('/not/found', { title: 'New Title' })).toThrow(NotFoundError);
   });
 });
 
@@ -74,9 +74,7 @@ export class RepositoryService {
         .get(state) as { n: number };
       return row.n;
     }
-    const row = this.db
-      .prepare(`SELECT COUNT(*) as n FROM repositories`)
-      .get() as { n: number };
+    const row = this.db.prepare(`SELECT COUNT(*) as n FROM repositories`).get() as { n: number };
     return row.n;
   }
 
@@ -115,13 +113,13 @@ export class RepositoryService {
       }
       // Default title from owner/repo
       const parts = id.split('/').filter(Boolean);
-      title = input.title ?? (parts[1] ?? id);
+      title = input.title ?? parts[1] ?? id;
     } else {
       // local
       const existing = this.list({ limit: 9999 }).map((r) => r.id);
       id = resolveLocalId(input.sourceUrl, existing);
       const parts = input.sourceUrl.split('/');
-      title = input.title ?? (parts.at(-1) ?? 'local-repo');
+      title = input.title ?? parts.at(-1) ?? 'local-repo';
     }
 
     // Check for collision
@@ -25,14 +25,8 @@ function createTestDb(): Database.Database {
   const migrationsFolder = join(import.meta.dirname, '../db/migrations');
 
   // Apply all migration files in order
-  const migration0 = readFileSync(
-    join(migrationsFolder, '0000_large_master_chief.sql'),
-    'utf-8'
-  );
-  const migration1 = readFileSync(
-    join(migrationsFolder, '0001_quick_nighthawk.sql'),
-    'utf-8'
-  );
+  const migration0 = readFileSync(join(migrationsFolder, '0000_large_master_chief.sql'), 'utf-8');
+  const migration1 = readFileSync(join(migrationsFolder, '0001_quick_nighthawk.sql'), 'utf-8');
 
   // Apply first migration
   const statements0 = migration0
@@ -201,9 +195,7 @@ describe('VersionService.remove()', () => {
 
     versionService.remove('/facebook/react', 'v18.3.0');
 
-    const doc = client
-      .prepare(`SELECT id FROM documents WHERE id = ?`)
-      .get(docId);
+    const doc = client.prepare(`SELECT id FROM documents WHERE id = ?`).get(docId);
     expect(doc).toBeUndefined();
   });
 });
@@ -40,12 +40,7 @@ export class VersionService {
    * @throws NotFoundError when the parent repository does not exist
    * @throws AlreadyExistsError when the tag is already registered
    */
-  add(
-    repositoryId: string,
-    tag: string,
-    title?: string,
-    commitHash?: string
-  ): RepositoryVersion {
+  add(repositoryId: string, tag: string, title?: string, commitHash?: string): RepositoryVersion {
     // Verify parent repository exists.
     const repo = this.db
       .prepare(`SELECT id, source, source_url FROM repositories WHERE id = ?`)
@@ -115,9 +110,7 @@ export class VersionService {
    */
   getByTag(repositoryId: string, tag: string): RepositoryVersion | null {
     const row = this.db
-      .prepare(
-        `SELECT * FROM repository_versions WHERE repository_id = ? AND tag = ?`
-      )
+      .prepare(`SELECT * FROM repository_versions WHERE repository_id = ? AND tag = ?`)
       .get(repositoryId, tag) as RepositoryVersionEntity | undefined;
     return row ? RepositoryVersionMapper.fromEntity(new RepositoryVersionEntity(row)) : null;
   }
@@ -137,9 +130,9 @@ export class VersionService {
     previousVersions: { tag: string; title: string; commitHash?: string }[]
   ): RepositoryVersion[] {
     // Verify parent repository exists.
-    const repo = this.db
-      .prepare(`SELECT id FROM repositories WHERE id = ?`)
-      .get(repositoryId) as { id: string } | undefined;
+    const repo = this.db.prepare(`SELECT id FROM repositories WHERE id = ?`).get(repositoryId) as
+      | { id: string }
+      | undefined;
 
     if (!repo) {
       throw new NotFoundError(`Repository ${repositoryId} not found`);
@@ -65,13 +65,10 @@ export function discoverVersionTags(options: DiscoverTagsOptions): string[] {
 
   try {
     // List all tags, sorted by commit date (newest first)
-    const output = execSync(
-      `git -C "${repoPath}" tag -l --sort=-creatordate`,
-      {
-        encoding: 'utf-8',
-        stdio: ['ignore', 'pipe', 'pipe']
-      }
-    ).trim();
+    const output = execSync(`git -C "${repoPath}" tag -l --sort=-creatordate`, {
+      encoding: 'utf-8',
+      stdio: ['ignore', 'pipe', 'pipe']
+    }).trim();
 
     if (!output) return [];
 
@@ -33,10 +33,11 @@ export function resolveLocalId(path: string, existingIds: string[]): string {
  * Slugify a string to be safe for use in IDs.
  */
 function slugify(str: string): string {
-  return str
-    .toLowerCase()
-    .replace(/[^a-z0-9-_]/g, '-')
-    .replace(/-+/g, '-')
-    .replace(/^-|-$/g, '')
-    || 'repo';
+  return (
+    str
+      .toLowerCase()
+      .replace(/[^a-z0-9-_]/g, '-')
+      .replace(/-+/g, '-')
+      .replace(/^-|-$/g, '') || 'repo'
+  );
 }
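The `slugify` reflow above is behavior-preserving: only the parenthesization changes. Extracted as a runnable unit, the pipeline reads:

```typescript
// Lowercase, replace disallowed characters with '-', collapse runs of '-',
// trim leading/trailing '-', and fall back to 'repo' for empty results.
function slugify(str: string): string {
  return (
    str
      .toLowerCase()
      .replace(/[^a-z0-9-_]/g, '-')
      .replace(/-+/g, '-')
      .replace(/^-|-$/g, '') || 'repo'
  );
}
```

The `|| 'repo'` fallback is why the wrapping parentheses matter for the formatter: the `||` must bind the whole chained expression, not just the last `.replace` call.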
@@ -59,13 +59,10 @@ export function errorResponse(
   status: number,
   details?: Record<string, unknown>
 ): Response {
-  return new Response(
-    JSON.stringify({ error, code, ...(details ? { details } : {}) }),
-    {
-      status,
-      headers: { 'Content-Type': 'application/json' }
-    }
-  );
+  return new Response(JSON.stringify({ error, code, ...(details ? { details } : {}) }), {
+    status,
+    headers: { 'Content-Type': 'application/json' }
+  });
 }
 
 /**
@@ -20,15 +20,9 @@ import { createServer } from 'node:http';
 import { Server } from '@modelcontextprotocol/sdk/server/index.js';
 import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
 import { StreamableHTTPServerTransport } from '@modelcontextprotocol/sdk/server/streamableHttp.js';
-import {
-  CallToolRequestSchema,
-  ListToolsRequestSchema
-} from '@modelcontextprotocol/sdk/types.js';
+import { CallToolRequestSchema, ListToolsRequestSchema } from '@modelcontextprotocol/sdk/types.js';
 
-import {
-  RESOLVE_LIBRARY_ID_TOOL,
-  handleResolveLibraryId
-} from './tools/resolve-library-id.js';
+import { RESOLVE_LIBRARY_ID_TOOL, handleResolveLibraryId } from './tools/resolve-library-id.js';
 import { QUERY_DOCS_TOOL, handleQueryDocs } from './tools/query-docs.js';
 
 // ---------------------------------------------------------------------------
@@ -9,7 +9,9 @@ import { z } from 'zod';
 import { searchLibraries } from '../client.js';
 
 export const ResolveLibraryIdSchema = z.object({
-  libraryName: z.string().describe('Library name to search for and resolve to a TrueRef library ID'),
+  libraryName: z
+    .string()
+    .describe('Library name to search for and resolve to a TrueRef library ID'),
   query: z.string().describe("The user's question or context to help rank results")
 });
 
@@ -150,7 +150,9 @@
 {#if loading && jobs.length === 0}
   <div class="flex items-center justify-center py-12">
     <div class="text-center">
-      <div class="inline-block h-8 w-8 animate-spin rounded-full border-4 border-solid border-blue-600 border-r-transparent"></div>
+      <div
+        class="inline-block h-8 w-8 animate-spin rounded-full border-4 border-solid border-blue-600 border-r-transparent"
+      ></div>
       <p class="mt-2 text-gray-600">Loading jobs...</p>
     </div>
   </div>
@@ -160,26 +162,38 @@
   </div>
 {:else if jobs.length === 0}
   <div class="rounded-md bg-gray-50 p-8 text-center">
-    <p class="text-gray-600">No jobs found. Jobs will appear here when repositories are indexed.</p>
+    <p class="text-gray-600">
+      No jobs found. Jobs will appear here when repositories are indexed.
+    </p>
   </div>
 {:else}
   <div class="overflow-x-auto rounded-lg border border-gray-200 bg-white shadow">
     <table class="min-w-full divide-y divide-gray-200">
       <thead class="bg-gray-50">
         <tr>
-          <th class="px-6 py-3 text-left text-xs font-medium uppercase tracking-wider text-gray-500">
+          <th
+            class="px-6 py-3 text-left text-xs font-medium tracking-wider text-gray-500 uppercase"
+          >
             Repository
           </th>
-          <th class="px-6 py-3 text-left text-xs font-medium uppercase tracking-wider text-gray-500">
+          <th
+            class="px-6 py-3 text-left text-xs font-medium tracking-wider text-gray-500 uppercase"
+          >
             Status
           </th>
-          <th class="px-6 py-3 text-left text-xs font-medium uppercase tracking-wider text-gray-500">
+          <th
+            class="px-6 py-3 text-left text-xs font-medium tracking-wider text-gray-500 uppercase"
+          >
             Progress
          </th>
-          <th class="px-6 py-3 text-left text-xs font-medium uppercase tracking-wider text-gray-500">
+          <th
+            class="px-6 py-3 text-left text-xs font-medium tracking-wider text-gray-500 uppercase"
+          >
             Created
           </th>
-          <th class="px-6 py-3 text-right text-xs font-medium uppercase tracking-wider text-gray-500">
+          <th
+            class="px-6 py-3 text-right text-xs font-medium tracking-wider text-gray-500 uppercase"
+          >
             Actions
           </th>
         </tr>
@@ -187,16 +201,16 @@
       <tbody class="divide-y divide-gray-200 bg-white">
         {#each jobs as job (job.id)}
           <tr class="hover:bg-gray-50">
-            <td class="whitespace-nowrap px-6 py-4 text-sm font-medium text-gray-900">
+            <td class="px-6 py-4 text-sm font-medium whitespace-nowrap text-gray-900">
               {job.repositoryId}
               {#if job.versionId}
                 <span class="ml-1 text-xs text-gray-500">@{job.versionId}</span>
               {/if}
             </td>
-            <td class="whitespace-nowrap px-6 py-4 text-sm text-gray-500">
+            <td class="px-6 py-4 text-sm whitespace-nowrap text-gray-500">
               <JobStatusBadge status={job.status} />
             </td>
-            <td class="whitespace-nowrap px-6 py-4 text-sm text-gray-500">
+            <td class="px-6 py-4 text-sm whitespace-nowrap text-gray-500">
               <div class="flex items-center">
                 <span class="mr-2">{job.progress}%</span>
                 <div class="h-2 w-32 rounded-full bg-gray-200">
@@ -212,10 +226,10 @@
               {/if}
             </div>
           </td>
-          <td class="whitespace-nowrap px-6 py-4 text-sm text-gray-500">
+          <td class="px-6 py-4 text-sm whitespace-nowrap text-gray-500">
             {formatDate(job.createdAt)}
           </td>
-          <td class="whitespace-nowrap px-6 py-4 text-right text-sm font-medium">
+          <td class="px-6 py-4 text-right text-sm font-medium whitespace-nowrap">
             <div class="flex justify-end gap-2">
               {#if canPause(job.status)}
                 <button
@@ -256,9 +270,7 @@
     </div>
 
     {#if loading}
-      <div class="mt-4 text-center text-sm text-gray-500">
-        Refreshing...
-      </div>
+      <div class="mt-4 text-center text-sm text-gray-500">Refreshing...</div>
     {/if}
   {/if}
 </div>
@@ -20,11 +20,7 @@ import { createProviderFromProfile } from '$lib/server/embeddings/registry';
 import type { EmbeddingProfile } from '$lib/server/db/schema';
 import { parseLibraryId } from '$lib/server/api/library-id';
 import { selectSnippetsWithinBudget, DEFAULT_TOKEN_BUDGET } from '$lib/server/api/token-budget';
-import {
-  formatContextJson,
-  formatContextTxt,
-  CORS_HEADERS
-} from '$lib/server/api/formatters';
+import { formatContextJson, formatContextTxt, CORS_HEADERS } from '$lib/server/api/formatters';
 import type { ContextResponseMetadata } from '$lib/server/mappers/context-response.mapper';
 
 // ---------------------------------------------------------------------------
@@ -36,9 +32,10 @@ function getServices(db: ReturnType<typeof getClient>) {
 
   // Load the active embedding profile from the database
   const profileRow = db
-    .prepare<[], EmbeddingProfile>(
-      'SELECT * FROM embedding_profiles WHERE is_default = 1 AND enabled = 1 LIMIT 1'
-    )
+    .prepare<
+      [],
+      EmbeddingProfile
+    >('SELECT * FROM embedding_profiles WHERE is_default = 1 AND enabled = 1 LIMIT 1')
     .get();
 
   const provider = profileRow ? createProviderFromProfile(profileRow) : null;
@@ -53,7 +50,10 @@ interface RawRepoConfig {
 
 function getRules(db: ReturnType<typeof getClient>, repositoryId: string): string[] {
   const row = db
-    .prepare<[string], RawRepoConfig>(`SELECT rules FROM repository_configs WHERE repository_id = ?`)
+    .prepare<
+      [string],
+      RawRepoConfig
+    >(`SELECT rules FROM repository_configs WHERE repository_id = ?`)
     .get(repositoryId);
 
   if (!row?.rules) return [];
@@ -88,9 +88,10 @@ function getSnippetVersionTags(
 
   const placeholders = versionIds.map(() => '?').join(', ');
   const rows = db
-    .prepare<string[], RawVersionRow>(
-      `SELECT id, tag FROM repository_versions WHERE id IN (${placeholders})`
-    )
+    .prepare<
+      string[],
+      RawVersionRow
+    >(`SELECT id, tag FROM repository_versions WHERE id IN (${placeholders})`)
     .all(...versionIds);
 
   return Object.fromEntries(rows.map((row) => [row.id, row.tag]));
@@ -116,13 +117,10 @@ export const GET: RequestHandler = async ({ url }) => {
   const query = url.searchParams.get('query');
 
   if (!query || !query.trim()) {
-    return new Response(
-      JSON.stringify({ error: 'query is required', code: 'MISSING_PARAMETER' }),
-      {
-        status: 400,
-        headers: { 'Content-Type': 'application/json', ...CORS_HEADERS }
-      }
-    );
+    return new Response(JSON.stringify({ error: 'query is required', code: 'MISSING_PARAMETER' }), {
+      status: 400,
+      headers: { 'Content-Type': 'application/json', ...CORS_HEADERS }
+    });
   }
 
   const responseType = url.searchParams.get('type') ?? 'json';
@@ -157,9 +155,10 @@ export const GET: RequestHandler = async ({ url }) => {
 
   // Verify the repository exists and check its state.
   const repo = db
-    .prepare<[string], RawRepoState>(
-      `SELECT id, state, title, source, source_url, branch FROM repositories WHERE id = ?`
-    )
+    .prepare<
+      [string],
+      RawRepoState
+    >(`SELECT id, state, title, source, source_url, branch FROM repositories WHERE id = ?`)
     .get(parsed.repositoryId);
 
   if (!repo) {
@@ -193,9 +192,10 @@ export const GET: RequestHandler = async ({ url }) => {
|
|||||||
let resolvedVersion: RawVersionRow | undefined;
|
let resolvedVersion: RawVersionRow | undefined;
|
||||||
if (parsed.version) {
|
if (parsed.version) {
|
||||||
resolvedVersion = db
|
resolvedVersion = db
|
||||||
.prepare<[string, string], RawVersionRow>(
|
.prepare<
|
||||||
`SELECT id, tag FROM repository_versions WHERE repository_id = ? AND tag = ?`
|
[string, string],
|
||||||
)
|
RawVersionRow
|
||||||
|
>(`SELECT id, tag FROM repository_versions WHERE repository_id = ? AND tag = ?`)
|
||||||
.get(parsed.repositoryId, parsed.version);
|
.get(parsed.repositoryId, parsed.version);
|
||||||
|
|
||||||
// Version not found is not fatal — fall back to default branch.
|
// Version not found is not fatal — fall back to default branch.
|
||||||
@@ -240,13 +240,14 @@ export const GET: RequestHandler = async ({ url }) => {
|
|||||||
sourceUrl: repo.source_url,
|
sourceUrl: repo.source_url,
|
||||||
branch: repo.branch
|
branch: repo.branch
|
||||||
},
|
},
|
||||||
version: parsed.version || resolvedVersion
|
version:
|
||||||
? {
|
parsed.version || resolvedVersion
|
||||||
requested: parsed.version ?? null,
|
? {
|
||||||
resolved: resolvedVersion?.tag ?? null,
|
requested: parsed.version ?? null,
|
||||||
id: resolvedVersion?.id ?? null
|
resolved: resolvedVersion?.tag ?? null,
|
||||||
}
|
id: resolvedVersion?.id ?? null
|
||||||
: null,
|
}
|
||||||
|
: null,
|
||||||
snippetVersions
|
snippetVersions
|
||||||
};
|
};
|
||||||
|
|
||||||
|
|||||||
@@ -10,7 +10,7 @@ export const GET: RequestHandler = ({ url }) => {
|
|||||||
|
|
||||||
let entries: { name: string; path: string; isGitRepo: boolean }[] = [];
|
let entries: { name: string; path: string; isGitRepo: boolean }[] = [];
|
||||||
let error: string | null = null;
|
let error: string | null = null;
|
||||||
let resolved = target;
|
const resolved = target;
|
||||||
|
|
||||||
try {
|
try {
|
||||||
const items = fs.readdirSync(target, { withFileTypes: true });
|
const items = fs.readdirSync(target, { withFileTypes: true });
|
||||||
|
|||||||
@@ -7,7 +7,11 @@ import type { RequestHandler } from './$types';
|
|||||||
import { getClient } from '$lib/server/db/client.js';
|
import { getClient } from '$lib/server/db/client.js';
|
||||||
import { IndexingJobMapper } from '$lib/server/mappers/indexing-job.mapper.js';
|
import { IndexingJobMapper } from '$lib/server/mappers/indexing-job.mapper.js';
|
||||||
import { JobQueue } from '$lib/server/pipeline/job-queue.js';
|
import { JobQueue } from '$lib/server/pipeline/job-queue.js';
|
||||||
import { handleServiceError, NotFoundError, InvalidInputError } from '$lib/server/utils/validation.js';
|
import {
|
||||||
|
handleServiceError,
|
||||||
|
NotFoundError,
|
||||||
|
InvalidInputError
|
||||||
|
} from '$lib/server/utils/validation.js';
|
||||||
|
|
||||||
export const POST: RequestHandler = ({ params }) => {
|
export const POST: RequestHandler = ({ params }) => {
|
||||||
try {
|
try {
|
||||||
@@ -19,9 +23,7 @@ export const POST: RequestHandler = ({ params }) => {
|
|||||||
|
|
||||||
const success = queue.cancelJob(params.id);
|
const success = queue.cancelJob(params.id);
|
||||||
if (!success) {
|
if (!success) {
|
||||||
throw new InvalidInputError(
|
throw new InvalidInputError(`Cannot cancel job ${params.id} - job is already done or failed`);
|
||||||
`Cannot cancel job ${params.id} - job is already done or failed`
|
|
||||||
);
|
|
||||||
}
|
}
|
||||||
|
|
||||||
// Fetch updated job
|
// Fetch updated job
|
||||||
|
|||||||
@@ -7,7 +7,11 @@ import type { RequestHandler } from './$types';
|
|||||||
import { getClient } from '$lib/server/db/client.js';
|
import { getClient } from '$lib/server/db/client.js';
|
||||||
import { IndexingJobMapper } from '$lib/server/mappers/indexing-job.mapper.js';
|
import { IndexingJobMapper } from '$lib/server/mappers/indexing-job.mapper.js';
|
||||||
import { JobQueue } from '$lib/server/pipeline/job-queue.js';
|
import { JobQueue } from '$lib/server/pipeline/job-queue.js';
|
||||||
import { handleServiceError, NotFoundError, InvalidInputError } from '$lib/server/utils/validation.js';
|
import {
|
||||||
|
handleServiceError,
|
||||||
|
NotFoundError,
|
||||||
|
InvalidInputError
|
||||||
|
} from '$lib/server/utils/validation.js';
|
||||||
|
|
||||||
export const POST: RequestHandler = ({ params }) => {
|
export const POST: RequestHandler = ({ params }) => {
|
||||||
try {
|
try {
|
||||||
|
|||||||
@@ -7,7 +7,11 @@ import type { RequestHandler } from './$types';
|
|||||||
import { getClient } from '$lib/server/db/client.js';
|
import { getClient } from '$lib/server/db/client.js';
|
||||||
import { IndexingJobMapper } from '$lib/server/mappers/indexing-job.mapper.js';
|
import { IndexingJobMapper } from '$lib/server/mappers/indexing-job.mapper.js';
|
||||||
import { JobQueue } from '$lib/server/pipeline/job-queue.js';
|
import { JobQueue } from '$lib/server/pipeline/job-queue.js';
|
||||||
import { handleServiceError, NotFoundError, InvalidInputError } from '$lib/server/utils/validation.js';
|
import {
|
||||||
|
handleServiceError,
|
||||||
|
NotFoundError,
|
||||||
|
InvalidInputError
|
||||||
|
} from '$lib/server/utils/validation.js';
|
||||||
|
|
||||||
export const POST: RequestHandler = ({ params }) => {
|
export const POST: RequestHandler = ({ params }) => {
|
||||||
try {
|
try {
|
||||||
@@ -19,7 +23,9 @@ export const POST: RequestHandler = ({ params }) => {
|
|||||||
|
|
||||||
const success = queue.resumeJob(params.id);
|
const success = queue.resumeJob(params.id);
|
||||||
if (!success) {
|
if (!success) {
|
||||||
throw new InvalidInputError(`Cannot resume job ${params.id} - only paused jobs can be resumed`);
|
throw new InvalidInputError(
|
||||||
|
`Cannot resume job ${params.id} - only paused jobs can be resumed`
|
||||||
|
);
|
||||||
}
|
}
|
||||||
|
|
||||||
// Fetch updated job
|
// Fetch updated job
|
||||||
|
|||||||
@@ -58,9 +58,7 @@ export const POST: RequestHandler = async ({ request }) => {
|
|||||||
let jobResponse: ReturnType<typeof IndexingJobMapper.toDto> | null = null;
|
let jobResponse: ReturnType<typeof IndexingJobMapper.toDto> | null = null;
|
||||||
if (body.autoIndex !== false) {
|
if (body.autoIndex !== false) {
|
||||||
const queue = getQueue();
|
const queue = getQueue();
|
||||||
const job = queue
|
const job = queue ? queue.enqueue(repo.id) : service.createIndexingJob(repo.id);
|
||||||
? queue.enqueue(repo.id)
|
|
||||||
: service.createIndexingJob(repo.id);
|
|
||||||
jobResponse = IndexingJobMapper.toDto(job);
|
jobResponse = IndexingJobMapper.toDto(job);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|||||||
@@ -28,9 +28,7 @@ export const POST: RequestHandler = async ({ params, request }) => {
|
|||||||
// Use the queue so processNext() is triggered immediately.
|
// Use the queue so processNext() is triggered immediately.
|
||||||
// Falls back to direct DB insert if the queue isn't initialised yet.
|
// Falls back to direct DB insert if the queue isn't initialised yet.
|
||||||
const queue = getQueue();
|
const queue = getQueue();
|
||||||
const job = queue
|
const job = queue ? queue.enqueue(id, versionId) : service.createIndexingJob(id, versionId);
|
||||||
? queue.enqueue(id, versionId)
|
|
||||||
: service.createIndexingJob(id, versionId);
|
|
||||||
|
|
||||||
return json({ job: IndexingJobMapper.toDto(job) }, { status: 202 });
|
return json({ job: IndexingJobMapper.toDto(job) }, { status: 202 });
|
||||||
} catch (err) {
|
} catch (err) {
|
||||||
|
|||||||
@@ -146,4 +146,3 @@ function sanitizeProfile(profile: EmbeddingProfile): EmbeddingProfile {
|
|||||||
}
|
}
|
||||||
return profile;
|
return profile;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|||||||
@@ -16,9 +16,10 @@ export const GET: RequestHandler = async () => {
|
|||||||
try {
|
try {
|
||||||
const db = getClient();
|
const db = getClient();
|
||||||
const profile = db
|
const profile = db
|
||||||
.prepare<[], EmbeddingProfile>(
|
.prepare<
|
||||||
'SELECT * FROM embedding_profiles WHERE is_default = 1 AND enabled = 1 LIMIT 1'
|
[],
|
||||||
)
|
EmbeddingProfile
|
||||||
|
>('SELECT * FROM embedding_profiles WHERE is_default = 1 AND enabled = 1 LIMIT 1')
|
||||||
.get();
|
.get();
|
||||||
|
|
||||||
if (!profile) {
|
if (!profile) {
|
||||||
@@ -42,7 +43,6 @@ export const GET: RequestHandler = async () => {
|
|||||||
}
|
}
|
||||||
};
|
};
|
||||||
|
|
||||||
|
|
||||||
export const POST: RequestHandler = async ({ request }) => {
|
export const POST: RequestHandler = async ({ request }) => {
|
||||||
try {
|
try {
|
||||||
const body = await request.json();
|
const body = await request.json();
|
||||||
|
|||||||
@@ -8,7 +8,9 @@
|
|||||||
let { data }: { data: PageData } = $props();
|
let { data }: { data: PageData } = $props();
|
||||||
|
|
||||||
// Initialized empty; $effect syncs from data prop on every navigation/reload.
|
// Initialized empty; $effect syncs from data prop on every navigation/reload.
|
||||||
let repo = $state<Repository & { versions?: RepositoryVersion[] }>({} as Repository & { versions?: RepositoryVersion[] });
|
let repo = $state<Repository & { versions?: RepositoryVersion[] }>(
|
||||||
|
{} as Repository & { versions?: RepositoryVersion[] }
|
||||||
|
);
|
||||||
let recentJobs = $state<IndexingJob[]>([]);
|
let recentJobs = $state<IndexingJob[]>([]);
|
||||||
$effect(() => {
|
$effect(() => {
|
||||||
if (data.repo) repo = data.repo;
|
if (data.repo) repo = data.repo;
|
||||||
@@ -189,7 +191,7 @@
|
|||||||
<dl class="grid grid-cols-1 gap-y-2 text-sm sm:grid-cols-2">
|
<dl class="grid grid-cols-1 gap-y-2 text-sm sm:grid-cols-2">
|
||||||
<div class="flex gap-2">
|
<div class="flex gap-2">
|
||||||
<dt class="text-gray-500">Source</dt>
|
<dt class="text-gray-500">Source</dt>
|
||||||
<dd class="font-medium capitalize text-gray-900">{repo.source}</dd>
|
<dd class="font-medium text-gray-900 capitalize">{repo.source}</dd>
|
||||||
</div>
|
</div>
|
||||||
<div class="flex gap-2">
|
<div class="flex gap-2">
|
||||||
<dt class="text-gray-500">Branch</dt>
|
<dt class="text-gray-500">Branch</dt>
|
||||||
|
|||||||
@@ -227,7 +227,9 @@
|
|||||||
<div class="rounded-xl border border-gray-200 bg-white p-6 shadow-sm">
|
<div class="rounded-xl border border-gray-200 bg-white p-6 shadow-sm">
|
||||||
<div class="mb-4 flex items-center gap-3">
|
<div class="mb-4 flex items-center gap-3">
|
||||||
<div class="flex min-w-0 flex-1 items-center gap-2">
|
<div class="flex min-w-0 flex-1 items-center gap-2">
|
||||||
<span class="shrink-0 rounded bg-green-100 px-2 py-0.5 text-xs font-medium text-green-700">
|
<span
|
||||||
|
class="shrink-0 rounded bg-green-100 px-2 py-0.5 text-xs font-medium text-green-700"
|
||||||
|
>
|
||||||
Selected
|
Selected
|
||||||
</span>
|
</span>
|
||||||
<span class="truncate font-mono text-sm text-gray-700">{selectedLibraryTitle}</span>
|
<span class="truncate font-mono text-sm text-gray-700">{selectedLibraryTitle}</span>
|
||||||
@@ -285,7 +287,9 @@
|
|||||||
{:else if query && !loadingSnippets && snippets.length === 0 && !snippetError}
|
{:else if query && !loadingSnippets && snippets.length === 0 && !snippetError}
|
||||||
<div class="flex flex-col items-center py-16 text-center">
|
<div class="flex flex-col items-center py-16 text-center">
|
||||||
<p class="text-sm text-gray-500">No snippets found for that query.</p>
|
<p class="text-sm text-gray-500">No snippets found for that query.</p>
|
||||||
<p class="mt-1 text-xs text-gray-400">Try a different question or select another library.</p>
|
<p class="mt-1 text-xs text-gray-400">
|
||||||
|
Try a different question or select another library.
|
||||||
|
</p>
|
||||||
</div>
|
</div>
|
||||||
{/if}
|
{/if}
|
||||||
{/if}
|
{/if}
|
||||||
|
|||||||
@@ -203,7 +203,11 @@
|
|||||||
: 'border border-gray-200 text-gray-700 hover:bg-gray-50'
|
: 'border border-gray-200 text-gray-700 hover:bg-gray-50'
|
||||||
].join(' ')}
|
].join(' ')}
|
||||||
>
|
>
|
||||||
{p === 'none' ? 'None (FTS5 only)' : p === 'openai' ? 'OpenAI-compatible' : 'Local Model'}
|
{p === 'none'
|
||||||
|
? 'None (FTS5 only)'
|
||||||
|
: p === 'openai'
|
||||||
|
? 'OpenAI-compatible'
|
||||||
|
: 'Local Model'}
|
||||||
</button>
|
</button>
|
||||||
{/each}
|
{/each}
|
||||||
</div>
|
</div>
|
||||||
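The first hunk above only reflows a query helper, but the underlying idiom (one `?` placeholder per id for a SQL `IN (...)` clause, then `Object.fromEntries` to collapse the rows into an id-to-tag lookup) can be sketched on its own. This is a standalone illustration, not the project's code: the stub `rows` array stands in for the `better-sqlite3` query result, and the SQL string appears only in a comment.

```typescript
// One "?" per id, joined for a SQL IN (...) clause,
// e.g. `SELECT id, tag FROM repository_versions WHERE id IN (${placeholders})`.
const versionIds: string[] = ['v1', 'v2', 'v3'];
const placeholders = versionIds.map(() => '?').join(', ');

// Stub rows shaped like the query result (id + tag per version).
const rows: { id: string; tag: string }[] = [
  { id: 'v1', tag: '1.0.0' },
  { id: 'v2', tag: '2.0.0' }
];

// Collapse the rows into an id -> tag map; ids with no row are simply absent.
const tagsById = Object.fromEntries(rows.map((row) => [row.id, row.tag]));
// placeholders is "?, ?, ?"; tagsById is { v1: '1.0.0', v2: '2.0.0' }
```

Binding the ids positionally (`.all(...versionIds)` in the hunk) keeps the dynamic part of the statement confined to the placeholder count, so the values themselves are never interpolated into the SQL string.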