3 Commits

Author SHA1 Message Date
moze
5c6085df99 feat: GPU model lazy-load/unload lifecycle management
All checks were successful
Build and publish Docker image / Build and push CPU image (push) Successful in 2m33s
Build and publish Docker image / Build and push GPU image (push) Successful in 3m15s
- Domain: add ModelState, ModelStateEvent, ModelNotReady, ManageModelLifecycle
  (in-port), ModelLoader and ModelStateEventBus (out-ports)
- Application: InMemoryModelStateEventBus; ModelLifecycleService with a state
  machine guarded by a ReentrantLock, lazy load on first request, idle-timeout
  auto-unload (configurable via trueref.embedding.idle-timeout-seconds,
  default 300 s), job guard (skips unload while an ingestion job is running),
  and a platform-thread CUDA executor
- Adapters: OnnxModelLoader wires embedder + reranker start/stop; remove
  @PostConstruct/@PreDestroy from OnnxEmbeddingService and OnnxRerankerService;
  requireStarted() now throws ModelNotReady instead of IllegalStateException
- REST: GET /api/model/status, POST /api/model/unload (409 when jobs running,
  force=true to override), GET /api/model/status/stream (SSE)
- GlobalExceptionHandler: ModelNotReady -> 503 + Retry-After header
- HybridSearchService: calls lifecycle.ensureReady() before every search so
  both REST and MCP paths get ModelNotReady (-> 503 / MCP error) when unloaded
- TrueRefMcpTools: catches ModelNotReady, returns retry hint in MCP error text
- Tests: InMemoryModelStateEventBusTest, ModelLifecycleServiceTest (10 cases),
  OnnxModelLoaderTest, GlobalExceptionHandlerTest; all 41 tests green

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-09 15:44:33 +02:00
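
The lifecycle described in commit 5c6085df99 (lock-guarded state machine, lazy load on first use, idle-timeout unload, job guard) can be illustrated with a minimal sketch. It is not the project's ModelLifecycleService: the class name LifecycleSketch, its constructor parameters, the injected clock, and the method unloadIfIdle() are hypothetical. As a simplification, it also loads synchronously inside ensureReady(), whereas the commit describes callers receiving ModelNotReady while the model is unloaded.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.BooleanSupplier;
import java.util.function.LongSupplier;

/** Hypothetical sketch of a lock-guarded load/unload state machine. */
final class LifecycleSketch {
    enum State { UNLOADED, LOADING, READY, UNLOADING }

    private final ReentrantLock lock = new ReentrantLock();
    private final Runnable loader;            // assumed hook: starts the models
    private final Runnable unloader;          // assumed hook: stops the models
    private final BooleanSupplier jobsRunning; // assumed hook: true while ingestion runs
    private final LongSupplier nanoClock;     // injected so the idle check is testable
    private final long idleTimeoutNanos;
    private State state = State.UNLOADED;
    private long lastUsedNanos;

    LifecycleSketch(Runnable loader, Runnable unloader, BooleanSupplier jobsRunning,
                    LongSupplier nanoClock, long idleTimeoutSeconds) {
        this.loader = loader;
        this.unloader = unloader;
        this.jobsRunning = jobsRunning;
        this.nanoClock = nanoClock;
        this.idleTimeoutNanos = TimeUnit.SECONDS.toNanos(idleTimeoutSeconds);
    }

    /** Loads the model on first use and refreshes the idle clock. */
    void ensureReady() {
        lock.lock();
        try {
            if (state == State.UNLOADED) {
                state = State.LOADING;
                try {
                    loader.run(); // synchronous here; a simplification of the real flow
                    state = State.READY;
                } catch (RuntimeException e) {
                    state = State.UNLOADED; // roll back so a later call can retry
                    throw e;
                }
            }
            lastUsedNanos = nanoClock.getAsLong();
        } finally {
            lock.unlock();
        }
    }

    /** Intended to be called periodically; returns true if the model was unloaded. */
    boolean unloadIfIdle() {
        lock.lock();
        try {
            if (state != State.READY) return false;
            if (jobsRunning.getAsBoolean()) return false; // job guard
            if (nanoClock.getAsLong() - lastUsedNanos < idleTimeoutNanos) return false;
            state = State.UNLOADING;
            try {
                unloader.run();
            } finally {
                state = State.UNLOADED;
            }
            return true;
        } finally {
            lock.unlock();
        }
    }

    State state() {
        lock.lock();
        try {
            return state;
        } finally {
            lock.unlock();
        }
    }
}
```

Holding a single lock across both the transition and the load keeps the sketch short, at the cost of blocking concurrent callers during a load; an implementation that answers 503 with Retry-After while loading would presumably release the lock and load on a separate executor.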
moze
943a38fd36 fix(mcp): relax library id and name matching
All checks were successful
Build and publish Docker image / Build and push CPU image (push) Successful in 2m11s
Build and publish Docker image / Build and push GPU image (push) Successful in 3m1s
- accept single-segment library ids like /whisper-rtx2080 returned by
  resolve-library-id in get-library-docs
- accept common owner-qualified aliases such as /mozempk/whisper-rtx2080
  when the indexed repo is stored as a single-segment name
- accept single-segment ids with explicit versions such as
  /whisper-rtx2080/v0.0.1
- relax resolve-library-id scoring across separator-only differences so
  queries like whisperrtx2080 still match whisper-rtx2080
- update MCP tool descriptions to document the accepted id formats

Validated with focused regression tests:
- TrueRefMcpToolsTest
- LibraryResolverTest
2026-05-06 10:53:09 +02:00
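
The separator-insensitive matching mentioned in commit 943a38fd36 (so that whisperrtx2080 still matches whisper-rtx2080) could look roughly like the sketch below. The class NameMatchSketch and its methods are hypothetical, and the actual resolve-library-id scoring is not shown in the log; this only illustrates comparing names after stripping everything that is not a letter or digit.

```java
import java.util.Locale;

/** Hypothetical sketch: compare library names while ignoring separators and case. */
final class NameMatchSketch {
    private NameMatchSketch() {}

    /** Lower-cases the input and keeps only letters and digits. */
    static String normalize(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (char c : s.toLowerCase(Locale.ROOT).toCharArray()) {
            if (Character.isLetterOrDigit(c)) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    /** True when both names are equal once separators such as '-' or '_' are removed. */
    static boolean matchesIgnoringSeparators(String query, String name) {
        String q = normalize(query);
        return !q.isEmpty() && q.equals(normalize(name));
    }
}
```

A real scorer would likely treat such a match as one signal among several rather than a hard yes/no; the boolean here just keeps the example small.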
moze
c5f950c2c0 Initial commit: trueref v0.1.0-SNAPSHOT
Some checks failed
Build and publish Docker image / Build and push (push) Failing after 1m27s
Java 21 / Spring Boot 3.5.3 multi-module Maven project.
Hybrid BM25+HNSW search with RRF, cross-encoder reranker,
ONNX Runtime 1.22.0 (CPU + CUDA 12 GPU variants).
2026-05-06 00:49:16 +02:00