llama-cpp/.gitignore
Giancarmine Salucci 4ad296608b Initial commit: tuned multi-model llama.cpp stack
- 5 models: SmolLM3-3B, Gemma4-E2B/E4B, Qwen3-4B, Qwen3.5-9B
- TurboQuant image (FORCE_MMQ): +6-11% free speed on Turing GPUs
- Bigctx profiles (-nkvo KV in RAM): 2-16x context gain
- turbo2 KV: 2x smaller, benchmarked against PPL quality gate
- Per-model env files with justified parameters
- kv_quant_test.sh + cpu_ctx_test.sh benchmark scripts
- docs/FINDINGS.md: surprises, pitfalls, recommendations
- docs/ARCHITECTURE.md: compose + test script design
2026-05-06 15:56:40 +02:00

# Model files: large binaries; download with scripts/download_models.sh
models/*.gguf
models/*.bin
models/*.safetensors
# Benchmark output (logs, CSVs, text files, env snapshots): generated, not source
benchmark-results/*.log
benchmark-results/*.csv
benchmark-results/*.txt
benchmark-results/*.env
# Keep the .gitkeep placeholder so the otherwise-empty directory stays tracked
!benchmark-results/.gitkeep
# Docker build cache artifacts
.docker/
# Python cache
__pycache__/
*.pyc
*.pyo
.venv/
# Editor / OS artifacts
.DS_Store
Thumbs.db
*.swp
*.swo
*~
.idea/
.vscode/
# Local overrides (never commit secrets or machine-specific tweaks)
.env.local
envs/.env.*.local