# Code Style This document describes the conventions used in this codebase. Follow them when adding or modifying code. --- ## Language & Edition - Rust 2021 edition - Stable toolchain only — no nightly features - `rustfmt` is installed in the Docker builder; run it before committing --- ## Error Handling ### Use `AppError` everywhere All fallible functions in the HTTP layer return `crate::Result` which is `std::result::Result`. ```rust // Good pub async fn get_job(...) -> Result> { let job = state.storage.get(&id).await?; Ok(Json(job)) } // Bad — don't use anyhow inside handlers pub async fn get_job(...) -> anyhow::Result<...> { ... } ``` `AppError` has four variants that map cleanly to HTTP status codes: | Variant | HTTP | |---------|------| | `NotFound(String)` | 404 | | `BadRequest(String)` | 400 | | `Conflict(String)` | 409 | | `Internal(String)` | 500 | ### `anyhow` is only for `main()` `anyhow::Result` is used in `main()` for startup failures where a single error type is not needed. It must not leak into handler or worker code. ### Propagate with `?`, map errors explicitly Prefer `.map_err(|e| AppError::Internal(format!("context: {e}")))` over `.unwrap()` or `.expect()` in any code path that can be reached at runtime. ```rust // Good fs::write(&path, payload).await.map_err(|e| { AppError::Internal(format!("failed to write job {}: {e}", job.id)) })?; // Bad fs::write(&path, payload).await.unwrap(); ``` --- ## Async & Concurrency ### Tokio for I/O, std thread for GPU - All file I/O, HTTP, ffmpeg subprocess calls: use `tokio::fs`, `tokio::process::Command` - GPU inference runs on a dedicated `std::thread` — never spawn a `tokio::task` that blocks on `WhisperContext` ### Don't block the async runtime Long CPU-bound or blocking operations must not run on Tokio tasks. Currently the only such operation is whisper inference, which lives on the `whisper-gpu` OS thread and communicates back via `oneshot` channels. ### Channel conventions | Channel type | Used for | |-------------|---------| | `tokio::sync::mpsc::UnboundedSender` | HTTP → worker: new job notifications | | `std::sync::mpsc::Sender` | Worker → GPU thread: inference requests | | `tokio::sync::oneshot` | GPU thread → worker: inference result | | `tokio::sync::broadcast` | Worker → SSE subscribers: progress events | --- ## State Management `AppState` is `Clone` and passed via Axum's `State` extractor. All fields are either `Arc` or cheap-to-clone types (`mpsc::UnboundedSender` is `Clone`). ```rust #[derive(Clone)] pub struct AppState { pub job_tx: mpsc::UnboundedSender, // cheap clone pub storage: Arc, // shared ref pub progress: worker::ProgressRegistry, // Arc pub model_name: Arc, // shared ref pub queue_depth: Arc, // shared atomic pub gpu_device: u32, // copy } ``` Never put `Mutex` in `AppState` for hot paths. Prefer atomics (`AtomicUsize`) for counters and `DashMap` for concurrent maps. --- ## Naming - Types: `UpperCamelCase` - Functions / methods / locals: `snake_case` - Constants: `SCREAMING_SNAKE_CASE` - Module files match the concept they contain (`storage.rs`, `transcriber.rs`) ```rust const TARGET_CHUNK_SECS: f32 = 60.0; const SNAP_WINDOW_SECS: f32 = 30.0; ``` Align related constant definitions vertically with spaces — the compiler doesn't care and it's significantly easier to scan. --- ## Comments Comment *why*, not *what*. The code shows what; the comment explains decisions, non-obvious constraints, and gotchas. ```rust // Good — explains the constraint // no_context: do not use previous call's transcript as initial prompt. // Each silence-chunked audio segment is independent; cross-chunk context // would re-anchor the decoder to any hallucinations from the prior chunk. fp.set_no_context(true); // Bad — restates the code // set no_context to true fp.set_context(true); ``` For **disabled code** (commented-out functionality): always include a reason. ```rust // Flash Attention disabled: causes silent 0-segment output on some // real-world audio (conference recordings, noisy MP3s). // params.flash_attn(true); ``` --- ## Logging Use `tracing` macros at the appropriate level: | Level | When | |-------|------| | `error!` | Unrecoverable failure; always investigated | | `warn!` | Recoverable unexpected state (e.g. ffmpeg unavailable, job not found at dequeue) | | `info!` | Significant lifecycle events (server start, model load, job queued/done) | | `debug!` | Per-chunk detail, useful when diagnosing a specific job | | `trace!` | Very high-frequency detail (sample-level, buffer operations) | Always use structured fields, not string interpolation, so logs are machine-parseable (the subscriber emits JSON): ```rust // Good tracing::info!(job_id = %id, model = path, "job queued"); // Bad tracing::info!("job {} queued, model: {}", id, path); ``` --- ## Serialisation - All public API types derive `Serialize`/`Deserialize` from `serde` - Use `#[serde(rename_all = "snake_case")]` on enums used in JSON responses - Use `#[serde(skip_serializing_if = "Option::is_none")]` on optional fields to keep the JSON clean - Timestamps are `chrono::DateTime` — they serialise to ISO 8601 automatically --- ## OpenAPI Annotations All route handlers are annotated with `#[utoipa::path(...)]`. Keep the annotations up to date when adding or changing endpoints. The Swagger UI is served at `/docs` in development. --- ## File Structure Conventions ``` src/ main.rs — entry point only; no business logic models.rs — pure data types; no logic error.rs — AppError + IntoResponse storage.rs — I/O only; no business logic transcriber.rs — whisper inference only worker.rs — orchestration: audio pipeline, job lifecycle webhook.rs — HTTP delivery only routes/ mod.rs — router assembly only jobs.rs — HTTP handlers for job endpoints health.rs — HTTP handler for /health ``` Keep each file focused on a single concern. Business logic lives in `worker.rs`. Data shapes live in `models.rs`. Avoid cross-cutting imports in the wrong direction (e.g. `storage.rs` must not import from `worker.rs`). --- ## Release Profile ```toml [profile.release] opt-level = 3 lto = "thin" codegen-units = 1 strip = "symbols" ``` - `lto = "thin"` is a good balance between link time and binary size/performance - `codegen-units = 1` maximises inlining across crate boundaries — important for hot inference paths - `strip = "symbols"` keeps the Docker runtime image small; debug info is not needed in production