Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
6.8 KiB
Code Style
This document describes the conventions used in this codebase. Follow them when adding or modifying code.
Language & Edition
- Rust 2021 edition
- Stable toolchain only — no nightly features
rustfmtis installed in the Docker builder; run it before committing
Error Handling
Use AppError everywhere
All fallible functions in the HTTP layer return crate::Result<T> which is std::result::Result<T, AppError>.
// Good
pub async fn get_job(...) -> Result<Json<Job>> {
let job = state.storage.get(&id).await?;
Ok(Json(job))
}
// Bad — don't use anyhow inside handlers
pub async fn get_job(...) -> anyhow::Result<...> { ... }
AppError has four variants that map cleanly to HTTP status codes:
| Variant | HTTP |
|---|---|
NotFound(String) |
404 |
BadRequest(String) |
400 |
Conflict(String) |
409 |
Internal(String) |
500 |
anyhow is only for main()
anyhow::Result is used in main() for startup failures where a single error type is not needed. It must not leak into handler or worker code.
Propagate with ?, map errors explicitly
Prefer .map_err(|e| AppError::Internal(format!("context: {e}"))) over .unwrap() or .expect() in any code path that can be reached at runtime.
// Good
fs::write(&path, payload).await.map_err(|e| {
AppError::Internal(format!("failed to write job {}: {e}", job.id))
})?;
// Bad
fs::write(&path, payload).await.unwrap();
Async & Concurrency
Tokio for I/O, std thread for GPU
- All file I/O, HTTP, ffmpeg subprocess calls: use
tokio::fs,tokio::process::Command - GPU inference runs on a dedicated
std::thread— never spawn atokio::taskthat blocks onWhisperContext
Don't block the async runtime
Long CPU-bound or blocking operations must not run on Tokio tasks. Currently the only such operation is whisper inference, which lives on the whisper-gpu OS thread and communicates back via oneshot channels.
Channel conventions
| Channel type | Used for |
|---|---|
tokio::sync::mpsc::UnboundedSender<JobId> |
HTTP → worker: new job notifications |
std::sync::mpsc::Sender<TranscribeRequest> |
Worker → GPU thread: inference requests |
tokio::sync::oneshot |
GPU thread → worker: inference result |
tokio::sync::broadcast |
Worker → SSE subscribers: progress events |
State Management
AppState is Clone and passed via Axum's State extractor. All fields are either Arc<T> or cheap-to-clone types (mpsc::UnboundedSender is Clone).
#[derive(Clone)]
pub struct AppState {
pub job_tx: mpsc::UnboundedSender<models::JobId>, // cheap clone
pub storage: Arc<storage::Storage>, // shared ref
pub progress: worker::ProgressRegistry, // Arc<DashMap>
pub model_name: Arc<str>, // shared ref
pub queue_depth: Arc<std::sync::atomic::AtomicUsize>, // shared atomic
pub gpu_device: u32, // copy
}
Never put Mutex<T> in AppState for hot paths. Prefer atomics (AtomicUsize) for counters and DashMap for concurrent maps.
Naming
- Types:
UpperCamelCase - Functions / methods / locals:
snake_case - Constants:
SCREAMING_SNAKE_CASE - Module files match the concept they contain (
storage.rs,transcriber.rs)
const TARGET_CHUNK_SECS: f32 = 60.0;
const SNAP_WINDOW_SECS: f32 = 30.0;
Align related constant definitions vertically with spaces — the compiler doesn't care and it's significantly easier to scan.
Comments
Comment why, not what. The code shows what; the comment explains decisions, non-obvious constraints, and gotchas.
// Good — explains the constraint
// no_context: do not use previous call's transcript as initial prompt.
// Each silence-chunked audio segment is independent; cross-chunk context
// would re-anchor the decoder to any hallucinations from the prior chunk.
fp.set_no_context(true);
// Bad — restates the code
// set no_context to true
fp.set_context(true);
For disabled code (commented-out functionality): always include a reason.
// Flash Attention disabled: causes silent 0-segment output on some
// real-world audio (conference recordings, noisy MP3s).
// params.flash_attn(true);
Logging
Use tracing macros at the appropriate level:
| Level | When |
|---|---|
error! |
Unrecoverable failure; always investigated |
warn! |
Recoverable unexpected state (e.g. ffmpeg unavailable, job not found at dequeue) |
info! |
Significant lifecycle events (server start, model load, job queued/done) |
debug! |
Per-chunk detail, useful when diagnosing a specific job |
trace! |
Very high-frequency detail (sample-level, buffer operations) |
Always use structured fields, not string interpolation, so logs are machine-parseable (the subscriber emits JSON):
// Good
tracing::info!(job_id = %id, model = path, "job queued");
// Bad
tracing::info!("job {} queued, model: {}", id, path);
Serialisation
- All public API types derive
Serialize/Deserializefromserde - Use
#[serde(rename_all = "snake_case")]on enums used in JSON responses - Use
#[serde(skip_serializing_if = "Option::is_none")]on optional fields to keep the JSON clean - Timestamps are
chrono::DateTime<Utc>— they serialise to ISO 8601 automatically
OpenAPI Annotations
All route handlers are annotated with #[utoipa::path(...)]. Keep the annotations up to date when adding or changing endpoints. The Swagger UI is served at /docs in development.
File Structure Conventions
src/
main.rs — entry point only; no business logic
models.rs — pure data types; no logic
error.rs — AppError + IntoResponse
storage.rs — I/O only; no business logic
transcriber.rs — whisper inference only
worker.rs — orchestration: audio pipeline, job lifecycle
webhook.rs — HTTP delivery only
routes/
mod.rs — router assembly only
jobs.rs — HTTP handlers for job endpoints
health.rs — HTTP handler for /health
Keep each file focused on a single concern. Business logic lives in worker.rs. Data shapes live in models.rs. Avoid cross-cutting imports in the wrong direction (e.g. storage.rs must not import from worker.rs).
Release Profile
[profile.release]
opt-level = 3
lto = "thin"
codegen-units = 1
strip = "symbols"
lto = "thin"is a good balance between link time and binary size/performancecodegen-units = 1maximises inlining across crate boundaries — important for hot inference pathsstrip = "symbols"keeps the Docker runtime image small; debug info is not needed in production