Initial commit: Tonemark PWA

Tonemark is a SvelteKit PWA for transcribing YouTube videos, audio
and video files, and microphone recordings using a local Whisper backend.

Features:
- Dark glassmorphic UI with electric-lime accent (5 switchable themes)
- Rail nav (desktop) / tab bar (mobile) layout
- Drop zone, YouTube URL input, and live audio recording inputs
- Audio mode waveform cards (none / standard / aggressive / auto)
- Real-time transcription progress with animated waveform
- Job queue with SSE streaming updates
- Push notifications on job completion
- PWA with native SvelteKit service worker
- SRT / TXT / MD / JSON transcript downloads

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Giancarmine Salucci
2026-05-06 16:41:25 +02:00
commit 13a96b6efa
68 changed files with 9712 additions and 0 deletions

20
.env.example Normal file

@@ -0,0 +1,20 @@
# URL of the whisper-rtx2080 server (must be accessible from this app)
WHISPER_URL=http://localhost:8080
# Path to the docker-compose.yml for whisper-rtx2080 (used to auto-start)
WHISPER_COMPOSE_FILE=/home/user/Sources/whisper-rtx2080/docker-compose.yml
# Base URL of THIS app, reachable from inside the whisper Docker container
# Use your host IP (not localhost) if whisper runs in Docker
WEBHOOK_BASE_URL=http://192.168.1.x:3000
# Directory to write transcript output files
OUTPUT_DIR=/home/user/transcripts
# VAPID keys for Web Push — generate with: npx web-push generate-vapid-keys
VAPID_PUBLIC_KEY=
VAPID_PRIVATE_KEY=
VAPID_SUBJECT=mailto:your@email.com
# Server port (for adapter-node)
PORT=3000

@@ -0,0 +1,59 @@
name: Build & Push Docker Image
on:
  push:
    branches:
      - main
    tags:
      - "v*"
  pull_request:
    branches:
      - main
env:
  REGISTRY: git.sal.giize.com
  IMAGE_NAME: mozempk/tonemark
jobs:
  build-and-push:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Log in to Gitea Container Registry
        if: github.event_name != 'pull_request'
        uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ secrets.REGISTRY_USERNAME }}
          password: ${{ secrets.REGISTRY_TOKEN }}
      - name: Extract metadata (tags, labels)
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
          tags: |
            type=sha,prefix=sha-,format=short,event=branch
            type=semver,pattern={{version}}
            type=semver,pattern={{major}}.{{minor}}
            type=semver,pattern={{major}}
            type=raw,value=latest,enable=${{ github.ref == 'refs/heads/main' }}
            type=ref,event=pr
      - name: Build and push Docker image
        uses: docker/build-push-action@v6
        with:
          context: .
          file: ./Dockerfile
          push: ${{ github.event_name != 'pull_request' }}
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=registry,ref=${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:buildcache
          cache-to: type=registry,ref=${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:buildcache,mode=max
          platforms: linux/amd64

32
.gitignore vendored Normal file

@@ -0,0 +1,32 @@
node_modules
# Output
.output
.vercel
.netlify
.wrangler
/.svelte-kit
/build
# OS
.DS_Store
Thumbs.db
# Env
.env
.env.*
!.env.example
!.env.test
# Vite
vite.config.js.timestamp-*
vite.config.ts.timestamp-*
# Runtime data
*.db
/data/
# PWA generated files
/static/sw.js
/static/workbox-*.js
/static/sw.js.map

1
.npmrc Normal file

@@ -0,0 +1 @@
engine-strict=true

3
.vscode/extensions.json vendored Normal file

@@ -0,0 +1,3 @@
{
"recommendations": ["svelte.svelte-vscode"]
}

41
Dockerfile Normal file

@@ -0,0 +1,41 @@
# syntax=docker/dockerfile:1
# ── Stage 1: build ──────────────────────────────────────────────────────────
FROM node:22-alpine AS builder
WORKDIR /app
# Install dependencies first (better layer caching)
COPY package.json package-lock.json ./
RUN npm ci
# Copy source and build
COPY . .
RUN npm run build
# Prune dev dependencies
RUN npm prune --omit=dev
# ── Stage 2: runtime ─────────────────────────────────────────────────────────
FROM node:22-alpine AS runtime
WORKDIR /app
# Non-root user for security
RUN addgroup -g 1001 tonemark && \
adduser -D -u 1001 -G tonemark tonemark
# Copy built output and production node_modules
COPY --from=builder --chown=tonemark:tonemark /app/build ./build
COPY --from=builder --chown=tonemark:tonemark /app/node_modules ./node_modules
COPY --from=builder --chown=tonemark:tonemark /app/package.json ./
USER tonemark
EXPOSE 3000
ENV NODE_ENV=production \
PORT=3000 \
HOST=0.0.0.0
CMD ["node", "build/index.js"]

103
README.md Normal file

@@ -0,0 +1,103 @@
# whisper-pwa
A SvelteKit PWA that transcribes YouTube videos and audio/video files using a local [whisper-rtx2080](https://git.sal.giize.com/mozempk/whisper-rtx2080) backend.
## Features
- **Web Share Target** — share YouTube URLs, video or audio files directly from your phone or browser
- **Smart audio preparation** — multi-strategy FFmpeg pipeline (auto/standard/aggressive/none) before submission
- **One job → one webhook** — whisper-rtx2080 handles internal chunking; we receive a single webhook when done
- **Live progress** — SSE stream showing chunk N of M + percentage
- **Post-processing** — collapse repeats, n-gram deduplication to clean up hallucinations
- **4 output formats** — SRT, plain TXT, Markdown (with timestamps), JSON
- **Web Push notifications** — get notified when the transcript is ready (works on mobile too)
## Requirements
- Node.js 20+
- FFmpeg in `$PATH`
- yt-dlp in `$PATH` (for YouTube URLs)
- Docker + whisper-rtx2080 running (or reachable at `WHISPER_URL`)
## Setup
### 1. Install dependencies
```bash
npm install
```
### 2. Generate VAPID keys (one-time)
```bash
npx web-push generate-vapid-keys
```
### 3. Create `.env`
```bash
cp .env.example .env
# Edit .env and fill in all values
```
Required env vars:
| Variable | Example | Description |
|---|---|---|
| `WHISPER_URL` | `http://localhost:8080` | whisper-rtx2080 base URL |
| `WEBHOOK_BASE_URL` | `http://192.168.1.x:3000` | Reachable from inside Docker |
| `OUTPUT_DIR` | `/home/user/transcripts` | Where to write output files |
| `VAPID_PUBLIC_KEY` | `BNxx...` | From `npx web-push generate-vapid-keys` |
| `VAPID_PRIVATE_KEY` | `xxxx` | From `npx web-push generate-vapid-keys` |
| `VAPID_SUBJECT` | `mailto:you@example.com` | Contact for push service |
| `DATA_DIR` | `/home/user/.whisper-pwa` | SQLite DB + tmp audio (default: `~/.whisper-pwa`) |
> **Important**: `WEBHOOK_BASE_URL` must be the IP/hostname reachable from inside the whisper-rtx2080 Docker container (not `localhost`).
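
A startup check can catch this misconfiguration before the first job is submitted. The sketch below is an illustration under assumptions, not code from this repository: `isUnusableWebhookUrl` is a hypothetical helper that flags base URLs pointing at the loopback interface, which a Docker container cannot use to reach the host.

```typescript
// Hypothetical helper (not part of this repo): returns true when a
// WEBHOOK_BASE_URL value is missing, unparseable, or points at loopback.
const LOOPBACK_HOSTS = new Set(['localhost', '[::1]']);

function isUnusableWebhookUrl(baseUrl: string): boolean {
  let host: string;
  try {
    // The WHATWG URL parser throws on empty or malformed input.
    host = new URL(baseUrl).hostname.toLowerCase();
  } catch {
    return true;
  }
  // Any 127.x.y.z address is loopback, not only 127.0.0.1.
  return LOOPBACK_HOSTS.has(host) || host.startsWith('127.');
}
```

Calling it once at startup with `process.env.WEBHOOK_BASE_URL ?? ''` and logging a warning on `true` would be one way to use it.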
### 4. Build and run
```bash
npm run build
npm start
```
Visit `http://localhost:3000`.
### 5. For development
```bash
npm run dev
```
## Audio preparation modes
| Mode | Description |
|---|---|
| `auto` (default) | volumedetect → boost quiet audio + denoise + EBU R128 loudnorm |
| `standard` | Highpass 80Hz + lowpass 8kHz + EBU R128 loudnorm |
| `aggressive` | Standard + FFT denoiser (`afftdn`) + noise gate (`agate`) |
| `none` | Convert to 16kHz mono WAV only |
All modes trim leading silence to prevent Whisper hallucinations at file start.
## API
| Endpoint | Method | Description |
|---|---|---|
| `/api/jobs` | POST | Create job (`{ source, title, audioMode }`) |
| `/api/jobs` | GET | List recent jobs |
| `/api/jobs/[id]` | GET | Poll job status |
| `/api/jobs/[id]` | DELETE | Cancel job |
| `/api/jobs/[id]/stream` | GET (SSE) | Live progress stream |
| `/api/jobs/[id]/download/[format]` | GET | Download SRT/TXT/MD/JSON |
| `/api/jobs/[id]/reprocess` | POST | Re-run post-processing on stored segments |
| `/api/webhook/[jobId]` | POST | Whisper completion webhook (called by whisper-rtx2080) |
| `/api/push` | GET | Get VAPID public key |
| `/api/push` | POST | Register push subscription |
| `/share` | POST | Web Share Target entry point |
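
As a usage illustration for the table above, the sketch below prepares a `POST /api/jobs` request. It assumes the endpoint accepts a JSON body with the three fields listed in the table; `prepareCreateJob` and `PreparedRequest` are hypothetical names, not part of this repository.

```typescript
// Audio modes as listed in the "Audio preparation modes" table.
type AudioMode = 'auto' | 'standard' | 'aggressive' | 'none';

interface CreateJobBody {
  source: string; // e.g. a YouTube URL (assumed to be a plain string)
  title: string;
  audioMode: AudioMode;
}

// Hypothetical shape: just enough to hand over to fetch().
interface PreparedRequest {
  url: string;
  method: 'POST';
  headers: Record<string, string>;
  body: string;
}

function prepareCreateJob(baseUrl: string, job: CreateJobBody): PreparedRequest {
  return {
    // Strip trailing slashes so the path is not doubled.
    url: `${baseUrl.replace(/\/+$/, '')}/api/jobs`,
    method: 'POST',
    headers: { 'content-type': 'application/json' }, // assumption: JSON body
    body: JSON.stringify(job)
  };
}
```

The result can be passed to `fetch(req.url, req)`. Whether this route also accepts multipart uploads is not stated here.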
## whisper-rtx2080 internals
The backend handles all audio chunking internally (60s chunks with 30s snap window, silence-detected at 35dB). We submit one WAV file per job and receive one webhook when all chunks are transcribed.
SSE progress events from the backend include `{ percent, chunk, total }` relayed live to the browser.
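
A client consuming the stream needs to validate each event before rendering it. The sketch below assumes each SSE `data` field carries a JSON object with the three numeric fields named above; `parseProgress` and `formatProgress` are hypothetical helpers, not functions from this repository.

```typescript
interface Progress {
  percent: number;
  chunk: number;
  total: number;
}

// Returns null for anything that is not a well-formed progress event.
function parseProgress(data: string): Progress | null {
  let raw: unknown;
  try {
    raw = JSON.parse(data);
  } catch {
    return null;
  }
  if (typeof raw !== 'object' || raw === null) return null;
  const { percent, chunk, total } = raw as Record<string, unknown>;
  if (typeof percent !== 'number' || typeof chunk !== 'number' || typeof total !== 'number') {
    return null;
  }
  return { percent, chunk, total };
}

// Renders the "chunk N of M + percentage" label.
function formatProgress(p: Progress): string {
  return `chunk ${p.chunk} of ${p.total} (${Math.round(p.percent)}%)`;
}
```

In the browser this would typically sit inside an `EventSource` `onmessage` handler, called with `event.data`.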

136
backend.issue.md Normal file

@@ -0,0 +1,136 @@
# Whisper Backend Investigation — Observations & Findings
## Summary
The `whisper-rtx2080` backend **does work correctly** when the GPU is warm.
The empty-segments problem is a **transient cold-GPU issue**, not a code bug.
---
## What Was Tried
### 1. Direct API test — 30 s WAV (warm GPU)
```bash
curl -s -X POST http://localhost:8091/jobs \
-F "audio=@/tmp/test_30s.wav" \
-F "task=transcribe" \
-F "language=en"
```
**Result:** 6 segments returned in ~25 s. Backend works.
---
### 2. Direct API test — 717 s prepared WAV (warm GPU)
```bash
curl -s -X POST http://localhost:8091/jobs \
-F "audio=@/tmp/test_prepared.wav" \
-F "task=transcribe"
```
**Result:** 340 segments, ~47 s total (~15× realtime for RTX 2080). Backend works.
---
### 3. End-to-end PWA submission — YouTube URL
Submitted `https://www.youtube.com/watch?v=KQDVDtklf34` through the PWA.
- Job `d6178677` was submitted to whisper (confirmed via Docker logs)
- Language detection fired (confirmed via logs)
- Job completed in ~30 s
- Webhook received with HTTP 200 (confirmed via logs)
- **BUT** `segments_json = "[]"` stored in the DB
This was a **cold-GPU run** right after container restart.
---
### 4. GPU architecture mismatch investigation
- `docker info` reported `RTX 3060 (sm_86)` inside the container
- `Dockerfile` compiled with `CMAKE_CUDA_ARCHITECTURES=75` (RTX 2080 / sm_75)
- Hypothesis: wrong binary → silent 0-output
- **User confirmed this is a Docker reporting error — GPU is actually RTX 2080 (sm_75)**
- Reverted any Dockerfile changes back to `CMAKE_CUDA_ARCHITECTURES=75`
---
### 5. Source code analysis — `transcriber.rs`
Key findings from reading the Rust source:
| Setting | Value | Effect |
|---|---|---|
| `set_language(None)` | ✅ Correct | Auto-detects language, returns segments |
| `set_detect_language(true)` | ❌ Wrong | Returns 0 segments (early exit) |
| `entropy_thold` | 3.5 (vs default 2.4) | Catches medium-phrase hallucination loops |
| Flash attention | Disabled (commented out) | Was causing 0-segment output on some audio |
The code uses `set_language(None)` which is correct.
Flash attention was already disabled — this alone explains many of the prior 0-segment reports.
---
### 6. Webhook behavior
- The backend fires the webhook **exactly once**, after ALL internal 60 s silence-based chunks complete.
- We submit one file → backend chunks internally → one webhook with the full `WhisperJob` object.
- Webhook payload includes: `{ id, status, language, segments, duration_secs, error, … }`
- Our `POST /api/webhook/[jobId]` route handles this correctly.
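
For illustration, a handler could validate the payload shape before storing it. The field names below come from the payload listed above; the field types, and the `isWebhookPayload` guard itself, are assumptions and not code from this repository.

```typescript
// Field names from the observed payload; types are assumed.
interface WebhookPayload {
  id: string;
  status: string;
  segments: unknown[]; // segment shape is not documented here
  duration_secs: number;
  language?: string | null;
  error?: string | null;
}

// Type guard: checks only the fields the empty-segments analysis relies on.
function isWebhookPayload(value: unknown): value is WebhookPayload {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.id === 'string' &&
    typeof v.status === 'string' &&
    Array.isArray(v.segments) &&
    typeof v.duration_secs === 'number'
  );
}
```

Rejecting malformed bodies early would also keep a genuinely empty `segments` array distinguishable from a missing one.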
---
### 7. Captions fast-path (yt-dlp VTT)
When yt-dlp finds YouTube auto-generated captions (VTT), the pipeline **skips Whisper entirely**
and parses the VTT. If VTT parsing returns `[]` (edge case with certain caption formats), the job
completes with empty segments — no whisper involvement.
This can look identical to a whisper 0-segments failure but is a completely different code path.
---
## Root Cause of Empty Segments
**Cold GPU after container restart.**
Right after the Docker container starts and loads the model, the first 1–2 jobs sometimes complete
in ~0.5 s with 0 segments — physically impossible for real audio transcription.
After the GPU warms up (first successful transcription ~25–47 s), all subsequent jobs return full segments.
This is a transient state that resolves on its own. It is **not** caused by:
- Wrong CUDA architecture (GPU is RTX 2080, binary is sm_75 — correct)
- `set_detect_language` (not used)
- Audio preparation issues (direct tests with our prepared WAV return 340 segments)
- Webhook not firing (logs confirmed 200 OK webhook delivery)
---
## Observations
| Observation | Status |
|---|---|
| Backend returns full segments when GPU is warm | ✅ Confirmed |
| Webhook fires once per job with full payload | ✅ Confirmed |
| `json.job_id` (not `json.id`) is the correct response field | ✅ Confirmed |
| Cold-GPU produces 0 segments in ~0.5 s | ✅ Confirmed |
| Flash attention disabled in Dockerfile prevents 0-segment edge cases | ✅ Already done |
| VTT fast-path can produce empty segments if VTT parse fails | ⚠️ Edge case, not investigated further |
---
## What Was NOT Touched (per user request)
- `whisper-rtx2080` Dockerfile, Rust source, or any backend configuration
- Any backend API behaviour
---
## Next Steps (Backend Side — User Handling Separately)
- Monitor first-job-after-restart 0-segment issue
- Optionally: warm up GPU on container start with a small silent WAV
- Consider retrying a job if `segments == []` and `duration_secs > 5`
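
The retry suggestion can be sketched as a small predicate. This is an illustration only: it assumes `duration_secs` is the audio duration reported in the webhook payload, and `looksLikeColdGpuResult`, `shouldRetry`, and the retry cap are hypothetical.

```typescript
interface CompletedJob {
  segments: unknown[];
  duration_secs: number; // assumed: length of the submitted audio in seconds
}

// Empty output for audio longer than a few seconds matches the cold-GPU symptom.
function looksLikeColdGpuResult(job: CompletedJob, minDurationSecs = 5): boolean {
  return job.segments.length === 0 && job.duration_secs > minDurationSecs;
}

// Cap retries so genuinely silent audio does not loop forever.
function shouldRetry(job: CompletedJob, attempt: number, maxAttempts = 2): boolean {
  return attempt < maxAttempts && looksLikeColdGpuResult(job);
}
```

Jobs that took the captions fast-path would need to be excluded, since an empty VTT parse produces the same empty result without involving Whisper.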

3745
package-lock.json generated Normal file

File diff suppressed because it is too large

39
package.json Normal file

@@ -0,0 +1,39 @@
{
"name": "tonemark",
"private": true,
"version": "0.0.1",
"type": "module",
"scripts": {
"dev": "vite dev",
"build": "vite build",
"preview": "vite preview",
"prepare": "svelte-kit sync || echo ''",
"check": "svelte-kit sync && svelte-check --tsconfig ./tsconfig.json",
"check:watch": "svelte-kit sync && svelte-check --tsconfig ./tsconfig.json --watch",
"start": "node build",
"test": "vitest run",
"test:watch": "vitest",
"test:coverage": "vitest run --coverage"
},
"devDependencies": {
"@sveltejs/adapter-auto": "^7.0.1",
"@sveltejs/kit": "^2.57.0",
"@sveltejs/vite-plugin-svelte": "^7.0.0",
"@tailwindcss/vite": "^4.2.4",
"@types/better-sqlite3": "^7.6.13",
"@types/web-push": "^3.6.4",
"@vitest/coverage-v8": "^4.1.5",
"svelte": "^5.55.2",
"svelte-check": "^4.4.6",
"typescript": "^6.0.2",
"vite": "^8.0.7",
"vitest": "^4.1.5"
},
"dependencies": {
"@sveltejs/adapter-node": "^5.5.4",
"better-sqlite3": "^12.9.0",
"form-data": "^4.0.5",
"node-fetch": "^3.3.2",
"web-push": "^3.6.7"
}
}

75
src/app.css Normal file

@@ -0,0 +1,75 @@
@import 'tailwindcss';
/* ── Tonemark design tokens ─────────────────────────────────── */
:root {
--accent: #cdf24e;
--accent-rgb: 205,242,78;
--accent-glow: rgba(205,242,78,0.06);
--accent-dim: rgba(205, 242, 78, 0.12);
--accent-border: rgba(205, 242, 78, 0.3);
--bg: #0c0d10;
--surface: rgba(255, 255, 255, 0.025);
--surface-hover: rgba(255, 255, 255, 0.04);
--border: rgba(255, 255, 255, 0.06);
--text: #e8e9ec;
--text-muted: rgba(232, 233, 236, 0.5);
--text-dim: rgba(232, 233, 236, 0.4);
--font-ui: 'Inter', system-ui, -apple-system, sans-serif;
--font-mono: 'JetBrains Mono', 'SF Mono', ui-monospace, monospace;
--radius-sm: 8px;
--radius-md: 12px;
--radius-lg: 16px;
--rail-width: 200px;
}
/* ── Base ───────────────────────────────────────────────────── */
*, *::before, *::after { box-sizing: border-box; }
html, body {
margin: 0;
padding: 0;
background: var(--bg);
color: var(--text);
font-family: var(--font-ui);
letter-spacing: -0.01em;
min-height: 100vh;
}
body {
background:
radial-gradient(120% 80% at 90% -10%, var(--accent-glow), transparent 60%),
var(--bg);
}
/* ── Glass surface ──────────────────────────────────────────── */
.glass {
background: linear-gradient(180deg, rgba(255,255,255,0.04), rgba(255,255,255,0.015));
border: 1px solid var(--border);
border-radius: var(--radius-lg);
backdrop-filter: blur(20px);
box-shadow: 0 1px 0 rgba(255,255,255,0.04) inset, 0 24px 48px -24px rgba(0,0,0,0.5);
}
/* ── Mono label ─────────────────────────────────────────────── */
.mono { font-family: var(--font-mono); }
.label {
font-family: var(--font-mono);
font-size: 11px;
text-transform: uppercase;
letter-spacing: 0.08em;
color: var(--text-muted);
}
/* ── Scrollbar ──────────────────────────────────────────────── */
::-webkit-scrollbar { width: 4px; }
::-webkit-scrollbar-track { background: transparent; }
::-webkit-scrollbar-thumb { background: rgba(255,255,255,0.1); border-radius: 2px; }
/* ── Focus ring ─────────────────────────────────────────────── */
:focus-visible { outline: 2px solid var(--accent); outline-offset: 2px; }
/* ── Animations ─────────────────────────────────────────────── */
@keyframes pulse { 0%, 100% { opacity: 1; } 50% { opacity: 0.3; } }
@keyframes blink { 0%, 50% { opacity: 1; } 51%, 100% { opacity: 0; } }
@keyframes spin { to { transform: rotate(360deg); } }

13
src/app.d.ts vendored Normal file

@@ -0,0 +1,13 @@
// See https://svelte.dev/docs/kit/types#app.d.ts
// for information about these interfaces
declare global {
namespace App {
// interface Error {}
// interface Locals {}
// interface PageData {}
// interface PageState {}
// interface Platform {}
}
}
export {};

19
src/app.html Normal file

@@ -0,0 +1,19 @@
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<meta name="theme-color" content="#0c0d10" />
<meta name="description" content="Tonemark — fast audio and video transcription powered by Whisper" />
<link rel="manifest" href="/manifest.json" />
<link rel="icon" type="image/svg+xml" href="/favicon.svg" />
<link rel="apple-touch-icon" href="/icons/apple-touch-icon-180.png" />
<link rel="preconnect" href="https://fonts.googleapis.com" />
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin="" />
<link rel="stylesheet" href="https://fonts.googleapis.com/css2?family=Inter:wght@400;500;600;700&family=JetBrains+Mono:wght@400;500;600&display=swap" />
%sveltekit.head%
</head>
<body data-sveltekit-preload-data="hover">
<div style="display: contents">%sveltekit.body%</div>
</body>
</html>

38
src/lib/accent.ts Normal file

@@ -0,0 +1,38 @@
import { browser } from '$app/environment';
import { writable } from 'svelte/store';
export type AccentColor = {
value: string;
label: string;
/** RGB for glow/radial gradient — e.g. "205,242,78" */
rgb: string;
};
export const ACCENT_OPTIONS: AccentColor[] = [
{ value: '#cdf24e', label: 'Lime', rgb: '205,242,78' },
{ value: '#7cd4ff', label: 'Cyan', rgb: '124,212,255' },
{ value: '#ff8a5c', label: 'Coral', rgb: '255,138,92' },
{ value: '#c8a8ff', label: 'Lavender', rgb: '200,168,255' },
{ value: '#ffd86b', label: 'Amber', rgb: '255,216,107' },
];
const STORAGE_KEY = 'tonemark-accent';
const DEFAULT = ACCENT_OPTIONS[0];
function getInitial(): AccentColor {
if (!browser) return DEFAULT;
const stored = localStorage.getItem(STORAGE_KEY);
return ACCENT_OPTIONS.find((a) => a.value === stored) ?? DEFAULT;
}
export const accent = writable<AccentColor>(getInitial());
accent.subscribe((a) => {
if (!browser) return;
localStorage.setItem(STORAGE_KEY, a.value);
const root = document.documentElement;
root.style.setProperty('--accent', a.value);
root.style.setProperty('--accent-rgb', a.rgb);
// Update radial glow on body
root.style.setProperty('--accent-glow', `rgba(${a.rgb},0.06)`);
});

@@ -0,0 +1,6 @@
<svg xmlns="http://www.w3.org/2000/svg" width="32" height="32" viewBox="0 0 32 32">
<rect width="32" height="32" rx="8" fill="#cdf24e"/>
<rect x="6" y="8" width="4" height="16" rx="2" fill="#0c0d10"/>
<rect x="14" y="12" width="4" height="8" rx="2" fill="#0c0d10"/>
<rect x="22" y="6" width="4" height="20" rx="2" fill="#0c0d10"/>
</svg>

@@ -0,0 +1,296 @@
<script lang="ts">
/**
* RecordButton — microphone recorder with live waveform.
* Captures audio via MediaRecorder, shows a live AnalyserNode waveform.
* Calls the `ondone` callback prop with the audio Blob and a generated filename when stopped.
*/
import { onDestroy } from 'svelte';
interface Props {
accent?: string;
audioMode?: string;
ondone?: (blob: Blob, filename: string) => void;
}
let { accent = '#cdf24e', audioMode = 'auto', ondone }: Props = $props();
type RecordState = 'idle' | 'requesting' | 'recording' | 'stopping';
let state = $state<RecordState>('idle');
let error = $state('');
let elapsed = $state(0); // seconds
let liveData = $state<Float32Array | null>(null);
let mediaRecorder: MediaRecorder | null = null;
let chunks: Blob[] = [];
let stream: MediaStream | null = null;
let audioCtx: AudioContext | null = null;
let analyser: AnalyserNode | null = null;
let animFrame: number | null = null;
let timerInterval: ReturnType<typeof setInterval> | null = null;
function formatTime(s: number): string {
const m = Math.floor(s / 60);
const sec = s % 60;
return `${String(m).padStart(2, '0')}:${String(sec).padStart(2, '0')}`;
}
function startAnalyser(src: MediaStream) {
audioCtx = new AudioContext();
analyser = audioCtx.createAnalyser();
analyser.fftSize = 128;
analyser.smoothingTimeConstant = 0.7;
audioCtx.createMediaStreamSource(src).connect(analyser);
const buf = new Float32Array(analyser.frequencyBinCount);
function tick() {
if (!analyser) return;
analyser.getFloatFrequencyData(buf);
// Normalise dB range [-100, 0] → [0, 1]
const norm = new Float32Array(buf.length);
for (let i = 0; i < buf.length; i++) {
norm[i] = Math.max(0, Math.min(1, (buf[i] + 100) / 100));
}
liveData = norm;
animFrame = requestAnimationFrame(tick);
}
tick();
}
async function startRecording() {
error = '';
state = 'requesting';
try {
stream = await navigator.mediaDevices.getUserMedia({ audio: true });
} catch {
error = 'Microphone access denied';
state = 'idle';
return;
}
chunks = [];
mediaRecorder = new MediaRecorder(stream);
mediaRecorder.ondataavailable = (e) => {
if (e.data.size > 0) chunks.push(e.data);
};
mediaRecorder.onstop = () => finalize();
mediaRecorder.start(100);
startAnalyser(stream);
elapsed = 0;
timerInterval = setInterval(() => elapsed++, 1000);
state = 'recording';
}
function stopRecording() {
state = 'stopping';
mediaRecorder?.stop();
if (timerInterval) clearInterval(timerInterval);
if (animFrame) cancelAnimationFrame(animFrame);
stream?.getTracks().forEach((t) => t.stop());
audioCtx?.close();
liveData = null;
}
function finalize() {
const mime = mediaRecorder?.mimeType ?? 'audio/webm';
const ext = mime.includes('ogg') ? 'ogg' : mime.includes('mp4') ? 'mp4' : 'webm';
const blob = new Blob(chunks, { type: mime });
const filename = `recording-${new Date().toISOString().slice(0, 19).replace(/[T:]/g, '-')}.${ext}`;
state = 'idle';
ondone?.(blob, filename);
}
onDestroy(() => {
if (timerInterval) clearInterval(timerInterval);
if (animFrame) cancelAnimationFrame(animFrame);
stream?.getTracks().forEach((t) => t.stop());
audioCtx?.close();
});
// Static waveform heights for idle state display
const IDLE_BARS = 48;
const idleHeights = Array.from(
{ length: IDLE_BARS },
(_, i) => 3 + Math.abs(Math.sin(i * 0.7) + Math.cos(i * 0.31)) * 20
);
</script>
<div class="recorder">
<!-- Waveform display -->
<div class="waveform-area" aria-hidden="true">
{#if state === 'recording' && liveData}
<!-- Live waveform from AnalyserNode -->
<svg viewBox="0 0 {IDLE_BARS * 5} 28" preserveAspectRatio="none" class="waveform-svg">
{#each Array.from(liveData).slice(0, IDLE_BARS) as v, i}
{@const h = 2 + v * 24}
<rect
x={i * 5}
y={(28 - h) / 2}
width="3"
height={h}
rx="1.5"
fill={accent}
opacity={0.5 + v * 0.5}
/>
{/each}
</svg>
{:else}
<!-- Static idle waveform -->
<svg viewBox="0 0 {IDLE_BARS * 5} 28" preserveAspectRatio="none" class="waveform-svg">
{#each idleHeights as h, i}
<rect
x={i * 5}
y={(28 - h) / 2}
width="3"
height={h}
rx="1.5"
fill={state === 'idle' ? 'rgba(255,255,255,0.15)' : accent}
opacity={state === 'idle' ? 1 : 0.3}
/>
{/each}
</svg>
{/if}
</div>
<!-- Timer (recording only) -->
{#if state === 'recording'}
<div class="timer" style="color: {accent}">
<span class="rec-dot" style="background: {accent}"></span>
{formatTime(elapsed)}
</div>
{/if}
<!-- Error -->
{#if error}
<p class="rec-error">{error}</p>
{/if}
<!-- Buttons -->
<div class="btn-row">
{#if state === 'idle' || state === 'requesting'}
<button
class="btn-record"
style="background: {accent}; color: #0c0d10;"
onclick={startRecording}
disabled={state === 'requesting'}
aria-label="Start recording"
>
{#if state === 'requesting'}
<svg width="13" height="13" viewBox="0 0 13 13" style="animation: spin 1s linear infinite">
<circle cx="6.5" cy="6.5" r="5" stroke="currentColor" stroke-width="1.5" fill="none" stroke-dasharray="20 12"/>
</svg>
Requesting…
{:else}
<svg width="13" height="13" viewBox="0 0 13 13">
<circle cx="6.5" cy="6.5" r="3" fill="currentColor" />
</svg>
Record
{/if}
</button>
{:else if state === 'recording'}
<button
class="btn-stop"
onclick={stopRecording}
aria-label="Stop recording and submit"
>
<svg width="12" height="12" viewBox="0 0 12 12">
<rect x="2" y="2" width="8" height="8" rx="1.5" fill="currentColor" />
</svg>
Stop &amp; Transcribe
</button>
{:else}
<div class="btn-record" style="background: rgba(255,255,255,0.06); color: var(--text-muted);">
<svg width="13" height="13" viewBox="0 0 13 13" style="animation: spin 1s linear infinite">
<circle cx="6.5" cy="6.5" r="5" stroke="currentColor" stroke-width="1.5" fill="none" stroke-dasharray="20 12"/>
</svg>
Saving…
</div>
{/if}
</div>
</div>
<style>
.recorder {
display: flex;
flex-direction: column;
gap: 12px;
}
.waveform-area {
height: 28px;
overflow: hidden;
}
.waveform-svg {
width: 100%;
height: 28px;
display: block;
}
.timer {
display: flex;
align-items: center;
gap: 8px;
font-family: var(--font-mono);
font-size: 14px;
font-weight: 600;
letter-spacing: 0.04em;
}
.rec-dot {
width: 7px;
height: 7px;
border-radius: 50%;
animation: pulse 1.2s infinite;
}
.rec-error {
margin: 0;
font-size: 12px;
color: #ff8a8a;
font-family: var(--font-mono);
}
.btn-row {
display: flex;
gap: 8px;
}
.btn-record,
.btn-stop {
flex: 1;
padding: 12px 0;
border-radius: 10px;
border: none;
font-family: inherit;
font-size: 13px;
font-weight: 600;
letter-spacing: -0.01em;
display: flex;
align-items: center;
justify-content: center;
gap: 6px;
cursor: pointer;
transition: opacity 0.15s, filter 0.15s;
}
.btn-record:hover:not(:disabled) {
filter: brightness(1.1);
}
.btn-record:disabled {
opacity: 0.6;
cursor: not-allowed;
}
.btn-stop {
background: rgba(255, 138, 138, 0.15);
border: 1px solid rgba(255, 138, 138, 0.3);
color: #ff8a8a;
}
.btn-stop:hover {
background: rgba(255, 138, 138, 0.22);
}
</style>

@@ -0,0 +1,63 @@
<script lang="ts">
interface Props {
kind: 'youtube' | 'audio' | 'video' | 'mic' | 'file';
size?: number;
accent?: string;
}
let { kind, size = 36, accent = '#cdf24e' }: Props = $props();
const tint: Record<string, string> = $derived({
youtube: '#ff5e5e',
audio: accent,
video: '#7aa2ff',
mic: accent,
file: '#c5c8d3'
});
const color = $derived(tint[kind] ?? '#c5c8d3');
const r = $derived(size * 0.3);
</script>
<div
style="
width: {size}px; height: {size}px;
border-radius: {r}px;
background: color-mix(in oklab, {color} 12%, transparent);
border: 1px solid color-mix(in oklab, {color} 25%, transparent);
color: {color};
display: flex; align-items: center; justify-content: center;
flex-shrink: 0;
"
>
{#if kind === 'youtube'}
<svg width={size * 0.55} height={size * 0.55} viewBox="0 0 16 16" fill="none">
<path d="M2 4.5C2 3.4 2.9 2.5 4 2.5H12C13.1 2.5 14 3.4 14 4.5V11.5C14 12.6 13.1 13.5 12 13.5H4C2.9 13.5 2 12.6 2 11.5V4.5Z"
stroke="currentColor" stroke-width="1.3"/>
<path d="M7 6L10.5 8L7 10V6Z" fill="currentColor"/>
</svg>
{:else if kind === 'audio'}
<svg width={size * 0.55} height={size * 0.55} viewBox="0 0 16 16" fill="none">
<rect x="3" y="6" width="1.5" height="4" rx="0.7" fill="currentColor"/>
<rect x="5.5" y="4" width="1.5" height="8" rx="0.7" fill="currentColor"/>
<rect x="8" y="2" width="1.5" height="12" rx="0.7" fill="currentColor"/>
<rect x="10.5" y="5" width="1.5" height="6" rx="0.7" fill="currentColor"/>
</svg>
{:else if kind === 'video'}
<svg width={size * 0.55} height={size * 0.55} viewBox="0 0 16 16" fill="none">
<rect x="2" y="4" width="9" height="8" rx="1.5" stroke="currentColor" stroke-width="1.3"/>
<path d="M11 7L14 5V11L11 9V7Z" fill="currentColor"/>
</svg>
{:else if kind === 'mic'}
<svg width={size * 0.55} height={size * 0.55} viewBox="0 0 16 16" fill="none">
<rect x="5.5" y="1.5" width="5" height="7" rx="2.5" stroke="currentColor" stroke-width="1.3"/>
<path d="M3 7.5C3 10.5 5 12.5 8 12.5s5-2 5-5" stroke="currentColor" stroke-width="1.3" stroke-linecap="round"/>
<line x1="8" y1="12.5" x2="8" y2="14.5" stroke="currentColor" stroke-width="1.3" stroke-linecap="round"/>
</svg>
{:else}
<svg width={size * 0.55} height={size * 0.55} viewBox="0 0 16 16" fill="none">
<path d="M4 2H10L13 5V13C13 13.55 12.55 14 12 14H4C3.45 14 3 13.55 3 13V3C3 2.45 3.45 2 4 2Z"
stroke="currentColor" stroke-width="1.3"/>
</svg>
{/if}
</div>

@@ -0,0 +1,93 @@
<script lang="ts">
/**
* Waveform — decorative or progress-indicating bar chart.
*
* progress: 0–100. Bars up to that point are colored `accent`; the rest are dim.
* pattern: controls bar height distribution per audio processing mode.
* liveData: if provided, overrides the static pattern with real analyser data (Float32Array, values 0–1).
*/
interface Props {
bars?: number;
progress?: number;
accent?: string;
height?: number;
pattern?: 'default' | 'flat' | 'medium' | 'aggressive' | 'auto';
liveData?: Float32Array | null;
gap?: number;
}
let {
bars = 80,
progress = 0,
accent = '#cdf24e',
height = 40,
pattern = 'default',
liveData = null,
gap = 1.5
}: Props = $props();
// Deterministic bar height from index — no Math.random() to avoid hydration mismatch
function staticHeight(i: number): number {
const t = i / bars;
switch (pattern) {
case 'flat':
// None / Raw: very quiet, almost flat
return 2 + Math.abs(Math.sin(i * 1.2)) * 3;
case 'medium':
// Standard: gentle variation
return 3 + Math.abs(Math.sin(i * 0.6) + Math.cos(i * 0.25) * 0.5) * (height * 0.45);
case 'aggressive':
// Aggressive: loud peaks with lots of energy
return 3 + Math.abs(Math.sin(i * 0.42) * Math.cos(i * 0.17)) * (height * 0.9);
case 'auto':
// Auto: mixed natural-sounding
return 3 + Math.abs(Math.sin(i * 0.7) + Math.cos(i * 0.31)) * (height * 0.55);
default:
// Default / decorative waveform
return 6 + Math.abs(Math.sin(i * 0.7) + Math.cos(i * 0.31)) * (height * 0.55);
}
}
// Derived bar data — uses liveData if available, otherwise static
const barData = $derived.by(() => {
if (liveData && liveData.length > 0) {
// Downsample / upsample liveData to `bars` length
return Array.from({ length: bars }, (_, i) => {
const srcIdx = Math.floor((i / bars) * liveData.length);
return 2 + liveData[srcIdx] * (height - 2);
});
}
return Array.from({ length: bars }, (_, i) => staticHeight(i));
});
// Threshold index where accent color ends
const threshold = $derived(Math.floor((progress / 100) * bars));
</script>
<svg
viewBox="0 0 {bars * (100 / bars) * (1 + gap / 10)} {height}"
preserveAspectRatio="none"
style="width: 100%; height: {height}px; display: block;"
aria-hidden="true"
>
{#each barData as h, i}
{@const x = i * (100 / bars) * (1 + gap / 10)}
{@const w = Math.max(1, (100 / bars) * 0.72)}
{@const y = (height - h) / 2}
{@const colored = i < threshold}
<rect
{x}
{y}
width={w}
height={h}
rx="1"
fill={colored
? accent
: progress > 0
? `rgba(255,255,255,0.08)`
: `rgba(255,255,255,${0.05 + (i % 3) * 0.03})`}
opacity={colored ? (0.3 + (i / bars) * 0.7) : 1}
/>
{/each}
</svg>

1
src/lib/index.ts Normal file

@@ -0,0 +1 @@
// place files you want to import through the `$lib` alias in this folder.

153
src/lib/server/audio.ts Normal file

@@ -0,0 +1,153 @@
import { execFile } from 'child_process';
import { promisify } from 'util';
import { existsSync } from 'fs';
import { mkdir, unlink, rename } from 'fs/promises';
import { join } from 'path';
import type { AudioMode, AudioAnalysis } from '$lib/types.js';
const execFileAsync = promisify(execFile);
const TMP_DIR = join(process.env.DATA_DIR ?? '/tmp/.whisper-pwa', 'audio');
export async function ensureTmpDir() {
if (!existsSync(TMP_DIR)) await mkdir(TMP_DIR, { recursive: true });
}
export function tmpPath(jobId: string, suffix: string) {
return join(TMP_DIR, `${jobId}${suffix}`);
}
export async function cleanup(...paths: string[]) {
await Promise.allSettled(paths.map((p) => unlink(p).catch(() => {})));
}
/** Run ffmpeg volumedetect and return mean/max dB. */
export async function analyzeVolume(inputPath: string): Promise<AudioAnalysis> {
const { stderr } = await execFileAsync('ffmpeg', [
'-i', inputPath,
'-af', 'volumedetect',
'-vn', '-sn', '-dn',
'-f', 'null', '-'
]);
const meanMatch = stderr.match(/mean_volume:\s*([-\d.]+)\s*dB/);
const maxMatch = stderr.match(/max_volume:\s*([-\d.]+)\s*dB/);
return {
meanVolume: meanMatch ? parseFloat(meanMatch[1]) : -99,
maxVolume: maxMatch ? parseFloat(maxMatch[1]) : -99
};
}
/**
* Detect leading silence duration (ms).
* Only trims if silence begins at/near time 0 (< 0.5s).
* Capped at 30s to prevent accidental over-trimming.
*/
async function detectLeadingSilenceMs(inputPath: string): Promise<number> {
try {
const { stderr } = await execFileAsync('ffmpeg', [
'-i', inputPath,
'-af', 'silencedetect=n=-40dB:d=0.1',
'-vn', '-sn', '-dn',
'-f', 'null', '-'
]);
// ffmpeg can report a slightly negative silence_start at the very beginning, so allow a sign.
const startMatch = stderr.match(/silence_start:\s*(-?[\d.]+)/);
const endMatch = stderr.match(/silence_end:\s*(-?[\d.]+)/);
// Only trim if silence genuinely starts at the very beginning of the file
if (startMatch && endMatch && parseFloat(startMatch[1]) < 0.5) {
return Math.min(Math.floor(parseFloat(endMatch[1]) * 1000), 30_000);
}
} catch {
// ignore
}
return 0;
}
/** Build ffmpeg -af filter chain for the given mode and mean volume. */
export function buildFilterChain(mode: AudioMode, meanVolume: number): string | null {
const isQuiet = meanVolume < -30;
switch (mode) {
case 'none':
return null;
case 'standard':
return 'highpass=f=80,lowpass=f=8000,loudnorm=I=-16:LRA=11:TP=-1.5';
case 'aggressive':
return [
'highpass=f=80',
isQuiet ? 'volume=24dB,dynaudnorm=f=500:g=15' : null,
'lowpass=f=8000',
'afftdn=nf=-30',
'agate=threshold=0.01:attack=5:release=50',
'loudnorm=I=-16:LRA=11:TP=-1.5'
]
.filter(Boolean)
.join(',');
case 'auto':
default:
if (isQuiet) {
return [
'highpass=f=80',
'volume=24dB',
'dynaudnorm=f=500:g=15',
'lowpass=f=8000',
'afftdn=nf=-25',
'loudnorm=I=-16:LRA=11:TP=-1.5'
].join(',');
}
return 'highpass=f=80,lowpass=f=8000,loudnorm=I=-16:LRA=11:TP=-1.5';
}
}
/**
* Prepare audio for Whisper: convert to 16kHz mono WAV, trim leading silence,
* apply the appropriate filter chain.
* Returns path to the prepared WAV file.
*/
export async function prepareAudio(
inputPath: string,
jobId: string,
mode: AudioMode
): Promise<{ wavPath: string; analysis: AudioAnalysis }> {
await ensureTmpDir();
// Step 1: analyse volume on the original file
const analysis = await analyzeVolume(inputPath);
// Step 2: detect leading silence
const silenceMs = await detectLeadingSilenceMs(inputPath);
const wavPath = tmpPath(jobId, '.wav');
const filterChain = buildFilterChain(mode, analysis.meanVolume);
const args: string[] = ['-y'];
// Trim leading silence
if (silenceMs > 0) {
args.push('-ss', (silenceMs / 1000).toFixed(3));
}
args.push('-i', inputPath, '-ar', '16000', '-ac', '1');
if (filterChain) {
args.push('-af', filterChain);
}
args.push('-c:a', 'pcm_s16le', wavPath);
await execFileAsync('ffmpeg', args);
return { wavPath, analysis };
}
/** Move a file to a new path (cross-device safe). */
export async function moveFile(src: string, dest: string) {
try {
await rename(src, dest);
} catch {
const { copyFile } = await import('fs/promises');
await copyFile(src, dest);
await unlink(src);
}
}
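
The stderr parsing behind `analyzeVolume` and `detectLeadingSilenceMs` can be tried without running ffmpeg. The sketch below is illustrative only: the sample log text and the helper names `parseVolume` and `parseLeadingSilenceMs` are assumptions, and the silence pattern additionally tolerates a leading minus sign.

```typescript
// Hypothetical sample of the lines ffmpeg prints to stderr; real logs carry more noise.
const sampleStderr = [
  '[silencedetect @ 0x1] silence_start: 0',
  '[silencedetect @ 0x1] silence_end: 1.25 | silence_duration: 1.25',
  '[Parsed_volumedetect_0 @ 0x2] mean_volume: -31.4 dB',
  '[Parsed_volumedetect_0 @ 0x2] max_volume: -6.0 dB'
].join('\n');

interface VolumeStats {
  meanVolume: number;
  maxVolume: number;
}

// Extract mean/max volume, falling back to -99 dB when a value is missing.
function parseVolume(stderr: string): VolumeStats {
  const mean = stderr.match(/mean_volume:\s*([-\d.]+)\s*dB/);
  const max = stderr.match(/max_volume:\s*([-\d.]+)\s*dB/);
  return {
    meanVolume: mean ? parseFloat(mean[1]) : -99,
    maxVolume: max ? parseFloat(max[1]) : -99
  };
}

// Return the leading silence in ms (capped at 30 s), or 0 when the first
// silent stretch does not start within the first half second.
function parseLeadingSilenceMs(stderr: string): number {
  const start = stderr.match(/silence_start:\s*(-?[\d.]+)/);
  const end = stderr.match(/silence_end:\s*(-?[\d.]+)/);
  if (start && end && parseFloat(start[1]) < 0.5) {
    return Math.min(Math.floor(parseFloat(end[1]) * 1000), 30_000);
  }
  return 0;
}

const stats = parseVolume(sampleStderr);
const leadingMs = parseLeadingSilenceMs(sampleStderr);
// A mean below -30 dB is what buildFilterChain treats as "quiet".
console.log(stats, leadingMs, stats.meanVolume < -30 ? 'quiet' : 'normal');
```

Keeping the parsing in pure functions like these would make the thresholds (0.5 s start, 30 s cap, -30 dB quiet cutoff) easy to unit test separately from the ffmpeg calls.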

137
src/lib/server/db.ts Normal file

@@ -0,0 +1,137 @@
import Database from 'better-sqlite3';
import { randomUUID } from 'crypto';
import { existsSync, mkdirSync } from 'fs';
import { join } from 'path';
import type { Job, JobStatus, AudioMode, PushSubscription } from '$lib/types.js';
const DATA_DIR = process.env.DATA_DIR ?? join(process.env.HOME ?? '/tmp', '.whisper-pwa');
if (!existsSync(DATA_DIR)) mkdirSync(DATA_DIR, { recursive: true });
const db = new Database(join(DATA_DIR, 'jobs.db'));
db.pragma('journal_mode = WAL');
db.pragma('foreign_keys = ON');
db.exec(`
CREATE TABLE IF NOT EXISTS jobs (
id TEXT PRIMARY KEY,
status TEXT NOT NULL DEFAULT 'pending',
title TEXT NOT NULL DEFAULT '',
source TEXT NOT NULL DEFAULT '',
audio_mode TEXT NOT NULL DEFAULT 'auto',
mean_volume REAL,
whisper_job_id TEXT,
progress INTEGER NOT NULL DEFAULT 0,
output_dir TEXT,
segments_json TEXT,
error TEXT,
created_at TEXT NOT NULL DEFAULT (datetime('now')),
updated_at TEXT NOT NULL DEFAULT (datetime('now'))
);
CREATE TABLE IF NOT EXISTS push_subscriptions (
id INTEGER PRIMARY KEY AUTOINCREMENT,
endpoint TEXT NOT NULL UNIQUE,
p256dh TEXT NOT NULL,
auth TEXT NOT NULL,
created_at TEXT NOT NULL DEFAULT (datetime('now'))
);
`);
const stmts = {
insertJob: db.prepare(`
INSERT INTO jobs (id, status, title, source, audio_mode)
VALUES (@id, 'pending', @title, @source, @audioMode)
`),
updateJob: db.prepare(`
UPDATE jobs SET
status = @status,
title = @title,
mean_volume = @meanVolume,
whisper_job_id = @whisperJobId,
progress = @progress,
output_dir = @outputDir,
segments_json = @segmentsJson,
error = @error,
updated_at = datetime('now')
WHERE id = @id
`),
setStatus: db.prepare(`
UPDATE jobs SET status = @status, progress = @progress, updated_at = datetime('now')
WHERE id = @id
`),
getJob: db.prepare('SELECT * FROM jobs WHERE id = ?'),
listJobs: db.prepare('SELECT * FROM jobs ORDER BY created_at DESC, rowid DESC LIMIT 100'),
upsertSub: db.prepare(`
INSERT INTO push_subscriptions (endpoint, p256dh, auth)
VALUES (@endpoint, @p256dh, @auth)
ON CONFLICT(endpoint) DO UPDATE SET p256dh = excluded.p256dh, auth = excluded.auth
`),
allSubs: db.prepare('SELECT * FROM push_subscriptions'),
deleteSub: db.prepare('DELETE FROM push_subscriptions WHERE endpoint = ?')
};
function rowToJob(row: Record<string, unknown>): Job {
return {
id: row.id as string,
status: row.status as JobStatus,
title: row.title as string,
source: row.source as string,
audioMode: row.audio_mode as AudioMode,
meanVolume: row.mean_volume as number | null,
whisperJobId: row.whisper_job_id as string | null,
progress: row.progress as number,
outputDir: row.output_dir as string | null,
segmentsJson: row.segments_json as string | null,
error: row.error as string | null,
createdAt: row.created_at as string,
updatedAt: row.updated_at as string
};
}
export function createJob(source: string, title: string, audioMode: AudioMode): Job {
const id = randomUUID();
stmts.insertJob.run({ id, title, source, audioMode });
return getJob(id)!;
}
export function getJob(id: string): Job | null {
const row = stmts.getJob.get(id) as Record<string, unknown> | undefined;
return row ? rowToJob(row) : null;
}
export function listJobs(): Job[] {
return (stmts.listJobs.all() as Record<string, unknown>[]).map(rowToJob);
}
export function updateJob(job: Partial<Job> & { id: string }): void {
const current = getJob(job.id);
if (!current) return;
const merged = { ...current, ...job };
stmts.updateJob.run({
id: merged.id,
status: merged.status,
title: merged.title,
meanVolume: merged.meanVolume,
whisperJobId: merged.whisperJobId,
progress: merged.progress,
outputDir: merged.outputDir,
segmentsJson: merged.segmentsJson,
error: merged.error
});
}
export function setJobStatus(id: string, status: JobStatus, progress = 0): void {
stmts.setStatus.run({ id, status, progress });
}
export function savePushSubscription(sub: { endpoint: string; p256dh: string; auth: string }): void {
stmts.upsertSub.run(sub);
}
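export function getAllSubscriptions(): PushSubscription[] {
  // Rows use snake_case columns; map created_at to the camelCase field on PushSubscription.
  return (stmts.allSubs.all() as Record<string, unknown>[]).map((row) => ({
    id: row.id as number,
    endpoint: row.endpoint as string,
    p256dh: row.p256dh as string,
    auth: row.auth as string,
    createdAt: row.created_at as string
  }));
}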
export function deletePushSubscription(endpoint: string): void {
stmts.deleteSub.run(endpoint);
}
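
`updateJob` reads the current row, spreads the patch over it, and writes every column back. A minimal in-memory sketch of that read-merge-write pattern follows; the `JobRecord` shape, the `store` map, and `patchJob` are hypothetical stand-ins for the SQLite-backed code.

```typescript
// Hypothetical, trimmed-down job shape used only for this illustration.
interface JobRecord {
  id: string;
  status: string;
  progress: number;
  error: string | null;
}

// Stand-in for the jobs table.
const store = new Map<string, JobRecord>();

// Merge a partial patch over the stored record; unknown ids are ignored.
function patchJob(patch: Partial<JobRecord> & { id: string }): JobRecord | null {
  const current = store.get(patch.id);
  if (!current) return null;
  const merged: JobRecord = { ...current, ...patch };
  store.set(merged.id, merged);
  return merged;
}

store.set('a1', { id: 'a1', status: 'pending', progress: 0, error: null });
const patched = patchJob({ id: 'a1', status: 'transcribing', progress: 42 });
const missing = patchJob({ id: 'nope', status: 'done' });
console.log(patched, missing);
```

One caveat of the spread-based merge: a key that is present in the patch with the value `undefined` still overwrites the stored value, so callers should omit keys they do not intend to change.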

38
src/lib/server/docker.ts Normal file

@@ -0,0 +1,38 @@
import { execFile } from 'child_process';
import { promisify } from 'util';
import { existsSync } from 'fs';
const execFileAsync = promisify(execFile);
const COMPOSE_FILE = process.env.WHISPER_COMPOSE_FILE ?? '';
const WHISPER_URL = process.env.WHISPER_URL ?? 'http://localhost:8080';
/** Poll /health until ready or timeout (ms). */
export async function waitForHealth(timeoutMs = 60_000): Promise<boolean> {
const { default: fetch } = await import('node-fetch');
const deadline = Date.now() + timeoutMs;
while (Date.now() < deadline) {
try {
const res = await fetch(`${WHISPER_URL}/health`, { signal: AbortSignal.timeout(2000) });
if (res.ok) return true;
} catch { /* not ready yet */ }
await new Promise((r) => setTimeout(r, 2000));
}
return false;
}
/** Start the whisper-rtx2080 container if a compose file is configured. */
export async function ensureWhisperRunning(): Promise<void> {
if (!COMPOSE_FILE || !existsSync(COMPOSE_FILE)) {
// No compose file configured — assume server is already running
return;
}
const healthy = await waitForHealth(3000);
if (healthy) return;
console.log('[docker] Starting whisper-rtx2080...');
await execFileAsync('docker', ['compose', '-f', COMPOSE_FILE, 'up', '-d']);
const ok = await waitForHealth(120_000);
if (!ok) throw new Error('whisper-rtx2080 did not become healthy within 2 minutes');
console.log('[docker] whisper-rtx2080 is ready');
}

162
src/lib/server/downloader.ts Normal file

@@ -0,0 +1,162 @@
import { execFile } from 'child_process';
import { promisify } from 'util';
import { existsSync } from 'fs';
import { mkdir, unlink, writeFile } from 'fs/promises';
import { join } from 'path';
const execFileAsync = promisify(execFile);
const TMP_DIR = join(process.env.DATA_DIR ?? '/tmp/.whisper-pwa', 'downloads');
export async function ensureTmpDir() {
if (!existsSync(TMP_DIR)) await mkdir(TMP_DIR, { recursive: true });
}
export interface CaptionResult {
type: 'captions';
segments: Array<{ index: number; start: number; end: number; text: string; words: [] }>;
title: string;
}
export interface AudioResult {
type: 'audio';
audioPath: string;
title: string;
}
export type DownloadResult = CaptionResult | AudioResult;
/** Try to get auto-generated captions from YouTube. Returns null if unavailable. */
async function tryGetCaptions(url: string, outDir: string): Promise<CaptionResult | null> {
  try {
    await execFileAsync('yt-dlp', [
      '--write-auto-subs',
      '--sub-langs', 'en.*',
      '--skip-download',
      '--write-info-json',
      '--no-playlist',
      '-o', join(outDir, '%(title)s.%(ext)s'),
      url
    ]);
    const { readdir, readFile } = await import('fs/promises');
    const files = await readdir(outDir);
    // Find the VTT/SRT subtitle file
    const subFile = files.find((f) => f.endsWith('.vtt') || f.endsWith('.srt'));
    if (!subFile) return null;
    // With this output template yt-dlp names the metadata file "<title>.info.json",
    // so look it up by suffix rather than by a fixed "info.json" name.
    let title = 'Untitled';
    const jsonFile = files.find((f) => f.endsWith('.info.json'));
    if (jsonFile) {
      try {
        title = JSON.parse(await readFile(join(outDir, jsonFile), 'utf8')).title ?? title;
      } catch { /* ignore */ }
    }
    const content = await readFile(join(outDir, subFile), 'utf8');
    const segments = parseVtt(content);
    if (segments.length === 0) return null;
    return { type: 'captions', segments, title };
  } catch {
    return null;
  }
}
/** Download best audio from YouTube. Returns path to audio file. */
async function downloadAudio(url: string, outDir: string): Promise<{ audioPath: string; title: string }> {
await execFileAsync('yt-dlp', [
'-f', 'bestaudio',
'--write-info-json',
'--no-playlist',
'-o', join(outDir, 'audio.%(ext)s'),
url
]);
const { readdirSync, readFileSync } = await import('fs');
const files = readdirSync(outDir);
const audioFile = files.find((f) => f.startsWith('audio.') && !f.endsWith('.json'));
if (!audioFile) throw new Error('yt-dlp did not produce an audio file');
let title = 'Untitled';
const jsonFile = files.find((f) => f.endsWith('.info.json'));
if (jsonFile) {
try {
title = JSON.parse(readFileSync(join(outDir, jsonFile), 'utf8')).title ?? title;
} catch { /* ignore */ }
}
return { audioPath: join(outDir, audioFile), title };
}
/** Download a YouTube URL: try captions first, fall back to audio. */
export async function downloadYouTube(url: string, jobId: string): Promise<DownloadResult> {
await ensureTmpDir();
const outDir = join(TMP_DIR, jobId);
await mkdir(outDir, { recursive: true });
const captions = await tryGetCaptions(url, outDir);
if (captions) return captions;
const { audioPath, title } = await downloadAudio(url, outDir);
return { type: 'audio', audioPath, title };
}
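/** Save an uploaded file to tmp dir and return its path. */
export async function saveUploadedFile(
  buffer: Buffer,
  filename: string,
  jobId: string
): Promise<string> {
  await ensureTmpDir();
  const outDir = join(TMP_DIR, jobId);
  await mkdir(outDir, { recursive: true });
  // The name comes from the client: keep only its last path component so a
  // value such as "../../x" cannot escape the job directory.
  const { basename } = await import('path');
  const safeName = basename(filename);
  const dest = join(outDir, safeName && safeName !== '.' && safeName !== '..' ? safeName : 'upload');
  await writeFile(dest, buffer);
  return dest;
}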
export async function cleanupJobTmp(jobId: string) {
const outDir = join(TMP_DIR, jobId);
try {
const { rm } = await import('fs/promises');
await rm(outDir, { recursive: true, force: true });
} catch { /* ignore */ }
}
/** Parse a WebVTT string into segments. */
function parseVtt(
content: string
): Array<{ index: number; start: number; end: number; text: string; words: [] }> {
const segments: Array<{ index: number; start: number; end: number; text: string; words: [] }> = [];
const blocks = content.split(/\n\n+/);
let index = 0;
for (const block of blocks) {
const lines = block.trim().split('\n');
const timeLine = lines.find((l) => l.includes('-->'));
if (!timeLine) continue;
const [startStr, endStr] = timeLine.split('-->').map((s) => s.trim().split(' ')[0]);
const start = vttTimeToSec(startStr);
const end = vttTimeToSec(endStr);
const text = lines
.filter((l) => !l.includes('-->') && !/^\d+$/.test(l.trim()) && l.trim())
.join(' ')
.replace(/<[^>]+>/g, '')
.trim();
if (text) {
segments.push({ index: index++, start, end, text, words: [] });
}
}
return segments;
}
function vttTimeToSec(t: string): number {
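// SRT timestamps use a comma before the milliseconds; normalise it so Number() can parse the seconds.
const parts = t.replace(',', '.').split(':').map(Number);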
if (parts.length === 3) return parts[0] * 3600 + parts[1] * 60 + parts[2];
if (parts.length === 2) return parts[0] * 60 + parts[1];
return parts[0];
}
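
The timing-line handling in `parseVtt` can be illustrated in isolation. This is a sketch under assumptions: `cueTimeToSeconds` and `parseTimingLine` are hypothetical names, and the sample cue line only imitates the cue settings that often trail the end time in auto-generated captions.

```typescript
// Convert "HH:MM:SS.mmm", "MM:SS.mmm", or the SRT form "HH:MM:SS,mmm" to seconds.
function cueTimeToSeconds(t: string): number {
  const parts = t.trim().replace(',', '.').split(':').map(Number);
  // Fold left: each earlier field is worth 60x the next one.
  return parts.reduce((total, part) => total * 60 + part, 0);
}

// Split a cue timing line into start/end seconds, ignoring trailing cue settings.
function parseTimingLine(line: string): { start: number; end: number } | null {
  const [startStr, endStr] = line.split('-->').map((s) => s.trim().split(/\s+/)[0]);
  if (!startStr || !endStr) return null;
  return { start: cueTimeToSeconds(startStr), end: cueTimeToSeconds(endStr) };
}

const timing = parseTimingLine('00:00:01.000 --> 00:00:03.500 align:start position:0%');
console.log(timing, cueTimeToSeconds('00:01:02.500'), cueTimeToSeconds('00:00:01,250'));
```

Folding over the fields handles two-part and three-part timestamps with the same code path, which avoids a separate branch per length.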

98
src/lib/server/formatter.ts Normal file

@@ -0,0 +1,98 @@
import { writeFile, mkdir } from 'fs/promises';
import { join } from 'path';
import type { Segment } from '$lib/types.js';
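function secToSrtTime(sec: number): string {
  // Round once to whole milliseconds so the ms field can never reach 1000
  // (for example 1.9996 s must become 00:00:02,000, not 00:00:01,1000).
  const totalMs = Math.round(sec * 1000);
  const h = Math.floor(totalMs / 3_600_000);
  const m = Math.floor((totalMs % 3_600_000) / 60_000);
  const s = Math.floor((totalMs % 60_000) / 1000);
  const ms = totalMs % 1000;
  return `${String(h).padStart(2, '0')}:${String(m).padStart(2, '0')}:${String(s).padStart(2, '0')},${String(ms).padStart(3, '0')}`;
}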
function secToVttTime(sec: number): string {
return secToSrtTime(sec).replace(',', '.');
}
export function buildSrt(segments: Segment[]): string {
return segments
.map(
(s, i) =>
`${i + 1}\n${secToSrtTime(s.start)} --> ${secToSrtTime(s.end)}\n${s.text.trim()}`
)
.join('\n\n');
}
/** Pure text — no timestamps, clean paragraphs. */
export function buildTxt(segments: Segment[]): string {
const lines: string[] = [];
let para = '';
for (const seg of segments) {
const t = seg.text.trim();
if (!t) continue;
para += (para ? ' ' : '') + t;
// Start a new paragraph on sentence-ending punctuation followed by enough text
if (/[.!?]$/.test(t) && para.length > 200) {
lines.push(para);
para = '';
}
}
if (para) lines.push(para);
return lines.join('\n\n');
}
export function buildMd(segments: Segment[], title: string): string {
const lines: string[] = [`# ${title}\n`];
let lastHeadingSec = -Infinity;
for (const seg of segments) {
// Add a timestamp heading roughly every 5 minutes
if (seg.start - lastHeadingSec >= 300) {
const t = secToVttTime(seg.start).split('.')[0];
lines.push(`\n## ${t}\n`);
lastHeadingSec = seg.start;
}
lines.push(seg.text.trim());
}
return lines.join('\n');
}
export function buildJson(segments: Segment[], title: string): string {
return JSON.stringify({ title, segments }, null, 2);
}
export interface OutputPaths {
srt: string;
txt: string;
md: string;
json: string;
}
/** Write all 4 output formats to OUTPUT_DIR/{safeTitle}/. */
export async function writeOutputs(
segments: Segment[],
title: string,
jobId: string
): Promise<OutputPaths> {
const baseDir = process.env.OUTPUT_DIR ?? join(process.env.HOME ?? '/tmp', 'transcripts');
const safeTitle = title.replace(/[^\w\s-]/g, '').replace(/\s+/g, '_').slice(0, 80) || jobId;
const outDir = join(baseDir, safeTitle);
await mkdir(outDir, { recursive: true });
const paths: OutputPaths = {
srt: join(outDir, `${safeTitle}.srt`),
txt: join(outDir, `${safeTitle}.txt`),
md: join(outDir, `${safeTitle}.md`),
json: join(outDir, `${safeTitle}.json`)
};
await Promise.all([
writeFile(paths.srt, buildSrt(segments), 'utf8'),
writeFile(paths.txt, buildTxt(segments), 'utf8'),
writeFile(paths.md, buildMd(segments, title), 'utf8'),
writeFile(paths.json, buildJson(segments, title), 'utf8')
]);
return paths;
}
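
An SRT cue is an index line, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing line, and the text, with a blank line between cues. The sketch below shows one way to build that output while keeping the millisecond field within 000 to 999 by rounding once to whole milliseconds; `formatSrtTimestamp`, `Cue`, and `cuesToSrt` are hypothetical names used only for illustration.

```typescript
// Format seconds as an SRT timestamp, deriving every field from one
// rounded millisecond count so the fields always agree with each other.
function formatSrtTimestamp(sec: number): string {
  const totalMs = Math.round(sec * 1000);
  const h = Math.floor(totalMs / 3_600_000);
  const m = Math.floor((totalMs % 3_600_000) / 60_000);
  const s = Math.floor((totalMs % 60_000) / 1000);
  const ms = totalMs % 1000;
  const pad = (n: number, width: number): string => String(n).padStart(width, '0');
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
}

// Hypothetical minimal cue shape for the example.
interface Cue {
  start: number;
  end: number;
  text: string;
}

// Join cues into SRT text: 1-based index, timing line, trimmed text.
function cuesToSrt(cues: Cue[]): string {
  return cues
    .map(
      (c, i) =>
        `${i + 1}\n${formatSrtTimestamp(c.start)} --> ${formatSrtTimestamp(c.end)}\n${c.text.trim()}`
    )
    .join('\n\n');
}

const srt = cuesToSrt([
  { start: 0, end: 1.9996, text: ' Hello there. ' },
  { start: 3661.5, end: 3663, text: 'Goodbye.' }
]);
console.log(srt);
```

The first cue ends at 1.9996 s on purpose: rounding the fractional part on its own would yield a four-digit millisecond field, whereas rounding the total carries into the seconds.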

153
src/lib/server/pipeline.ts Normal file

@@ -0,0 +1,153 @@
import { createJob, updateJob, setJobStatus, getJob } from './db.js';
import { downloadYouTube, saveUploadedFile, cleanupJobTmp } from './downloader.js';
import { prepareAudio, cleanup as cleanupFiles } from './audio.js';
import { submitJob, streamJob } from './whisper.js';
import { ensureWhisperRunning } from './docker.js';
import type { AudioMode, Segment } from '$lib/types.js';
const WEBHOOK_BASE_URL = process.env.WEBHOOK_BASE_URL ?? 'http://localhost:3000';
/** Progress listeners: jobId → set of callbacks */
const progressListeners = new Map<string, Set<(data: string) => void>>();
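export function subscribeProgress(jobId: string, cb: (data: string) => void): () => void {
  if (!progressListeners.has(jobId)) progressListeners.set(jobId, new Set());
  progressListeners.get(jobId)!.add(cb);
  return () => {
    const listeners = progressListeners.get(jobId);
    if (!listeners) return;
    listeners.delete(cb);
    // Drop the empty set so finished jobs do not accumulate in the map.
    if (listeners.size === 0) progressListeners.delete(jobId);
  };
}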
export function emitProgress(jobId: string, payload: object) {
const listeners = progressListeners.get(jobId);
if (!listeners) return;
const data = JSON.stringify(payload);
for (const cb of listeners) cb(data);
}
/** Start a transcription job for a YouTube URL. Runs async — returns immediately. */
export async function startYouTubeJob(
url: string,
audioMode: AudioMode = 'auto',
language?: string
): Promise<string> {
const job = createJob(url, 'Downloading…', audioMode);
runJob(job.id, { type: 'youtube', url }, audioMode, language).catch((err) => {
console.error(`[pipeline] job ${job.id} failed:`, err);
});
return job.id;
}
/** Start a transcription job for an uploaded file. Runs async — returns immediately. */
export async function startUploadJob(
buffer: Buffer,
filename: string,
audioMode: AudioMode = 'auto',
language?: string
): Promise<string> {
const job = createJob(filename, filename, audioMode);
runJob(job.id, { type: 'upload', buffer, filename }, audioMode, language).catch((err) => {
console.error(`[pipeline] job ${job.id} failed:`, err);
});
return job.id;
}
async function runJob(
jobId: string,
input: { type: 'youtube'; url: string } | { type: 'upload'; buffer: Buffer; filename: string },
audioMode: AudioMode,
language?: string
) {
let rawAudioPath: string | null = null;
let wavPath: string | null = null;
try {
// ── 1. Download / save input ──────────────────────────────────────────
setJobStatus(jobId, 'downloading', 0);
emitProgress(jobId, { type: 'status', status: 'downloading' });
let title = 'Untitled';
let captionSegments: Segment[] | null = null;
if (input.type === 'youtube') {
const result = await downloadYouTube(input.url, jobId);
if (result.type === 'captions') {
// Fast path — use captions directly
title = result.title;
captionSegments = result.segments;
} else {
rawAudioPath = result.audioPath;
title = result.title;
}
} else {
rawAudioPath = await saveUploadedFile(input.buffer, input.filename, jobId);
title = input.filename.replace(/\.[^.]+$/, '');
}
updateJob({ id: jobId, title });
if (captionSegments) {
// Caption fast path — skip whisper
const { deduplicateSegments } = await import('./postprocess.js');
const { writeOutputs } = await import('./formatter.js');
const segments = deduplicateSegments(captionSegments);
const paths = await writeOutputs(segments, title, jobId);
updateJob({
id: jobId,
status: 'done',
progress: 100,
segmentsJson: JSON.stringify(segments),
outputDir: paths.srt.replace(/\/[^/]+$/, '')
});
emitProgress(jobId, { type: 'done' });
const { sendNotification } = await import('./push.js');
await sendNotification(jobId, '✅ Transcript ready', title);
await cleanupJobTmp(jobId);
return;
}
// ── 2. Prepare audio ─────────────────────────────────────────────────
setJobStatus(jobId, 'preparing', 5);
emitProgress(jobId, { type: 'status', status: 'preparing' });
const { wavPath: wp, analysis } = await prepareAudio(rawAudioPath!, jobId, audioMode);
wavPath = wp;
updateJob({ id: jobId, meanVolume: analysis.meanVolume });
// ── 3. Ensure whisper is running ──────────────────────────────────────
await ensureWhisperRunning();
// ── 4. Submit to whisper with webhook ────────────────────────────────
setJobStatus(jobId, 'transcribing', 10);
emitProgress(jobId, { type: 'status', status: 'transcribing' });
const webhookUrl = `${WEBHOOK_BASE_URL}/api/webhook/${jobId}`;
const whisperJobId = await submitJob(wavPath, webhookUrl, language);
updateJob({ id: jobId, whisperJobId });
// ── 5. Open SSE for live progress (non-blocking relay) ───────────────
streamJob(
whisperJobId,
(percent, chunk, total) => {
const progress = 10 + Math.round(percent * 0.8);
setJobStatus(jobId, 'transcribing', progress);
emitProgress(jobId, { type: 'progress', percent, chunk, total, progress });
},
() => { /* webhook will handle completion */ },
(msg) => {
setJobStatus(jobId, 'failed', 0);
updateJob({ id: jobId, error: msg });
emitProgress(jobId, { type: 'error', message: msg });
}
).catch((err) => console.warn('[pipeline] SSE relay error:', err));
// Clean up wav after submitting (webhook handles the rest)
await cleanupFiles(wavPath);
wavPath = null;
} catch (err: unknown) {
const message = err instanceof Error ? err.message : String(err);
updateJob({ id: jobId, status: 'failed', error: message });
emitProgress(jobId, { type: 'error', message });
if (rawAudioPath) await cleanupFiles(rawAudioPath).catch(() => {});
if (wavPath) await cleanupFiles(wavPath).catch(() => {});
await cleanupJobTmp(jobId);
}
}
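
The progress relay is a small publish/subscribe registry keyed by job id, plus a mapping of whisper's 0 to 100 percent into the 10 to 90 band of the overall job progress. A self-contained sketch, with the names `subscribe`, `emit`, and `overallProgress` chosen for illustration and not taken from the module:

```typescript
type Listener = (data: string) => void;

// jobId -> callbacks interested in that job's events.
const listeners = new Map<string, Set<Listener>>();

// Register a callback and return a function that unregisters it again.
function subscribe(jobId: string, cb: Listener): () => void {
  const set = listeners.get(jobId) ?? new Set<Listener>();
  listeners.set(jobId, set);
  set.add(cb);
  return () => {
    set.delete(cb);
    if (set.size === 0) listeners.delete(jobId);
  };
}

// Serialise the payload once and hand the same string to every listener.
function emit(jobId: string, payload: object): void {
  const set = listeners.get(jobId);
  if (!set) return;
  const data = JSON.stringify(payload);
  set.forEach((cb) => cb(data));
}

// Download and preparation occupy 0-10, transcription 10-90.
function overallProgress(percent: number): number {
  return 10 + Math.round(percent * 0.8);
}

const received: string[] = [];
const unsubscribe = subscribe('job-1', (d) => {
  received.push(d);
});
emit('job-1', { type: 'progress', progress: overallProgress(50) });
unsubscribe();
emit('job-1', { type: 'done' }); // nobody is listening any more
console.log(received);
```

Serialising once per event keeps the cost independent of the number of open SSE connections for a job.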

108
src/lib/server/postprocess.ts Normal file

@@ -0,0 +1,108 @@
import type { Segment } from '$lib/types.js';
// ── Collapse consecutive repeated phrases within a segment's text ────────────
function collapseRepeats(text: string): string {
let prev = '';
// Keep applying until stable
while (true) {
const next = collapseOnce(text);
if (next === prev || next === text) return next;
prev = text;
text = next;
}
}
function collapseOnce(text: string): string {
// Match a phrase of at least 10 characters that is immediately repeated
return text.replace(/\b(.{10,}?)\s+\1\b/gi, '$1');
}
// ── Merge consecutive segments with identical (or near-identical) text ───────
function normalise(s: string) {
return s.toLowerCase().replace(/[^\w\s]/g, '').replace(/\s+/g, ' ').trim();
}
function mergeConsecutive(segments: Segment[]): Segment[] {
const out: Segment[] = [];
for (const seg of segments) {
const last = out[out.length - 1];
if (last && normalise(last.text) === normalise(seg.text)) {
last.end = seg.end;
} else {
out.push({ ...seg });
}
}
return out;
}
// ── N-gram deduplication ─────────────────────────────────────────────────────
const NGRAM_N = 6;
const LOOKBACK_CHARS = 500;
const SIMILARITY_THRESHOLD = 0.6;
function ngrams(text: string, n: number): string[] {
const words = text.toLowerCase().split(/\s+/);
const grams: string[] = [];
for (let i = 0; i <= words.length - n; i++) {
grams.push(words.slice(i, i + n).join(' '));
}
return grams;
}
function jaccardSimilarity(a: string, b: string): number {
const ga = new Set(ngrams(a, NGRAM_N));
const gb = new Set(ngrams(b, NGRAM_N));
// If neither text is long enough to produce n-grams they cannot be compared;
// treat as dissimilar so short segments are never incorrectly discarded.
if (ga.size === 0 && gb.size === 0) return 0;
const intersection = [...ga].filter((g) => gb.has(g)).length;
const union = new Set([...ga, ...gb]).size;
return union === 0 ? 0 : intersection / union;
}
function ngramDedup(segments: Segment[]): Segment[] {
const out: Segment[] = [];
for (const seg of segments) {
const windowText = out
.slice(-20)
.map((s) => s.text)
.join(' ')
.slice(-LOOKBACK_CHARS);
if (windowText.length > 0 && jaccardSimilarity(seg.text, windowText) >= SIMILARITY_THRESHOLD) {
continue; // duplicate — skip
}
out.push(seg);
}
return out;
}
// ── Full deduplication pipeline ──────────────────────────────────────────────
export function deduplicateSegments(segments: Segment[]): Segment[] {
// 1. Collapse repeats within each segment's text
let result = segments.map((s) => ({
...s,
text: collapseRepeats(s.text.trim())
}));
// 2. Remove empty segments
result = result.filter((s) => s.text.length > 0);
// 3. First merge pass
result = mergeConsecutive(result);
// 4. N-gram dedup
result = ngramDedup(result);
// 5. Second merge pass (catches new adjacencies after dedup)
result = mergeConsecutive(result);
// 6. Re-index
result.forEach((s, i) => (s.index = i));
return result;
}
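
The n-gram step scores overlap with the Jaccard index, J(A, B) = |A ∩ B| / |A ∪ B|, over sets of word n-grams. A small worked sketch follows; it uses n = 2 so the numbers stay readable (the module uses 6), and `wordNgrams` and `jaccard` are illustrative names.

```typescript
// Build the set of word n-grams of a text (lower-cased, whitespace-split).
function wordNgrams(text: string, n: number): Set<string> {
  const words = text.toLowerCase().split(/\s+/).filter(Boolean);
  const grams = new Set<string>();
  for (let i = 0; i + n <= words.length; i++) {
    grams.add(words.slice(i, i + n).join(' '));
  }
  return grams;
}

// Jaccard index of the two n-gram sets; 0 when neither text yields any n-gram.
function jaccard(a: string, b: string, n: number): number {
  const ga = wordNgrams(a, n);
  const gb = wordNgrams(b, n);
  let intersection = 0;
  ga.forEach((g) => {
    if (gb.has(g)) intersection++;
  });
  const union = ga.size + gb.size - intersection;
  return union === 0 ? 0 : intersection / union;
}

// Four bigrams each, three shared: 3 / (4 + 4 - 3) = 0.6.
const similarity = jaccard('the quick brown fox jumps', 'the quick brown fox sleeps', 2);
console.log(similarity);
```

Because the union grows with the larger set, comparing a short segment against a much longer window tends to give a low score even when the segment is fully contained in the window, which is worth keeping in mind when choosing the threshold.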

47
src/lib/server/push.ts Normal file

@@ -0,0 +1,47 @@
import webpush from 'web-push';
import { getAllSubscriptions, deletePushSubscription } from './db.js';
let initialised = false;
function init() {
if (initialised) return;
const pub = process.env.VAPID_PUBLIC_KEY;
const priv = process.env.VAPID_PRIVATE_KEY;
const sub = process.env.VAPID_SUBJECT ?? 'mailto:admin@localhost';
if (!pub || !priv) {
console.warn('[push] VAPID keys not set — push notifications disabled');
return;
}
webpush.setVapidDetails(sub, pub, priv);
initialised = true;
}
export function getVapidPublicKey(): string | null {
return process.env.VAPID_PUBLIC_KEY || null;
}
export async function sendNotification(jobId: string, title: string, body: string): Promise<void> {
init();
if (!initialised) return;
const subscriptions = getAllSubscriptions();
const payload = JSON.stringify({ jobId, title, body });
await Promise.allSettled(
subscriptions.map(async (sub) => {
try {
await webpush.sendNotification(
{ endpoint: sub.endpoint, keys: { p256dh: sub.p256dh, auth: sub.auth } },
payload
);
} catch (err: unknown) {
const status = (err as { statusCode?: number }).statusCode;
if (status === 410 || status === 404) {
deletePushSubscription(sub.endpoint);
} else {
console.error('[push] send failed:', err);
}
}
})
);
}

91
src/lib/server/whisper.ts Normal file

@@ -0,0 +1,91 @@
import { execFile } from 'child_process';
import { promisify } from 'util';
const execFileAsync = promisify(execFile);
function whisperUrl() {
return process.env.WHISPER_URL ?? 'http://localhost:8080';
}
/** Submit an audio file to whisper-rtx2080. Returns the whisper job id. */
export async function submitJob(
wavPath: string,
webhookUrl: string,
language?: string
): Promise<string> {
const FormData = (await import('form-data')).default;
const { createReadStream } = await import('fs');
const { default: fetch } = await import('node-fetch');
const form = new FormData();
form.append('audio', createReadStream(wavPath));
form.append('task', 'transcribe');
form.append('webhook_url', webhookUrl);
if (language) form.append('language', language);
const res = await fetch(`${whisperUrl()}/jobs`, {
method: 'POST',
body: form,
headers: form.getHeaders()
});
if (!res.ok) {
const text = await res.text();
throw new Error(`whisper /jobs returned ${res.status}: ${text}`);
}
const json = (await res.json()) as { job_id: string };
return json.job_id;
}
/** Open an SSE stream from whisper and call onProgress/onDone callbacks. */
export async function streamJob(
  whisperJobId: string,
  onProgress: (percent: number, chunk: number, total: number) => void,
  onDone: () => void,
  onError: (msg: string) => void
): Promise<void> {
  const { default: fetch } = await import('node-fetch');
  const res = await fetch(`${whisperUrl()}/jobs/${whisperJobId}/stream`);
  if (!res.ok || !res.body) throw new Error(`SSE stream returned ${res.status}`);
  let buf = '';
  for await (const chunk of res.body) {
    buf += chunk.toString();
    const lines = buf.split('\n');
    buf = lines.pop() ?? ''; // keep the incomplete tail for the next chunk
    // A single chunk can carry several events, so handle every data line
    // instead of only the last one.
    for (const line of lines) {
      if (!line.startsWith('data:')) continue;
      const dataLine = line.slice(5).trim();
      if (!dataLine) continue;
      try {
        const payload = JSON.parse(dataLine);
        if (payload.type === 'progress') {
          onProgress(payload.percent ?? 0, payload.chunk ?? 0, payload.total ?? 0);
        } else if (payload.type === 'done') {
          onDone();
          return;
        } else if (payload.type === 'error') {
          onError(payload.message ?? 'unknown error');
          return;
        }
      } catch { /* ignore parse errors */ }
    }
  }
}
/** Check if the whisper server is healthy. */
export async function checkHealth(): Promise<boolean> {
try {
const { default: fetch } = await import('node-fetch');
const res = await fetch(`${whisperUrl()}/health`, { signal: AbortSignal.timeout(3000) });
return res.ok;
} catch {
return false;
}
}
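
Server-sent events arrive as arbitrary byte chunks, so one chunk may hold several events or only part of one. The sketch below shows the carry-over buffer technique in isolation; `splitSseData` is a hypothetical helper, and it assumes each event has a single-line JSON `data:` field, as the whisper stream appears to.

```typescript
interface SseSplit {
  events: string[]; // complete data payloads found so far
  rest: string; // incomplete tail to prepend to the next chunk
}

// Append a chunk to the carry-over buffer and return every complete
// `data:` payload together with the unfinished remainder.
function splitSseData(buffer: string, chunk: string): SseSplit {
  const lines = (buffer + chunk).split('\n');
  const rest = lines.pop() ?? '';
  const events = lines
    .filter((line) => line.startsWith('data:'))
    .map((line) => line.slice(5).trim())
    .filter((data) => data.length > 0);
  return { events, rest };
}

// The second event is cut in the middle of its JSON payload.
const first = splitSseData('', 'data: {"type":"progress","percent":40}\n\ndata: {"type":"pro');
const second = splitSseData(first.rest, 'gress","percent":80}\n\n');
console.log(first.events, second.events);
```

Returning the remainder instead of mutating shared state keeps the function pure, so the chunk-boundary cases can be tested without a network stream.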

60
src/lib/types.ts Normal file

@@ -0,0 +1,60 @@
export type AudioMode = 'auto' | 'standard' | 'aggressive' | 'none';
export type JobStatus = 'pending' | 'downloading' | 'preparing' | 'transcribing' | 'processing' | 'done' | 'failed' | 'cancelled';
export interface Segment {
index: number;
start: number;
end: number;
text: string;
words: Word[];
}
export interface Word {
start: number;
end: number;
word: string;
probability?: number;
}
export interface Job {
id: string;
status: JobStatus;
title: string;
source: string;
audioMode: AudioMode;
meanVolume: number | null;
whisperJobId: string | null;
progress: number;
outputDir: string | null;
segmentsJson: string | null;
error: string | null;
createdAt: string;
updatedAt: string;
}
export interface WhisperJob {
id: string;
status: 'queued' | 'running' | 'done' | 'failed' | 'cancelled';
language: string | null;
task: string;
duration_secs: number | null;
progress: number;
segments: Segment[];
error: string | null;
created_at: string;
completed_at: string | null;
}
export interface PushSubscription {
id: number;
endpoint: string;
p256dh: string;
auth: string;
createdAt: string;
}
export interface AudioAnalysis {
meanVolume: number;
maxVolume: number;
}

329
src/routes/+layout.svelte Normal file

@@ -0,0 +1,329 @@
<script lang="ts">
import '../app.css';
import { onMount } from 'svelte';
import { browser } from '$app/environment';
import { page } from '$app/stores';
import { accent } from '$lib/accent.js';
let { children } = $props();
// Initialize accent (triggers subscriber which sets CSS vars)
// The store subscriber handles everything; just subscribing here keeps it alive.
$effect(() => { void $accent; });
// Push notification setup
onMount(async () => {
if (!browser || !('serviceWorker' in navigator) || !('PushManager' in window)) return;
try {
const reg = await navigator.serviceWorker.ready;
const res = await fetch('/api/push');
if (!res.ok) return;
const { publicKey } = await res.json();
const existing = await reg.pushManager.getSubscription();
const sub =
existing ??
(await reg.pushManager.subscribe({
userVisibleOnly: true,
applicationServerKey: urlBase64ToUint8Array(publicKey).buffer as ArrayBuffer
}));
await fetch('/api/push', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
endpoint: sub.endpoint,
keys: {
p256dh: arrayBufferToBase64(sub.getKey('p256dh')!),
auth: arrayBufferToBase64(sub.getKey('auth')!)
}
})
});
} catch (err) {
console.warn('[push] setup failed:', err);
}
});
function urlBase64ToUint8Array(base64: string): Uint8Array {
const pad = '='.repeat((4 - (base64.length % 4)) % 4);
const b64 = (base64 + pad).replace(/-/g, '+').replace(/_/g, '/');
const raw = atob(b64);
return Uint8Array.from(raw, (c) => c.charCodeAt(0));
}
function arrayBufferToBase64(buf: ArrayBuffer): string {
return btoa(String.fromCharCode(...new Uint8Array(buf)));
}
// Derive active nav item from current path
const active = $derived.by(() => {
const path = $page.url.pathname;
if (path.startsWith('/jobs')) return 'queue';
if (path.startsWith('/settings')) return 'settings';
return 'home';
});
const navItems = [
{
k: 'home',
label: 'Home',
href: '/',
icon: 'M2 7L8 2.5L14 7V13.5H10V9.5H6V13.5H2V7Z'
},
{
k: 'queue',
label: 'Queue',
href: '/jobs',
icon: 'M2 4H14M2 8H14M2 12H10',
stroke: true
},
{
k: 'settings',
label: 'Settings',
href: '/settings',
icon: 'M8 5.5A2.5 2.5 0 1 1 8 10.5A2.5 2.5 0 0 1 8 5.5ZM8 1V3M8 13V15M3.5 3.5L4.9 4.9M11.1 11.1L12.5 12.5M1 8H3M13 8H15M3.5 12.5L4.9 11.1M11.1 4.9L12.5 3.5',
stroke: true
}
];
</script>
<svelte:head>
<link rel="icon" href="/favicon.svg" type="image/svg+xml" />
<link rel="manifest" href="/manifest.json" />
</svelte:head>
<div class="app-shell">
<!-- ── Rail nav (desktop) ─────────────────────────────── -->
<nav class="rail" aria-label="Main navigation">
<!-- Logo -->
<a href="/" class="logo" aria-label="Tonemark home">
<div class="logo-icon">
<svg width="16" height="16" viewBox="0 0 32 32" fill="none" aria-hidden="true">
<rect x="6" y="8" width="4" height="16" rx="2" fill="#0c0d10" />
<rect x="14" y="12" width="4" height="8" rx="2" fill="#0c0d10" />
<rect x="22" y="6" width="4" height="20" rx="2" fill="#0c0d10" />
</svg>
</div>
<span class="logo-name">Tonemark<span class="logo-dot">·</span></span>
</a>
<!-- Nav items -->
<div class="nav-items">
{#each navItems as item}
<a
href={item.href}
class="nav-item"
class:active={active === item.k}
aria-current={active === item.k ? 'page' : undefined}
>
{#if active === item.k}
<div class="nav-indicator" aria-hidden="true"></div>
{/if}
<svg width="16" height="16" viewBox="0 0 16 16" aria-hidden="true">
<path
d={item.icon}
fill={item.stroke ? 'none' : 'currentColor'}
stroke={item.stroke ? 'currentColor' : 'none'}
stroke-width="1.4"
stroke-linecap="round"
stroke-linejoin="round"
/>
</svg>
{item.label}
</a>
{/each}
</div>
<div class="rail-spacer"></div>
<!-- Status dot -->
<div class="status-pill">
<div class="status-dot"></div>
<span>whisper-large-v3</span>
</div>
</nav>
<!-- ── Main content ───────────────────────────────────── -->
<main class="main-content">
{@render children()}
</main>
<!-- ── Tab bar (mobile) ───────────────────────────────── -->
<nav class="tabbar" aria-label="Mobile navigation">
{#each navItems as item}
<a href={item.href} class="tab-item" class:active={active === item.k} aria-current={active === item.k ? 'page' : undefined}>
<svg width="22" height="22" viewBox="0 0 16 16" aria-hidden="true">
<path
d={item.icon}
fill={item.stroke ? 'none' : 'currentColor'}
stroke={item.stroke ? 'currentColor' : 'none'}
stroke-width="1.4"
stroke-linecap="round"
stroke-linejoin="round"
/>
</svg>
<span>{item.label}</span>
</a>
{/each}
</nav>
</div>
<style>
.app-shell {
display: flex;
min-height: 100vh;
min-height: 100dvh;
}
/* ── Rail ─────────────────────────────────────────────── */
.rail {
width: var(--rail-width);
flex-shrink: 0;
border-right: 1px solid var(--border);
display: flex;
flex-direction: column;
padding: 24px 14px;
background: rgba(255, 255, 255, 0.015);
position: sticky;
top: 0;
height: 100vh;
overflow-y: auto;
}
.logo {
display: flex;
align-items: center;
gap: 10px;
padding: 0 8px 28px;
text-decoration: none;
color: var(--text);
}
.logo-icon {
width: 28px;
height: 28px;
border-radius: 8px;
background: var(--accent);
display: flex;
align-items: center;
justify-content: center;
flex-shrink: 0;
}
.logo-name {
font-size: 15px;
font-weight: 600;
letter-spacing: -0.02em;
}
.logo-dot {
color: rgba(232, 233, 236, 0.4);
font-weight: 400;
}
.nav-items {
display: flex;
flex-direction: column;
gap: 2px;
}
.nav-item {
display: flex;
align-items: center;
gap: 12px;
padding: 9px 12px;
border-radius: 8px;
font-size: 13.5px;
font-weight: 500;
color: var(--text-muted);
text-decoration: none;
position: relative;
transition: background 0.15s, color 0.15s;
}
.nav-item:hover {
background: rgba(255, 255, 255, 0.04);
color: var(--text);
}
.nav-item.active {
background: rgba(255, 255, 255, 0.06);
color: #fff;
}
.nav-indicator {
position: absolute;
left: -14px;
top: 8px;
bottom: 8px;
width: 2px;
border-radius: 2px;
background: var(--accent);
}
.rail-spacer { flex: 1; }
.status-pill {
display: flex;
align-items: center;
gap: 8px;
padding: 10px 14px;
border-radius: 10px;
border: 1px solid var(--border);
font-size: 11px;
font-family: var(--font-mono);
color: var(--text-muted);
}
.status-dot {
width: 6px;
height: 6px;
border-radius: 3px;
background: #5dd47a;
flex-shrink: 0;
}
/* ── Main content ─────────────────────────────────────── */
.main-content {
flex: 1;
min-width: 0;
overflow-y: auto;
}
/* ── Tab bar (mobile only) ────────────────────────────── */
.tabbar {
display: none;
}
/* ── Responsive ───────────────────────────────────────── */
@media (max-width: 768px) {
.rail {
display: none;
}
.app-shell {
flex-direction: column;
padding-bottom: 72px;
}
.tabbar {
display: flex;
position: fixed;
bottom: 0;
left: 0;
right: 0;
border-top: 1px solid var(--border);
background: rgba(12, 13, 16, 0.9);
backdrop-filter: blur(20px);
padding: 10px 0 max(30px, env(safe-area-inset-bottom));
justify-content: space-around;
z-index: 100;
}
.tab-item {
display: flex;
flex-direction: column;
align-items: center;
gap: 4px;
color: rgba(232, 233, 236, 0.4);
text-decoration: none;
transition: color 0.15s;
}
.tab-item.active {
color: var(--accent);
}
.tab-item span {
font-size: 9.5px;
font-family: var(--font-mono);
text-transform: uppercase;
letter-spacing: 0.06em;
}
}
</style>
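
Note on the push helpers in `+layout.svelte`: the VAPID public key arrives as base64url and has to be converted to bytes before `pushManager.subscribe` can use it. The sketch below repeats `urlBase64ToUint8Array` from the file and adds `bytesToBase64`, a hypothetical chunked variant of `arrayBufferToBase64` that is not part of this commit. It is only an illustration of the round trip, assuming a runtime where `atob` and `btoa` are global.

```typescript
// Same conversion as in +layout.svelte: base64url -> padded base64 -> bytes.
function urlBase64ToUint8Array(base64: string): Uint8Array {
  const pad = '='.repeat((4 - (base64.length % 4)) % 4);
  const b64 = (base64 + pad).replace(/-/g, '+').replace(/_/g, '/');
  const raw = atob(b64);
  return Uint8Array.from(raw, (c) => c.charCodeAt(0));
}

// Hypothetical variant (assumption, not in the commit): builds the binary
// string in slices so large buffers never hit the argument-count limit of
// String.fromCharCode(...).
function bytesToBase64(bytes: Uint8Array, chunkSize = 0x8000): string {
  let binary = '';
  for (let i = 0; i < bytes.length; i += chunkSize) {
    binary += String.fromCharCode(...bytes.subarray(i, i + chunkSize));
  }
  return btoa(binary);
}

const decoded = urlBase64ToUint8Array('-_8'); // base64url input, unpadded
const encoded = bytesToBase64(decoded); // standard alphabet, padded
```

The subscription keys in the file are short, so the spread in `arrayBufferToBase64` is safe there; the chunked form only matters if the helper is reused for larger payloads.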

src/routes/+page.svelte Normal file

@@ -0,0 +1,588 @@
<script lang="ts">
import { onMount } from 'svelte';
import type { Job, AudioMode } from '$lib/types.js';
import SourceIcon from '$lib/components/SourceIcon.svelte';
import Waveform from '$lib/components/Waveform.svelte';
import RecordButton from '$lib/components/RecordButton.svelte';
const ACCENT = '#cdf24e';
let url = $state('');
let audioMode = $state<AudioMode>('auto');
let jobs = $state<Job[]>([]);
let submitting = $state(false);
let error = $state('');
let dragOver = $state(false);
let fileInput = $state<HTMLInputElement | null>(null);
const modes: {
value: AudioMode;
label: string;
sub: string;
pattern: 'flat' | 'medium' | 'aggressive' | 'auto';
}[] = [
{ value: 'none', label: 'None', sub: 'Raw', pattern: 'flat' },
{ value: 'standard', label: 'Standard', sub: 'Balanced', pattern: 'medium' },
{ value: 'aggressive', label: 'Aggressive', sub: 'Noisy sources', pattern: 'aggressive' },
{ value: 'auto', label: 'Auto', sub: 'Detect', pattern: 'auto' }
];
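onMount(() => {
// Load once, then poll every 5 s; the returned cleanup stops polling when the page unmounts.
loadJobs();
const timer = setInterval(loadJobs, 5000);
return () => clearInterval(timer);
});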
async function loadJobs() {
try {
const res = await fetch('/api/jobs');
if (res.ok) jobs = await res.json();
} catch { /* ignore */ }
}
async function submit(formData: FormData) {
submitting = true;
error = '';
try {
const res = await fetch('/api/jobs', { method: 'POST', body: formData });
if (!res.ok) throw new Error(await res.text());
const { id } = await res.json();
window.location.href = `/jobs/${id}`;
} catch (e) {
error = e instanceof Error ? e.message : String(e);
submitting = false;
}
}
async function submitUrl() {
if (!url.trim()) return;
const fd = new FormData();
fd.append('url', url.trim());
fd.append('audioMode', audioMode);
await submit(fd);
}
async function handleFile(file: File) {
const fd = new FormData();
fd.append('file', file);
fd.append('audioMode', audioMode);
await submit(fd);
}
async function handleRecording(blob: Blob, filename: string) {
const fd = new FormData();
fd.append('file', new File([blob], filename, { type: blob.type }));
fd.append('audioMode', audioMode);
await submit(fd);
}
function onDrop(e: DragEvent) {
e.preventDefault();
dragOver = false;
const file = e.dataTransfer?.files[0];
if (file) handleFile(file);
}
function jobKind(job: Job): 'youtube' | 'audio' | 'video' | 'file' {
const s = job.source ?? '';
if (s.includes('youtube') || s.includes('youtu.be')) return 'youtube';
if (/\.(mp3|m4a|wav|ogg|flac|aac)$/i.test(s)) return 'audio';
if (/\.(mp4|mov|mkv|webm|avi)$/i.test(s)) return 'video';
return 'file';
}
function jobMeta(job: Job): string {
const parts: string[] = [];
if (job.source && !job.source.startsWith('http')) parts.push(job.source.split('/').pop() ?? '');
if (job.audioMode) parts.push(job.audioMode);
parts.push(job.status);
return parts.join(' · ');
}
// Decorative waveform bars for the drop zone (80 bars)
const DROPZONE_BARS = 80;
</script>
<svelte:head>
<title>Tonemark</title>
</svelte:head>
<div class="page">
<!-- ── Header ─────────────────────────────────────────── -->
<header class="page-header">
<div>
<div class="label">New transcription</div>
<h1 class="page-title">What would you like to transcribe?</h1>
</div>
</header>
<!-- ── Input row: drop + url ──────────────────────────── -->
<div class="input-row">
<!-- 01 · Drop zone -->
<div
class="dropzone glass"
class:drag-over={dragOver}
ondragover={(e) => { e.preventDefault(); dragOver = true; }}
ondragleave={() => (dragOver = false)}
ondrop={onDrop}
role="button"
tabindex="0"
aria-label="Drop audio or video file"
onkeydown={(e) => e.key === 'Enter' && fileInput?.click()}
onclick={() => fileInput?.click()}
>
<div class="dropzone-text">
<div class="label" style="color: {ACCENT}; margin-bottom: 14px;">01 · Drop a file</div>
<div class="dropzone-headline">
Drop audio or video here, or <span class="browse-link" style="color: {ACCENT}; border-bottom: 1px dashed {ACCENT};">browse</span>
</div>
<div class="dropzone-hint mono">
.mp3 .m4a .wav .mp4 .mov .webm — up to 4 GB
</div>
</div>
<!-- Decorative waveform -->
<div class="dropzone-wave">
<Waveform bars={DROPZONE_BARS} progress={0} accent={ACCENT} height={38} />
</div>
<input
bind:this={fileInput}
type="file"
accept="video/*,audio/*"
class="sr-only"
onchange={(e) => {
const f = (e.target as HTMLInputElement).files?.[0];
if (f) handleFile(f);
}}
/>
</div>
<!-- 02 · URL input -->
<div class="url-card glass">
<div class="label" style="margin-bottom: 14px;">02 · Or paste a URL</div>
<div class="url-input-row">
<svg width="16" height="16" viewBox="0 0 16 16" fill="none" style="flex-shrink:0; color: rgba(232,233,236,0.5)">
<rect x="2" y="4" width="12" height="8" rx="1.5" stroke="currentColor" stroke-width="1.3"/>
<path d="M7 6.5L10 8L7 9.5V6.5Z" fill="currentColor"/>
</svg>
<input
type="url"
bind:value={url}
placeholder="youtube.com/watch?v=…"
class="url-input mono"
onkeydown={(e) => e.key === 'Enter' && submitUrl()}
aria-label="Video or YouTube URL"
/>
</div>
<div class="url-footer">
<span class="mono" style="font-size:12px; color: var(--text-muted);">
YouTube · Vimeo · Loom · direct .mp4
</span>
<button
class="btn-fetch"
style="background: {ACCENT}; color: #0c0d10;"
onclick={submitUrl}
disabled={submitting || !url.trim()}
>
{submitting ? 'Starting…' : 'Fetch'}
{#if !submitting}
<svg width="12" height="12" viewBox="0 0 12 12" fill="none">
<path d="M2 6h7M6 3l3 3-3 3" stroke="#0c0d10" stroke-width="1.6" stroke-linecap="round" stroke-linejoin="round"/>
</svg>
{/if}
</button>
</div>
</div>
</div>
<!-- 03 · Record audio -->
<div class="record-card glass">
<div class="label" style="margin-bottom: 4px;">03 · Record audio</div>
<div class="record-sub">Tap record, speak, and we'll transcribe on stop.</div>
<RecordButton
accent={ACCENT}
{audioMode}
ondone={handleRecording}
/>
</div>
{#if error}
<div class="error-banner" role="alert">{error}</div>
{/if}
<!-- ── 04 · Audio processing modes ────────────────────── -->
<section class="modes-section glass">
<div class="modes-header">
<div>
<div class="label" style="margin-bottom: 4px;">04 · Audio processing</div>
<div style="font-size: 14px; color: var(--text-muted);">
How aggressive should noise reduction &amp; normalisation be?
</div>
</div>
<span class="auto-badge" style="background: color-mix(in oklab, {ACCENT} 12%, transparent); color: {ACCENT};">
Auto recommended
</span>
</div>
<div class="modes-grid">
{#each modes as m}
{@const on = audioMode === m.value}
<button
class="mode-card"
class:active={on}
style={on
? `background: color-mix(in oklab, ${ACCENT} 12%, transparent); border-color: color-mix(in oklab, ${ACCENT} 50%, transparent);`
: ''}
onclick={() => (audioMode = m.value)}
aria-pressed={on}
>
<div class="mode-wave">
<Waveform bars={17} height={14} pattern={m.pattern} accent={ACCENT} progress={on ? 100 : 0} />
</div>
<div class="mode-label" class:active-label={on}>{m.label}</div>
<div class="mode-sub mono">{m.sub}</div>
</button>
{/each}
</div>
</section>
<!-- ── Recent jobs ─────────────────────────────────────── -->
{#if jobs.length > 0}
<section>
<div class="recents-header">
<h2 class="recents-title">Recent transcriptions</h2>
<span class="mono" style="font-size: 12px; color: var(--text-muted);">{jobs.length} total</span>
</div>
<div class="glass recents-list">
{#each jobs as job, i}
<a
href="/jobs/{job.id}"
class="recent-item"
class:first={i === 0}
>
<SourceIcon kind={jobKind(job)} size={36} accent={ACCENT} />
<div class="recent-text">
<div class="recent-title">{job.title || job.source}</div>
<div class="recent-meta mono">{jobMeta(job)}</div>
</div>
{#if job.status !== 'done' && job.status !== 'failed' && job.status !== 'cancelled'}
<div class="recent-progress mono" style="color: {ACCENT}">{job.progress}%</div>
{/if}
<svg width="14" height="14" viewBox="0 0 14 14" style="color: var(--text-dim); flex-shrink:0">
<path d="M5 3l4 4-4 4" stroke="currentColor" stroke-width="1.4" fill="none" stroke-linecap="round" stroke-linejoin="round"/>
</svg>
</a>
{/each}
</div>
</section>
{/if}
</div>
<style>
.page {
padding: 32px 40px;
display: flex;
flex-direction: column;
gap: 22px;
max-width: 1000px;
}
.page-header {
display: flex;
justify-content: space-between;
align-items: flex-start;
margin-bottom: 6px;
}
.page-title {
margin: 6px 0 0;
font-size: 32px;
font-weight: 600;
letter-spacing: -0.02em;
}
/* ── Input row ─────────────────────────────────────────── */
.input-row {
display: grid;
grid-template-columns: 1.4fr 1fr;
gap: 20px;
}
/* ── Drop zone ─────────────────────────────────────────── */
.dropzone {
padding: 24px;
cursor: pointer;
display: flex;
flex-direction: column;
justify-content: space-between;
min-height: 200px;
background: linear-gradient(
180deg,
rgba(205, 242, 78, 0.04),
rgba(255, 255, 255, 0.015)
) !important;
border: 1px dashed rgba(205, 242, 78, 0.3) !important;
transition: border-color 0.2s, background 0.2s;
}
.dropzone.drag-over {
border-color: rgba(205, 242, 78, 0.7) !important;
background: rgba(205, 242, 78, 0.07) !important;
}
.dropzone-headline {
font-size: 20px;
font-weight: 500;
letter-spacing: -0.02em;
line-height: 1.3;
margin-bottom: 8px;
}
.browse-link {
padding-bottom: 1px;
}
.dropzone-hint {
font-size: 12px;
color: var(--text-muted);
}
.dropzone-wave {
margin-top: 16px;
}
/* ── URL card ──────────────────────────────────────────── */
.url-card {
padding: 22px;
display: flex;
flex-direction: column;
gap: 12px;
}
.url-input-row {
display: flex;
align-items: center;
gap: 10px;
padding: 12px 14px;
border-radius: 10px;
background: rgba(0, 0, 0, 0.25);
border: 1px solid var(--border);
}
.url-input {
flex: 1;
background: none;
border: none;
outline: none;
font-size: 13px;
color: var(--text);
min-width: 0;
}
.url-input::placeholder {
color: var(--text-muted);
}
.url-footer {
display: flex;
align-items: center;
justify-content: space-between;
gap: 10px;
}
.btn-fetch {
padding: 9px 16px;
border-radius: 10px;
border: none;
font-family: inherit;
font-size: 13px;
font-weight: 600;
cursor: pointer;
display: flex;
align-items: center;
gap: 6px;
white-space: nowrap;
transition: filter 0.15s;
}
.btn-fetch:hover:not(:disabled) {
filter: brightness(1.1);
}
.btn-fetch:disabled {
opacity: 0.5;
cursor: not-allowed;
}
/* ── Record card ───────────────────────────────────────── */
.record-card {
padding: 22px;
display: flex;
flex-direction: column;
gap: 14px;
}
.record-sub {
font-size: 14px;
color: var(--text-muted);
margin-top: -10px;
}
/* ── Error ─────────────────────────────────────────────── */
.error-banner {
padding: 12px 16px;
border-radius: 10px;
background: rgba(255, 90, 90, 0.08);
border: 1px solid rgba(255, 90, 90, 0.2);
color: #ff8a8a;
font-size: 13px;
}
/* ── Audio modes ───────────────────────────────────────── */
.modes-section {
padding: 22px;
}
.modes-header {
display: flex;
justify-content: space-between;
align-items: flex-start;
margin-bottom: 16px;
}
.auto-badge {
padding: 4px 10px;
border-radius: 6px;
font-size: 11px;
font-family: var(--font-mono);
white-space: nowrap;
}
.modes-grid {
display: grid;
grid-template-columns: repeat(4, 1fr);
gap: 10px;
}
.mode-card {
padding: 14px 14px 12px;
border-radius: 12px;
background: rgba(255, 255, 255, 0.025);
border: 1px solid var(--border);
cursor: pointer;
text-align: left;
transition: background 0.15s, border-color 0.15s;
}
.mode-card:hover {
background: rgba(255, 255, 255, 0.04);
}
.mode-wave {
margin-bottom: 8px;
}
.mode-label {
font-size: 13px;
font-weight: 600;
color: rgba(232, 233, 236, 0.85);
margin-bottom: 2px;
}
.mode-label.active-label {
color: #fff;
}
.mode-sub {
font-size: 11px;
color: var(--text-muted);
}
/* ── Recents ───────────────────────────────────────────── */
.recents-header {
display: flex;
justify-content: space-between;
align-items: baseline;
margin-bottom: 12px;
}
.recents-title {
margin: 0;
font-size: 16px;
font-weight: 600;
letter-spacing: -0.01em;
}
.recents-list {
padding: 0;
}
.recent-item {
display: flex;
align-items: center;
gap: 16px;
padding: 14px 18px;
border-top: 1px solid var(--border);
text-decoration: none;
color: var(--text);
transition: background 0.12s;
}
.recent-item.first {
border-top: none;
}
.recent-item:hover {
background: rgba(255, 255, 255, 0.025);
}
.recent-text {
flex: 1;
min-width: 0;
}
.recent-title {
font-size: 14px;
font-weight: 500;
white-space: nowrap;
overflow: hidden;
text-overflow: ellipsis;
margin-bottom: 3px;
}
.recent-meta {
font-size: 11.5px;
color: var(--text-muted);
}
.recent-progress {
font-size: 11.5px;
min-width: 36px;
text-align: right;
}
/* ── Utilities ─────────────────────────────────────────── */
.sr-only {
position: absolute;
width: 1px;
height: 1px;
padding: 0;
overflow: hidden;
clip: rect(0, 0, 0, 0);
white-space: nowrap;
border: 0;
}
/* ── Responsive ────────────────────────────────────────── */
@media (max-width: 768px) {
.page {
padding: 20px 16px;
}
.page-title {
font-size: 24px;
}
.input-row {
grid-template-columns: 1fr;
}
.modes-grid {
grid-template-columns: repeat(2, 1fr);
}
}
</style>
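
Note on `jobKind`: the same source classifier is repeated verbatim in the home page, the queue page, and the job detail page. A minimal sketch of a shared helper follows. The name `jobKindFromSource` and the idea of placing it in a common module under `$lib` are assumptions for illustration; the commit itself keeps three copies.

```typescript
// Hypothetical shared helper (not part of this commit).
type JobKind = 'youtube' | 'audio' | 'video' | 'file';

// Extension lists copied from the three jobKind() copies in the commit.
const AUDIO_EXT = /\.(mp3|m4a|wav|ogg|flac|aac)$/i;
const VIDEO_EXT = /\.(mp4|mov|mkv|webm|avi)$/i;

// Takes the raw source string instead of a whole Job so it has no type dependencies.
function jobKindFromSource(source: string | null | undefined): JobKind {
  const s = source ?? '';
  if (s.includes('youtube') || s.includes('youtu.be')) return 'youtube';
  if (AUDIO_EXT.test(s)) return 'audio';
  if (VIDEO_EXT.test(s)) return 'video';
  return 'file';
}

const kinds = [
  jobKindFromSource('https://youtu.be/abc'),
  jobKindFromSource('talk.MP3'),
  jobKindFromSource('clip.webm'),
  jobKindFromSource(undefined)
];
```

Each page could then call `jobKindFromSource(job.source)` and the extension lists would live in one place.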


@@ -0,0 +1,45 @@
import { json, error } from '@sveltejs/kit';
import { listJobs } from '$lib/server/db.js';
import { startYouTubeJob, startUploadJob } from '$lib/server/pipeline.js';
import type { AudioMode } from '$lib/types.js';
export async function GET() {
return json(listJobs());
}
export async function POST({ request }) {
const contentType = request.headers.get('content-type') ?? '';
let url: string | null = null;
let audioMode: AudioMode = 'auto';
let language: string | undefined;
let fileBuffer: Buffer | null = null;
let filename = 'upload';
if (contentType.includes('application/json')) {
const body = await request.json();
url = body.url ?? null;
audioMode = body.audioMode ?? 'auto';
language = body.language;
} else if (contentType.includes('multipart/form-data')) {
const form = await request.formData();
url = form.get('url')?.toString() ?? null;
audioMode = (form.get('audioMode')?.toString() as AudioMode) ?? 'auto';
language = form.get('language')?.toString();
const file = form.get('file');
if (file instanceof File) {
fileBuffer = Buffer.from(await file.arrayBuffer());
filename = file.name;
}
} else {
throw error(415, 'Unsupported content type');
}
if (!url && !fileBuffer) throw error(400, 'Provide url or file');
const jobId = url
? await startYouTubeJob(url, audioMode, language)
: await startUploadJob(fileBuffer!, filename, audioMode, language);
return json({ id: jobId }, { status: 201 });
}
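
Note on `audioMode` in the `POST` handler: the value from the request body is cast to `AudioMode` without being checked, so any string reaches the pipeline. A minimal sketch of a runtime guard follows. `parseAudioMode` is a hypothetical helper, and the four values are taken from the `modes` array on the home page on the assumption that they match the `AudioMode` type in `$lib/types.js`.

```typescript
// Assumed to mirror AudioMode in $lib/types.js (values taken from the home page).
const AUDIO_MODES = ['none', 'standard', 'aggressive', 'auto'] as const;
type AudioMode = (typeof AUDIO_MODES)[number];

// Hypothetical helper: returns the value if it is a known mode, otherwise the fallback.
function parseAudioMode(raw: unknown, fallback: AudioMode = 'auto'): AudioMode {
  return typeof raw === 'string' && (AUDIO_MODES as readonly string[]).includes(raw)
    ? (raw as AudioMode)
    : fallback;
}

const fromForm = parseAudioMode('aggressive'); // known value passes through
const fromJunk = parseAudioMode('loudest'); // unknown value falls back to 'auto'
const fromMissing = parseAudioMode(undefined); // missing field falls back to 'auto'
```

Whether an unknown value should fall back silently or be rejected with a 400 is a design choice this sketch does not settle.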


@@ -0,0 +1,18 @@
import { json, error } from '@sveltejs/kit';
import { getJob, setJobStatus } from '$lib/server/db.js';
export async function GET({ params }) {
const job = getJob(params.id);
if (!job) throw error(404, 'Job not found');
return json(job);
}
export async function DELETE({ params }) {
const job = getJob(params.id);
if (!job) throw error(404, 'Job not found');
if (job.status === 'done' || job.status === 'failed') {
throw error(409, 'Job already completed');
}
setJobStatus(params.id, 'cancelled', 0);
return new Response(null, { status: 204 });
}


@@ -0,0 +1,37 @@
import { error } from '@sveltejs/kit';
import { getJob } from '$lib/server/db.js';
import { existsSync } from 'fs';
import { readFile } from 'fs/promises';
import { join } from 'path';
const MIME: Record<string, string> = {
srt: 'text/plain',
txt: 'text/plain',
md: 'text/markdown',
json: 'application/json'
};
export async function GET({ params }) {
const { id, format } = params;
if (!MIME[format]) throw error(400, `Unknown format: ${format}`);
const job = getJob(id);
if (!job) throw error(404, 'Job not found');
if (job.status !== 'done') throw error(409, 'Transcript not ready yet');
if (!job.outputDir) throw error(500, 'Output directory not set');
const safeTitle =
(job.title ?? id).replace(/[^\w\s-]/g, '').replace(/\s+/g, '_').slice(0, 80) || id;
const filePath = join(job.outputDir, `${safeTitle}.${format}`);
if (!existsSync(filePath)) throw error(404, `${format} file not found`);
const content = await readFile(filePath);
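// Copy into a plain Uint8Array: `content.buffer` can be a larger backing ArrayBuffer than the file itself.
return new Response(new Uint8Array(content), {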
headers: {
'Content-Type': MIME[format],
'Content-Disposition': `attachment; filename="${safeTitle}.${format}"`
}
});
}
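
Note on `safeTitle` in the download route: the file name is derived from the job title by dropping everything except word characters, whitespace, and hyphens, collapsing whitespace to underscores, and truncating to 80 characters. The sketch below wraps that same expression in a hypothetical function, `safeFileStem`, purely to show what it produces for a few inputs.

```typescript
// Same expression as in the route, wrapped for illustration (the name is an assumption).
function safeFileStem(title: string | null | undefined, id: string): string {
  return (title ?? id).replace(/[^\w\s-]/g, '').replace(/\s+/g, '_').slice(0, 80) || id;
}

const plain = safeFileStem('My Talk: Part 1/2', 'job-1'); // punctuation and slash removed
const traversal = safeFileStem('../../etc/passwd', 'job-1'); // dots and slashes removed
const empty = safeFileStem('???', 'job-1'); // nothing survives, so the id is used
```

Because `\w` only matches ASCII letters, digits, and underscore, titles written in other scripts lose those characters and may collapse to the job id.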


@@ -0,0 +1,34 @@
import { json, error } from '@sveltejs/kit';
import { getJob, updateJob } from '$lib/server/db.js';
import { deduplicateSegments } from '$lib/server/postprocess.js';
import { writeOutputs } from '$lib/server/formatter.js';
import type { Segment } from '$lib/types.js';
/** POST /api/jobs/[id]/reprocess — re-run post-processing and regenerate all output files. */
export async function POST({ params }) {
const job = getJob(params.id);
if (!job) throw error(404, 'Job not found');
if (!job.segmentsJson) {
throw error(422, 'No segments stored for this job — cannot reprocess');
}
try {
const rawSegments = JSON.parse(job.segmentsJson) as Segment[];
const segments = deduplicateSegments(rawSegments);
const paths = await writeOutputs(segments, job.title, job.id);
const outputDir = paths.srt.replace(/\/[^/]+$/, '');
updateJob({
id: job.id,
segmentsJson: JSON.stringify(segments),
outputDir
});
return json({ ok: true, paths });
} catch (err: unknown) {
const message = err instanceof Error ? err.message : String(err);
throw error(500, `Reprocess failed: ${message}`);
}
}


@@ -0,0 +1,51 @@
import { error } from '@sveltejs/kit';
import { getJob } from '$lib/server/db.js';
import { subscribeProgress } from '$lib/server/pipeline.js';
export async function GET({ params, request }) {
const job = getJob(params.id);
if (!job) throw error(404, 'Job not found');
const stream = new ReadableStream({
start(controller) {
const enc = new TextEncoder();
function send(data: string) {
controller.enqueue(enc.encode(`data: ${data}\n\n`));
}
// Send current status immediately
send(JSON.stringify({ type: 'status', status: job.status, progress: job.progress }));
if (job.status === 'done' || job.status === 'failed' || job.status === 'cancelled') {
send(JSON.stringify({ type: 'done', status: job.status }));
controller.close();
return;
}
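// Guard so the stream is torn down exactly once, whether the job finishes or the client disconnects.
let closed = false;
const close = () => {
if (closed) return;
closed = true;
unsub();
controller.close();
};
const unsub = subscribeProgress(params.id, (data) => {
if (closed) return;
send(data);
try {
const parsed = JSON.parse(data);
if (parsed.type === 'done' || parsed.type === 'error') close();
} catch { /* ignore malformed payloads */ }
});
request.signal.addEventListener('abort', close);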
}
});
return new Response(stream, {
headers: {
'Content-Type': 'text/event-stream',
'Cache-Control': 'no-cache',
Connection: 'keep-alive'
}
});
}
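
Note on the stream payloads: each event is sent as one `data:` line holding a JSON object with a `type` field, followed by a blank line. The sketch below models those payloads as a union and shows how a client might fold them into job state. The `status`, `done`, and `error` shapes come from the routes in this commit; the `progress` shape is inferred from the job detail page, since the emitter in `pipeline.js` is not shown here. All type and function names are assumptions.

```typescript
// Event shapes as they appear to be used in this commit (the progress shape is inferred).
type StreamEvent =
  | { type: 'status'; status: string; progress?: number }
  | { type: 'progress'; progress?: number; chunk?: number; total?: number }
  | { type: 'done'; status: string }
  | { type: 'error'; message: string };

interface JobView {
  status: string;
  progress: number;
  error?: string;
}

// Encode one event the way the route does: a single `data:` line plus a blank line.
function toSseFrame(event: StreamEvent): string {
  return `data: ${JSON.stringify(event)}\n\n`;
}

// Fold one event into the view state, similar to the onmessage handler on the job page.
// The real page reloads the job on 'done'; this sketch only copies the status.
function applyEvent(job: JobView, ev: StreamEvent): JobView {
  switch (ev.type) {
    case 'progress':
      return { ...job, status: 'transcribing', progress: ev.progress ?? job.progress };
    case 'status':
      return { ...job, status: ev.status, progress: ev.progress ?? job.progress };
    case 'done':
      return { ...job, status: ev.status };
    case 'error':
      return { ...job, status: 'failed', error: ev.message };
  }
}

const frame = toSseFrame({ type: 'status', status: 'transcribing', progress: 40 });
const afterProgress = applyEvent({ status: 'pending', progress: 0 }, { type: 'progress', progress: 55 });
const afterError = applyEvent(afterProgress, { type: 'error', message: 'boom' });
```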


@@ -0,0 +1,16 @@
import { json, error } from '@sveltejs/kit';
import { savePushSubscription } from '$lib/server/db.js';
import { getVapidPublicKey } from '$lib/server/push.js';
export async function GET() {
const key = getVapidPublicKey();
if (!key) throw error(503, 'Push notifications not configured (VAPID keys missing)');
return json({ publicKey: key });
}
export async function POST({ request }) {
const { endpoint, keys } = await request.json();
if (!endpoint || !keys?.p256dh || !keys?.auth) throw error(400, 'Invalid subscription');
savePushSubscription({ endpoint, p256dh: keys.p256dh, auth: keys.auth });
return new Response(null, { status: 201 });
}


@@ -0,0 +1,54 @@
import { json, error } from '@sveltejs/kit';
import { getJob, updateJob, setJobStatus } from '$lib/server/db.js';
import { deduplicateSegments } from '$lib/server/postprocess.js';
import { writeOutputs } from '$lib/server/formatter.js';
import { sendNotification } from '$lib/server/push.js';
import { cleanupJobTmp } from '$lib/server/downloader.js';
import { emitProgress } from '$lib/server/pipeline.js';
import type { Segment, WhisperJob } from '$lib/types.js';
export async function POST({ params, request }) {
const jobId = params.jobId;
const job = getJob(jobId);
if (!job) throw error(404, 'Job not found');
const whisperJob = (await request.json()) as WhisperJob;
if (whisperJob.status === 'failed' || whisperJob.status === 'cancelled') {
const msg = whisperJob.error ?? `Whisper job ${whisperJob.status}`;
updateJob({ id: jobId, status: 'failed', error: msg });
emitProgress(jobId, { type: 'error', message: msg });
return json({ ok: true });
}
try {
setJobStatus(jobId, 'processing', 90);
emitProgress(jobId, { type: 'status', status: 'processing', progress: 90 });
const rawSegments = whisperJob.segments as Segment[];
const segments = deduplicateSegments(rawSegments);
const paths = await writeOutputs(segments, job.title, jobId);
const outputDir = paths.srt.replace(/\/[^/]+$/, '');
updateJob({
id: jobId,
status: 'done',
progress: 100,
segmentsJson: JSON.stringify(segments),
outputDir
});
emitProgress(jobId, { type: 'done', status: 'done' });
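// Notification and temp-file cleanup are best-effort: a failure here must not mark a finished job as failed.
try {
await sendNotification(jobId, '✅ Transcript ready', job.title);
await cleanupJobTmp(jobId);
} catch (postErr: unknown) {
console.warn('[webhook] post-completion step failed:', postErr);
}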
return json({ ok: true });
} catch (err: unknown) {
const message = err instanceof Error ? err.message : String(err);
updateJob({ id: jobId, status: 'failed', error: message });
emitProgress(jobId, { type: 'error', message });
return json({ ok: false, error: message }, { status: 500 });
}
}


@@ -0,0 +1,209 @@
<script lang="ts">
import { onMount } from 'svelte';
import type { Job } from '$lib/types.js';
import SourceIcon from '$lib/components/SourceIcon.svelte';
import Waveform from '$lib/components/Waveform.svelte';
const ACCENT = '#cdf24e';
let jobs = $state<Job[]>([]);
let loading = $state(true);
const statusColor: Record<string, string> = {
done: '#cdf24e',
failed: '#ff6b6b',
cancelled: 'rgba(232,233,236,0.3)',
transcribing: '#80c7f7',
preparing: '#fbc94b',
downloading: '#a78bfa',
pending: 'rgba(232,233,236,0.4)'
};
const statusLabel: Record<string, string> = {
pending: 'Pending',
downloading: 'Downloading',
preparing: 'Preparing',
transcribing: 'Transcribing',
processing: 'Processing',
done: 'Done',
failed: 'Failed',
cancelled: 'Cancelled'
};
function jobKind(job: Job): 'youtube' | 'audio' | 'video' | 'file' {
const s = job.source ?? '';
if (s.includes('youtube') || s.includes('youtu.be')) return 'youtube';
if (/\.(mp3|m4a|wav|ogg|flac|aac)$/i.test(s)) return 'audio';
if (/\.(mp4|mov|mkv|webm|avi)$/i.test(s)) return 'video';
return 'file';
}
function relativeTime(iso: string): string {
const diff = Date.now() - new Date(iso).getTime();
if (diff < 60_000) return 'just now';
if (diff < 3_600_000) return `${Math.floor(diff / 60_000)}m ago`;
if (diff < 86_400_000) return `${Math.floor(diff / 3_600_000)}h ago`;
return `${Math.floor(diff / 86_400_000)}d ago`;
}
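onMount(async () => {
try {
const res = await fetch('/api/jobs');
if (res.ok) jobs = await res.json();
} catch {
/* network error: fall through to the empty state */
} finally {
// Always clear the spinner, even when the request fails.
loading = false;
}
});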
</script>
<svelte:head>
<title>Queue — Tonemark</title>
</svelte:head>
<div class="page">
<h1 class="page-title">Queue</h1>
{#if loading}
<div class="loading">Loading…</div>
{:else if jobs.length === 0}
<div class="glass empty">
<svg width="32" height="32" viewBox="0 0 32 32" fill="none" style="opacity:.4">
<rect x="6" y="14" width="3" height="12" rx="1.5" fill="currentColor"/>
<rect x="11" y="9" width="3" height="17" rx="1.5" fill="currentColor"/>
<rect x="16" y="6" width="3" height="20" rx="1.5" fill="currentColor"/>
<rect x="21" y="11" width="3" height="15" rx="1.5" fill="currentColor"/>
<rect x="26" y="16" width="3" height="10" rx="1.5" fill="currentColor"/>
</svg>
<p>No transcription jobs yet.</p>
<a href="/" class="link">Start one →</a>
</div>
{:else}
<div class="job-list">
{#each jobs as job}
<a href="/jobs/{job.id}" class="glass job-row">
<SourceIcon kind={jobKind(job)} size={36} accent={ACCENT} />
<div class="job-info">
<div class="job-name">{job.title || job.source}</div>
<div class="job-meta mono">
<span style="color: {statusColor[job.status] ?? 'rgba(232,233,236,0.5)'}">
{statusLabel[job.status] ?? job.status}
</span>
{#if job.createdAt}
<span>·</span>
<span>{relativeTime(job.createdAt)}</span>
{/if}
{#if job.audioMode}
<span>·</span>
<span>{job.audioMode}</span>
{/if}
</div>
</div>
{#if !['done', 'failed', 'cancelled'].includes(job.status)}
<div class="job-wave">
<Waveform bars={40} progress={job.progress} accent={ACCENT} height={28} pattern="medium" />
</div>
{:else if job.status === 'done'}
<div class="job-pct mono" style="color: {ACCENT}">{job.progress}%</div>
{:else}
<div class="job-pct mono" style="color: {statusColor[job.status]}">{job.status}</div>
{/if}
</a>
{/each}
</div>
{/if}
</div>
<style>
.page {
padding: 32px 40px;
display: flex;
flex-direction: column;
gap: 20px;
max-width: 900px;
}
.page-title {
font-size: 28px;
font-weight: 600;
letter-spacing: -0.02em;
margin: 0;
}
.loading {
color: var(--text-muted);
font-size: 14px;
}
.empty {
padding: 48px;
display: flex;
flex-direction: column;
align-items: center;
gap: 12px;
text-align: center;
color: var(--text-muted);
font-size: 14px;
}
.link {
color: var(--accent);
text-decoration: none;
font-size: 13px;
}
.job-list {
display: flex;
flex-direction: column;
gap: 8px;
}
.job-row {
display: flex;
align-items: center;
gap: 14px;
padding: 16px 20px;
text-decoration: none;
color: var(--text);
transition: background 0.15s;
}
.job-row:hover {
background: rgba(255, 255, 255, 0.04);
}
.job-info {
flex: 1;
min-width: 0;
}
.job-name {
font-size: 14px;
font-weight: 500;
white-space: nowrap;
overflow: hidden;
text-overflow: ellipsis;
margin-bottom: 3px;
}
.job-meta {
font-size: 11.5px;
color: var(--text-muted);
display: flex;
gap: 6px;
}
.job-wave {
flex-shrink: 0;
}
.job-pct {
font-size: 12.5px;
font-weight: 600;
flex-shrink: 0;
}
@media (max-width: 768px) {
.page {
padding: 20px 16px;
}
}
</style>


@@ -0,0 +1,615 @@
<script lang="ts">
import { onMount, onDestroy } from 'svelte';
import { page } from '$app/stores';
import type { Job, Segment } from '$lib/types.js';
import SourceIcon from '$lib/components/SourceIcon.svelte';
import Waveform from '$lib/components/Waveform.svelte';
const ACCENT = '#cdf24e';
const jobId = $derived($page.params.id);
let job = $state<Job | null>(null);
let segments = $state<Segment[]>([]);
let error = $state('');
let chunkInfo = $state({ chunk: 0, total: 0 });
let eventSource: EventSource | null = null;
const statusLabel: Record<string, string> = {
pending: 'Pending',
downloading: 'Downloading…',
preparing: 'Preparing audio…',
transcribing: 'Transcribing…',
processing: 'Post-processing…',
done: 'Done',
failed: 'Failed',
cancelled: 'Cancelled'
};
// Pipeline stages derived from job status
const pipelineStages = $derived.by(() => {
const status = job?.status ?? 'pending';
const stages = [
{ k: 'fetch', label: 'Fetch source' },
{ k: 'extract', label: 'Extract audio track' },
{ k: 'process', label: `Audio processing · ${job?.audioMode ?? 'auto'}` },
{ k: 'transcribe', label: 'Transcribing' },
{ k: 'finalize', label: 'Format & save' }
];
const order = ['pending', 'downloading', 'preparing', 'transcribing', 'processing', 'done'];
const idx = order.indexOf(status);
return stages.map((s, i) => ({
...s,
done: i < idx - 1 || status === 'done',
active: i === idx - 1 && status !== 'done' && status !== 'failed',
pending: i > idx - 1 && status !== 'done'
}));
});
function jobKind(job: Job): 'youtube' | 'audio' | 'video' | 'file' {
const s = job.source ?? '';
if (s.includes('youtube') || s.includes('youtu.be')) return 'youtube';
if (/\.(mp3|m4a|wav|ogg|flac|aac)$/i.test(s)) return 'audio';
if (/\.(mp4|mov|mkv|webm|avi)$/i.test(s)) return 'video';
return 'file';
}
onMount(async () => {
await loadJob();
if (job && !['done', 'failed', 'cancelled'].includes(job.status)) {
openStream();
}
});
onDestroy(() => eventSource?.close());
async function loadJob() {
const res = await fetch(`/api/jobs/${jobId}`);
if (!res.ok) {
error = 'Job not found';
return;
}
job = await res.json();
if (job?.segmentsJson) {
try {
segments = JSON.parse(job.segmentsJson);
} catch { /* ignore */ }
}
}
function openStream() {
eventSource = new EventSource(`/api/jobs/${jobId}/stream`);
eventSource.onmessage = (e) => {
try {
const data = JSON.parse(e.data);
if (data.type === 'progress') {
chunkInfo = { chunk: data.chunk ?? 0, total: data.total ?? 0 };
if (job) job = { ...job, progress: data.progress ?? job.progress, status: 'transcribing' };
} else if (data.type === 'status') {
if (job) job = { ...job, status: data.status, progress: data.progress ?? job.progress };
} else if (data.type === 'done') {
eventSource?.close();
loadJob();
} else if (data.type === 'error') {
if (job) job = { ...job, status: 'failed', error: data.message };
eventSource?.close();
}
} catch { /* ignore */ }
};
}
function secToTimestamp(sec: number): string {
const h = Math.floor(sec / 3600);
const m = Math.floor((sec % 3600) / 60);
const s = Math.floor(sec % 60);
return h > 0
? `${String(h).padStart(2, '0')}:${String(m).padStart(2, '0')}:${String(s).padStart(2, '0')}`
: `${String(m).padStart(2, '0')}:${String(s).padStart(2, '0')}`;
}
const formats = ['srt', 'txt', 'md', 'json'] as const;
const isActive = $derived(!job || !['done', 'failed', 'cancelled'].includes(job.status));
</script>
<svelte:head>
<title>{job?.title ?? 'Job'} — Tonemark</title>
</svelte:head>
<div class="page">
{#if error}
<div class="error-banner" role="alert">{error}</div>
{:else if !job}
<div class="loading" aria-busy="true">
<svg width="20" height="20" viewBox="0 0 20 20" style="animation: spin 1s linear infinite">
<circle cx="10" cy="10" r="8" stroke="var(--text-muted)" stroke-width="2" fill="none" stroke-dasharray="30 14"/>
</svg>
Loading…
</div>
{:else}
<!-- ── Breadcrumb ─────────────────────────────────────── -->
<div class="breadcrumb mono">
<a href="/" class="crumb-link">Home</a>
<span></span>
<span style="color: #fff">{job.id.slice(0, 8)}</span>
</div>
<!-- ── Job header ────────────────────────────────────── -->
<div class="job-header">
<SourceIcon kind={jobKind(job)} size={52} accent={ACCENT} />
<div class="job-header-text">
<h1 class="job-title">{job.title || job.source}</h1>
<div class="job-meta mono">
{job.source ?? ''}
{#if job.audioMode}· {job.audioMode}{/if}
{#if job.meanVolume != null}· {job.meanVolume.toFixed(1)} dBFS{/if}
</div>
</div>
{#if isActive}
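<button
type="button"
class="btn-cancel"
aria-label="Cancel job"
onclick={async () => {
// Assumes the jobs API accepts DELETE to cancel, as the previous form action implied.
await fetch(`/api/jobs/${jobId}`, { method: 'DELETE' });
await loadJob();
}}
>Cancel</button>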
{/if}
</div>
<!-- ── Progress block ────────────────────────────────── -->
{#if isActive || job.status === 'done'}
<div class="progress-card glass">
<!-- Waveform coloured by progress -->
<div class="progress-wave">
<Waveform
bars={140}
progress={job.progress}
accent={ACCENT}
height={80}
pattern="default"
/>
</div>
<div class="progress-footer">
<div class="progress-left">
<span class="progress-pct mono">
{job.progress}<span style="color: var(--text-dim); font-weight: 400">%</span>
</span>
<span class="progress-status">{statusLabel[job.status] ?? job.status}</span>
</div>
{#if chunkInfo.total > 0}
<span class="progress-chunks mono">
chunk {chunkInfo.chunk} / {chunkInfo.total}
</span>
{/if}
</div>
<!-- Progress bar -->
<div class="progress-bar-track">
<div
class="progress-bar-fill"
style="width: {job.progress}%; background: {ACCENT}; box-shadow: 0 0 12px {ACCENT}80;"
></div>
</div>
</div>
{/if}
<!-- ── Error ─────────────────────────────────────────── -->
{#if job.error}
<div class="error-banner" role="alert">{job.error}</div>
{/if}
<!-- ── Two-column: pipeline + downloads/transcript ───── -->
<div class="two-col">
<!-- Pipeline stages -->
<div class="glass stage-card">
<div class="label" style="margin-bottom: 16px;">Pipeline</div>
<div class="stages">
{#each pipelineStages as stage}
<div class="stage-row">
<div
class="stage-dot"
style={stage.done
? `background: ${ACCENT};`
: stage.active
? `background: transparent; border: 2px solid ${ACCENT};`
: 'background: rgba(255,255,255,0.05);'}
>
{#if stage.done}
<svg width="10" height="10" viewBox="0 0 10 10" fill="none">
<path d="M2 5l2 2 4-4" stroke="#0c0d10" stroke-width="1.6" stroke-linecap="round" stroke-linejoin="round"/>
</svg>
{:else if stage.active}
<div class="stage-dot-inner" style="background: {ACCENT}"></div>
{/if}
</div>
<span
class="stage-label"
style={stage.pending ? 'color: var(--text-dim)' : stage.active ? 'color: #fff; font-weight: 500' : ''}
>
{@html stage.label}
</span>
{#if stage.active}
<span class="mono" style="font-size: 11.5px; color: {ACCENT}">{job.progress}%</span>
{/if}
</div>
{/each}
</div>
</div>
<!-- Downloads or live preview -->
<div class="glass side-card">
{#if job.status === 'done'}
<div class="label" style="margin-bottom: 16px;">Download transcript</div>
<div class="dl-grid">
{#each formats as fmt, i}
<a
href="/api/jobs/{job.id}/download/{fmt}"
download
class="dl-btn mono"
style={i === 0
? `background: color-mix(in oklab, ${ACCENT} 12%, transparent); color: ${ACCENT}; border-color: color-mix(in oklab, ${ACCENT} 30%, transparent);`
: ''}
>
<svg width="11" height="11" viewBox="0 0 11 11" fill="none">
<path d="M5.5 1v7M2 5l3.5 3.5L9 5M1.5 9.5h8" stroke="currentColor" stroke-width="1.4" stroke-linecap="round" stroke-linejoin="round"/>
</svg>
{fmt.toUpperCase()}
</a>
{/each}
</div>
{#if job.outputDir}
<div class="output-dir mono">{job.outputDir}</div>
{/if}
{:else if isActive}
<div class="live-header">
<div class="label">Live preview</div>
<div class="streaming-badge" style="color: {ACCENT}">
<div class="stream-dot" style="background: {ACCENT}; animation: pulse 1.4s infinite"></div>
Streaming
</div>
</div>
{#if segments.length > 0}
{@const last = segments[segments.length - 1]}
<div class="live-text">
<span class="mono" style="color: var(--text-dim); margin-right: 8px;">
{secToTimestamp(last.start)}
</span>
{last.text}<span style="color: {ACCENT}; animation: blink 1s infinite; margin-left: 3px;"></span>
</div>
{:else}
<div style="font-size: 13px; color: var(--text-muted); font-style: italic;">
Waiting for segments…
</div>
{/if}
{/if}
</div>
</div>
<!-- ── Transcript viewer ──────────────────────────────── -->
{#if segments.length > 0}
<section class="glass transcript-card">
<div class="transcript-header">
<div class="label">Transcript</div>
<span class="mono" style="font-size: 12px; color: var(--text-muted);">
{segments.length} segments
</span>
</div>
<div class="transcript-body">
{#each segments as seg}
<div class="seg-row">
<span class="seg-ts mono">{secToTimestamp(seg.start)}</span>
<p class="seg-text">{seg.text}</p>
</div>
{/each}
</div>
</section>
{/if}
{/if}
</div>
<style>
.page {
padding: 32px 40px;
display: flex;
flex-direction: column;
gap: 20px;
max-width: 1000px;
}
/* ── Loading / errors ──────────────────────────────────── */
.loading {
display: flex;
align-items: center;
gap: 10px;
color: var(--text-muted);
font-size: 14px;
}
.error-banner {
padding: 12px 16px;
border-radius: 10px;
background: rgba(255, 90, 90, 0.08);
border: 1px solid rgba(255, 90, 90, 0.2);
color: #ff8a8a;
font-size: 13px;
}
/* ── Breadcrumb ─────────────────────────────────────────── */
.breadcrumb {
display: flex;
align-items: center;
gap: 8px;
font-size: 12px;
color: var(--text-muted);
}
.crumb-link {
color: var(--text-muted);
text-decoration: none;
}
.crumb-link:hover {
color: var(--text);
}
/* ── Job header ─────────────────────────────────────────── */
.job-header {
display: flex;
align-items: center;
gap: 18px;
}
.job-header-text {
flex: 1;
min-width: 0;
}
.job-title {
margin: 0 0 4px;
font-size: 26px;
font-weight: 600;
letter-spacing: -0.02em;
white-space: nowrap;
overflow: hidden;
text-overflow: ellipsis;
}
.job-meta {
font-size: 12.5px;
color: var(--text-muted);
white-space: nowrap;
overflow: hidden;
text-overflow: ellipsis;
}
.btn-cancel {
padding: 8px 14px;
border-radius: 8px;
border: 1px solid rgba(255, 90, 90, 0.3);
background: rgba(255, 90, 90, 0.08);
color: #ff8a8a;
font-size: 12.5px;
font-family: inherit;
cursor: pointer;
white-space: nowrap;
}
.btn-cancel:hover {
background: rgba(255, 90, 90, 0.15);
}
/* ── Progress card ──────────────────────────────────────── */
.progress-card {
padding: 28px;
}
.progress-wave {
margin-bottom: 20px;
}
.progress-footer {
display: flex;
justify-content: space-between;
align-items: baseline;
margin-bottom: 8px;
}
.progress-left {
display: flex;
align-items: baseline;
gap: 12px;
}
.progress-pct {
font-size: 36px;
font-weight: 600;
letter-spacing: -0.02em;
}
.progress-status {
font-size: 14px;
color: var(--text-muted);
}
.progress-chunks {
font-size: 12.5px;
color: var(--text-muted);
}
.progress-bar-track {
height: 4px;
border-radius: 2px;
background: rgba(255, 255, 255, 0.06);
overflow: hidden;
}
.progress-bar-fill {
height: 100%;
border-radius: 2px;
transition: width 0.5s ease;
}
/* ── Two column ─────────────────────────────────────────── */
.two-col {
display: grid;
grid-template-columns: 1fr 1fr;
gap: 20px;
}
.stage-card,
.side-card {
padding: 22px;
}
/* ── Pipeline stages ────────────────────────────────────── */
.stages {
display: flex;
flex-direction: column;
gap: 14px;
}
.stage-row {
display: flex;
align-items: center;
gap: 14px;
}
.stage-dot {
width: 18px;
height: 18px;
border-radius: 9px;
display: flex;
align-items: center;
justify-content: center;
flex-shrink: 0;
}
.stage-dot-inner {
width: 6px;
height: 6px;
border-radius: 3px;
}
.stage-label {
flex: 1;
font-size: 13.5px;
color: rgba(232, 233, 236, 0.85);
}
/* ── Live preview / downloads ───────────────────────────── */
.live-header {
display: flex;
justify-content: space-between;
align-items: center;
margin-bottom: 14px;
}
.streaming-badge {
display: flex;
align-items: center;
gap: 6px;
font-size: 11px;
font-family: var(--font-mono);
}
.stream-dot {
width: 6px;
height: 6px;
border-radius: 3px;
}
.live-text {
font-size: 13.5px;
line-height: 1.7;
color: rgba(232, 233, 236, 0.85);
}
.dl-grid {
display: grid;
grid-template-columns: repeat(2, 1fr);
gap: 8px;
}
.dl-btn {
padding: 11px;
border-radius: 9px;
border: 1px solid var(--border);
background: rgba(255, 255, 255, 0.025);
color: rgba(232, 233, 236, 0.85);
font-size: 11.5px;
font-weight: 600;
letter-spacing: 0.04em;
text-decoration: none;
display: flex;
align-items: center;
justify-content: center;
gap: 6px;
transition: background 0.15s;
}
.dl-btn:hover {
background: rgba(255, 255, 255, 0.04);
}
.output-dir {
margin-top: 14px;
padding-top: 14px;
border-top: 1px solid var(--border);
font-size: 11px;
color: var(--text-dim);
word-break: break-all;
}
/* ── Transcript ─────────────────────────────────────────── */
.transcript-card {
padding: 22px;
}
.transcript-header {
display: flex;
justify-content: space-between;
align-items: baseline;
margin-bottom: 16px;
}
.transcript-body {
max-height: 480px;
overflow-y: auto;
display: flex;
flex-direction: column;
gap: 16px;
padding-right: 4px;
}
.seg-row {
display: flex;
gap: 14px;
}
.seg-ts {
font-size: 11px;
color: var(--text-dim);
flex-shrink: 0;
margin-top: 3px;
width: 50px;
text-align: right;
}
.seg-text {
margin: 0;
font-size: 14px;
line-height: 1.65;
color: rgba(232, 233, 236, 0.85);
}
/* ── Responsive ─────────────────────────────────────────── */
@media (max-width: 768px) {
.page {
padding: 20px 16px;
}
.job-title {
font-size: 20px;
}
.two-col {
grid-template-columns: 1fr;
}
.dl-grid {
grid-template-columns: repeat(4, 1fr);
}
}
</style>
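The pipeline card above derives each row's state from a single job status. The following is a minimal standalone sketch of that derivation, mirroring the `$derived.by` block in the component; `deriveStages`, `STAGES`, and `STATUS_ORDER` are illustrative names, and the audio-mode suffix on the label is left out for brevity.

```typescript
type JobStatus =
  | 'pending' | 'downloading' | 'preparing' | 'transcribing'
  | 'processing' | 'done' | 'failed' | 'cancelled';

interface StageView {
  k: string;
  label: string;
  done: boolean;
  active: boolean;
  pending: boolean;
}

// Same ordering the component uses; 'failed' and 'cancelled' are deliberately absent.
const STATUS_ORDER: JobStatus[] = [
  'pending', 'downloading', 'preparing', 'transcribing', 'processing', 'done'
];

const STAGES = [
  { k: 'fetch', label: 'Fetch source' },
  { k: 'extract', label: 'Extract audio track' },
  { k: 'process', label: 'Audio processing' },
  { k: 'transcribe', label: 'Transcribing' },
  { k: 'finalize', label: 'Format & save' }
];

function deriveStages(status: JobStatus): StageView[] {
  // indexOf yields -1 for 'failed' / 'cancelled', which leaves every row pending.
  const idx = STATUS_ORDER.indexOf(status);
  return STAGES.map((s, i) => ({
    ...s,
    done: i < idx - 1 || status === 'done',
    active: i === idx - 1 && status !== 'done' && status !== 'failed',
    pending: i > idx - 1 && status !== 'done'
  }));
}
```

With this mapping, the `transcribing` status marks the audio-processing row as active rather than the `Transcribing` row, because four in-flight statuses are spread over five rows. Whether that offset is intended is worth confirming against the backend's status semantics.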

@@ -0,0 +1,100 @@
<script lang="ts">
import { accent, ACCENT_OPTIONS } from '$lib/accent.js';
</script>
<svelte:head>
<title>Settings — Tonemark</title>
</svelte:head>
<div class="page">
<h1 class="page-title">Settings</h1>
<div class="glass section">
<div class="label" style="margin-bottom: 20px;">Accent colour</div>
<div class="swatch-row">
{#each ACCENT_OPTIONS as opt}
<button
class="swatch"
class:active={$accent.value === opt.value}
style="--c: {opt.value}"
onclick={() => accent.set(opt)}
title={opt.label}
aria-label={opt.label}
aria-pressed={$accent.value === opt.value}
>
{#if $accent.value === opt.value}
<svg width="12" height="12" viewBox="0 0 12 12" fill="none">
<path d="M2 6l3 3 5-5" stroke="#000" stroke-width="1.8" stroke-linecap="round" stroke-linejoin="round"/>
</svg>
{/if}
</button>
<span class="swatch-label">{opt.label}</span>
{/each}
</div>
</div>
</div>
<style>
.page {
padding: 32px 40px;
display: flex;
flex-direction: column;
gap: 20px;
max-width: 720px;
}
.page-title {
font-size: 28px;
font-weight: 600;
letter-spacing: -0.02em;
margin: 0;
}
.section {
padding: 28px;
}
.swatch-row {
display: flex;
align-items: center;
gap: 10px;
flex-wrap: wrap;
}
.swatch {
width: 36px;
height: 36px;
border-radius: 50%;
background: var(--c);
border: 2px solid transparent;
cursor: pointer;
display: flex;
align-items: center;
justify-content: center;
transition: transform 0.15s, border-color 0.15s;
outline: none;
}
.swatch:focus-visible {
outline: 2px solid rgba(255, 255, 255, 0.7);
outline-offset: 2px;
}
.swatch:hover {
transform: scale(1.12);
}
.swatch.active {
border-color: rgba(255,255,255,0.5);
transform: scale(1.1);
}
.swatch-label {
font-size: 12px;
color: var(--text-muted);
margin-right: 6px;
font-family: var(--font-mono);
}
@media (max-width: 768px) {
.page {
padding: 20px 16px;
}
}
</style>

@@ -0,0 +1,35 @@
import { redirect, error } from '@sveltejs/kit';
import { startYouTubeJob, startUploadJob } from '$lib/server/pipeline.js';
import type { AudioMode } from '$lib/types.js';
// Web Share Target: GET with ?url=... (from text/url share)
export async function GET({ url }) {
const shareUrl = url.searchParams.get('url') ?? url.searchParams.get('text');
if (!shareUrl) redirect(302, '/');
const jobId = await startYouTubeJob(shareUrl, 'auto');
redirect(302, `/jobs/${jobId}`);
}
// Web Share Target: POST multipart (from file share or URL share via form)
export async function POST({ request }) {
const form = await request.formData();
const shareUrl =
form.get('url')?.toString() ?? form.get('text')?.toString();
const file = form.get('media') ?? form.get('file');
const audioMode: AudioMode = 'auto';
if (shareUrl) {
const jobId = await startYouTubeJob(shareUrl, audioMode);
redirect(302, `/jobs/${jobId}`);
}
if (file instanceof File) {
const buffer = Buffer.from(await file.arrayBuffer());
const jobId = await startUploadJob(buffer, file.name, audioMode);
redirect(302, `/jobs/${jobId}`);
}
error(400, 'No URL or file provided');
}
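The share-target handlers above pass the `url` or `text` value straight to `startYouTubeJob`. Share sources often place the link inside the `text` field, sometimes surrounded by other words, so a small extraction step may help. This is a hedged sketch only: `extractSharedUrl` is a hypothetical helper, not part of the commit, and the regex is a deliberately simple assumption that does not trim trailing punctuation.

```typescript
// Hypothetical helper: pick the first http(s) link out of the shared fields.
// Prefers the explicit `url` field and falls back to scanning `text`.
function extractSharedUrl(url: string | null, text: string | null): string | null {
  for (const candidate of [url, text]) {
    if (!candidate) continue;
    const match = candidate.match(/https?:\/\/\S+/i);
    if (match) return match[0];
  }
  return null;
}
```

A handler could call `extractSharedUrl(url.searchParams.get('url'), url.searchParams.get('text'))` and redirect home when it returns `null`.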

src/service-worker.ts Normal file

@@ -0,0 +1,95 @@
/// <reference no-default-lib="true"/>
/// <reference lib="esnext" />
/// <reference lib="webworker" />
/// <reference types="@sveltejs/kit" />
import { build, files, version } from '$service-worker';
declare const self: ServiceWorkerGlobalScope;
const CACHE = `tonemark-${version}`;
const ASSETS = [...build, ...files];
self.addEventListener('install', (event) => {
event.waitUntil(
caches.open(CACHE).then((cache) => cache.addAll(ASSETS))
);
self.skipWaiting();
});
self.addEventListener('activate', (event) => {
event.waitUntil(
caches.keys().then((keys) =>
Promise.all(keys.filter((k) => k !== CACHE).map((k) => caches.delete(k)))
)
);
self.clients.claim();
});
self.addEventListener('fetch', (event) => {
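if (event.request.method !== 'GET') return;
// Let EventSource (SSE) requests bypass the worker: cloning a never-ending
// stream into the cache would keep the cache write open indefinitely.
if (event.request.headers.get('accept')?.includes('text/event-stream')) return;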
event.respondWith(
(async () => {
const url = new URL(event.request.url);
const cache = await caches.open(CACHE);
// Serve cached assets instantly
if (ASSETS.includes(url.pathname)) {
const cached = await cache.match(url.pathname);
if (cached) return cached;
}
// Network-first for everything else
try {
const response = await fetch(event.request);
if (response instanceof Response && response.status === 200) {
cache.put(event.request, response.clone());
}
return response;
} catch {
const cached = await cache.match(event.request);
if (cached) return cached;
throw new Error('Offline and no cached response');
}
})()
);
});
// ── Push notifications ──────────────────────────────────────────────────────
self.addEventListener('push', (event) => {
const data = (event.data?.json() as { jobId?: string; title?: string; body?: string } | null) ?? {};
const title = data.title ?? 'Tonemark';
const body = data.body ?? 'Your transcript is ready.';
const jobId = data.jobId;
event.waitUntil(
self.registration.showNotification(title, {
body,
icon: '/icons/android-icon-192.png',
badge: '/icons/android-icon-192.png',
tag: jobId ? `job-${jobId}` : 'tonemark',
data: { jobId }
})
);
});
self.addEventListener('notificationclick', (event) => {
event.notification.close();
const jobId = (event.notification.data as { jobId?: string } | null)?.jobId;
const url = jobId ? `/jobs/${jobId}` : '/';
event.waitUntil(
self.clients
.matchAll({ type: 'window', includeUncontrolled: true })
.then((clients) => {
const target = clients.find((c) =>
jobId ? c.url.includes(`/jobs/${jobId}`) : c.url === self.location.origin + '/'
);
if (target) return target.focus();
return self.clients.openWindow(url);
})
);
});
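The `push` and `notificationclick` handlers above share the same defaults (title, body, tag, target URL). A minimal sketch of that mapping as pure functions follows, so it can be checked outside a service worker. `parsePushPayload` and `describeNotification` are illustrative names; the guarded JSON parse is an added assumption, since the handler in the commit calls `event.data?.json()` directly.

```typescript
interface PushPayload {
  jobId?: string;
  title?: string;
  body?: string;
}

// Tolerant parse: a missing or malformed payload falls back to an empty object.
function parsePushPayload(raw: string | null | undefined): PushPayload {
  if (!raw) return {};
  try {
    const parsed: unknown = JSON.parse(raw);
    return typeof parsed === 'object' && parsed !== null ? (parsed as PushPayload) : {};
  } catch {
    return {};
  }
}

// Same defaults as the handlers: generic title/body, per-job tag and target URL.
function describeNotification(data: PushPayload) {
  return {
    title: data.title ?? 'Tonemark',
    body: data.body ?? 'Your transcript is ready.',
    tag: data.jobId ? `job-${data.jobId}` : 'tonemark',
    url: data.jobId ? `/jobs/${data.jobId}` : '/'
  };
}
```

Using one tag per job means a repeated push for the same job replaces the earlier notification instead of stacking a second one.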

src/tests/audio.test.ts Normal file

@@ -0,0 +1,260 @@
import { describe, it, expect, vi, afterEach } from 'vitest';
// ── Build execFileMock with proper {stdout,stderr} promisify support ───────────
// util.promisify(execFile) normally returns {stdout,stderr} via a custom symbol.
// vi.fn() doesn't carry that symbol, so we add it here via vi.hoisted so it is
// in place before audio.ts loads and calls promisify(execFile) at module level.
const execFileMock = vi.hoisted(() => {
const fn = vi.fn();
Object.defineProperty(fn, Symbol.for('nodejs.util.promisify.custom'), {
configurable: true,
value: (...args: unknown[]) =>
new Promise<{ stdout: string; stderr: string }>((resolve, reject) => {
(fn as ReturnType<typeof vi.fn>)(
...args,
(err: Error | null, stdout: string, stderr: string) => {
if (err) reject(err);
else resolve({ stdout, stderr });
}
);
})
});
return fn;
});
vi.mock('child_process', () => ({ execFile: execFileMock }));
vi.mock('fs/promises', async (importOriginal) => {
const actual = await importOriginal<typeof import('fs/promises')>();
return { ...actual, mkdir: vi.fn().mockResolvedValue(undefined), unlink: vi.fn() };
});
import { buildFilterChain, analyzeVolume, prepareAudio } from '$lib/server/audio.js';
// Helper: make execFileMock call its callback with given stdout/stderr
function mockExecFile(stdout: string, stderr: string, err: Error | null = null) {
execFileMock.mockImplementation(
(_cmd: string, _args: string[], callback: (e: Error | null, out: string, err: string) => void) => {
callback(err, stdout, stderr);
}
);
}
afterEach(() => {
vi.clearAllMocks();
});
// ── buildFilterChain (pure — no mocking needed) ───────────────────────────────
describe('buildFilterChain', () => {
describe('mode = none', () => {
it('returns null regardless of volume', () => {
expect(buildFilterChain('none', -10)).toBeNull();
expect(buildFilterChain('none', -50)).toBeNull();
});
});
describe('mode = standard', () => {
it('returns standard chain regardless of volume', () => {
const chain = buildFilterChain('standard', -10);
expect(chain).toBe('highpass=f=80,lowpass=f=8000,loudnorm=I=-16:LRA=11:TP=-1.5');
});
it('does not add volume boost even for quiet audio', () => {
const chain = buildFilterChain('standard', -50);
expect(chain).not.toContain('volume=');
expect(chain).not.toContain('dynaudnorm');
});
});
describe('mode = auto', () => {
it('returns lightweight chain for normal audio (mean >= -30 dB)', () => {
const chain = buildFilterChain('auto', -20);
expect(chain).toBe('highpass=f=80,lowpass=f=8000,loudnorm=I=-16:LRA=11:TP=-1.5');
});
it('returns boost chain for quiet audio (mean < -30 dB)', () => {
const chain = buildFilterChain('auto', -35);
expect(chain).toContain('volume=24dB');
expect(chain).toContain('dynaudnorm');
expect(chain).toContain('afftdn');
});
it('uses -30 dB as the quiet threshold', () => {
expect(buildFilterChain('auto', -30)).not.toContain('volume=');
expect(buildFilterChain('auto', -30.1)).toContain('volume=24dB');
});
it('always includes highpass and lowpass in both paths', () => {
expect(buildFilterChain('auto', -20)).toContain('highpass=f=80');
expect(buildFilterChain('auto', -35)).toContain('highpass=f=80');
expect(buildFilterChain('auto', -20)).toContain('lowpass=f=8000');
expect(buildFilterChain('auto', -35)).toContain('lowpass=f=8000');
});
it('always includes loudnorm (EBU R128 normalization) targeting -16 LUFS', () => {
expect(buildFilterChain('auto', -20)).toContain('loudnorm=I=-16');
expect(buildFilterChain('auto', -35)).toContain('loudnorm=I=-16');
});
});
describe('mode = aggressive', () => {
it('includes noise reduction (afftdn) and gate regardless of volume', () => {
const chain = buildFilterChain('aggressive', -20)!;
expect(chain).toContain('afftdn=nf=-30');
expect(chain).toContain('agate=');
});
it('adds volume boost for quiet audio', () => {
const chain = buildFilterChain('aggressive', -35)!;
expect(chain).toContain('volume=24dB');
expect(chain).toContain('dynaudnorm');
});
it('omits volume boost for normal-level audio', () => {
const chain = buildFilterChain('aggressive', -20)!;
expect(chain).not.toContain('volume=24dB');
});
it('always includes loudnorm', () => {
expect(buildFilterChain('aggressive', -20)).toContain('loudnorm=I=-16');
});
});
});
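The tests above pin down the observable behaviour of `buildFilterChain` without showing it. The following is one possible reconstruction that satisfies those assertions, not the actual `audio.ts` implementation. The function name `buildFilterChainSketch`, the filter order, and the `agate` threshold value are assumptions. The exact `standard` string and the -30 dB threshold come from the tests.

```typescript
type AudioMode = 'none' | 'standard' | 'aggressive' | 'auto';

const QUIET_THRESHOLD_DB = -30; // strictly below this counts as quiet (per the tests)
const BAND_LIMIT = ['highpass=f=80', 'lowpass=f=8000'];
const LOUDNORM = 'loudnorm=I=-16:LRA=11:TP=-1.5';

function buildFilterChainSketch(mode: AudioMode, meanVolume: number): string | null {
  if (mode === 'none') return null;

  const quiet = meanVolume < QUIET_THRESHOLD_DB;
  const filters = [...BAND_LIMIT];

  // 'standard', and 'auto' on normal-level audio, use the lightweight chain.
  if (mode === 'standard' || (mode === 'auto' && !quiet)) {
    filters.push(LOUDNORM);
    return filters.join(',');
  }

  // Remaining cases: 'aggressive' at any level, or 'auto' on quiet audio.
  filters.push('afftdn=nf=-30');
  if (mode === 'aggressive') filters.push('agate=threshold=0.01'); // assumed gate setting
  if (quiet) filters.push('volume=24dB', 'dynaudnorm');
  filters.push(LOUDNORM);
  return filters.join(',');
}
```

Keeping the function pure (mode and measured mean volume in, filter string out) is what lets the tests above run without mocking ffmpeg.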
// ── analyzeVolume ─────────────────────────────────────────────────────────────
describe('analyzeVolume', () => {
it('parses mean_volume and max_volume from ffmpeg stderr', async () => {
mockExecFile(
'',
['[Parsed_volumedetect_0] mean_volume: -20.1 dB', '[Parsed_volumedetect_0] max_volume: -1.4 dB'].join(
'\n'
)
);
const result = await analyzeVolume('/fake/audio.wav');
expect(result.meanVolume).toBe(-20.1);
expect(result.maxVolume).toBe(-1.4);
});
it('returns -99 for both when ffmpeg produces no volume lines', async () => {
mockExecFile('', 'some other ffmpeg output with no volume info');
const result = await analyzeVolume('/fake/audio.wav');
expect(result.meanVolume).toBe(-99);
expect(result.maxVolume).toBe(-99);
});
it('handles negative volume values (normal speech levels)', async () => {
mockExecFile('', 'mean_volume: -33.7 dB\nmax_volume: -0.3 dB');
const result = await analyzeVolume('/fake/audio.wav');
expect(result.meanVolume).toBe(-33.7);
expect(result.maxVolume).toBe(-0.3);
});
it('calls ffmpeg with volumedetect filter', async () => {
mockExecFile('', 'mean_volume: -20.0 dB\nmax_volume: -1.0 dB');
await analyzeVolume('/test/path.m4a');
const [cmd, args] = execFileMock.mock.calls[0];
expect(cmd).toBe('ffmpeg');
expect(args).toContain('volumedetect');
expect(args).toContain('/test/path.m4a');
});
});
// ── prepareAudio ffmpeg argument verification ─────────────────────────────────
describe('prepareAudio — ffmpeg arguments', () => {
// Calls in order: 1) volumedetect 2) silencedetect 3) conversion
function setupExecFileMock(volumeStderr: string) {
let callIndex = 0;
execFileMock.mockImplementation((_cmd: string, _args: string[], callback: Function) => {
callIndex++;
if (callIndex === 1) callback(null, '', volumeStderr);
else if (callIndex === 2) callback(null, '', ''); // silencedetect — no leading silence
else callback(null, '', ''); // conversion
});
}
it('always outputs 16 kHz sample rate (-ar 16000)', async () => {
setupExecFileMock('mean_volume: -20.0 dB\nmax_volume: -1.0 dB');
await prepareAudio('/input.m4a', 'job-1', 'standard');
const conversionArgs: string[] = execFileMock.mock.calls.at(-1)![1];
expect(conversionArgs).toContain('-ar');
expect(conversionArgs[conversionArgs.indexOf('-ar') + 1]).toBe('16000');
});
it('always outputs mono (-ac 1)', async () => {
setupExecFileMock('mean_volume: -20.0 dB\nmax_volume: -1.0 dB');
await prepareAudio('/input.m4a', 'job-2', 'standard');
const conversionArgs: string[] = execFileMock.mock.calls.at(-1)![1];
expect(conversionArgs).toContain('-ac');
expect(conversionArgs[conversionArgs.indexOf('-ac') + 1]).toBe('1');
});
it('always encodes as pcm_s16le WAV', async () => {
setupExecFileMock('mean_volume: -20.0 dB\nmax_volume: -1.0 dB');
await prepareAudio('/input.m4a', 'job-3', 'standard');
const conversionArgs: string[] = execFileMock.mock.calls.at(-1)![1];
expect(conversionArgs).toContain('-c:a');
expect(conversionArgs[conversionArgs.indexOf('-c:a') + 1]).toBe('pcm_s16le');
});
it('applies -af filter chain for non-none modes', async () => {
setupExecFileMock('mean_volume: -20.0 dB\nmax_volume: -1.0 dB');
await prepareAudio('/input.m4a', 'job-4', 'standard');
const conversionArgs: string[] = execFileMock.mock.calls.at(-1)![1];
expect(conversionArgs).toContain('-af');
});
it('omits -af entirely for mode=none', async () => {
setupExecFileMock('mean_volume: -20.0 dB\nmax_volume: -1.0 dB');
await prepareAudio('/input.m4a', 'job-5', 'none');
const conversionArgs: string[] = execFileMock.mock.calls.at(-1)![1];
expect(conversionArgs).not.toContain('-af');
});
it('prepends -ss when leading silence is detected at the start', async () => {
let callIndex = 0;
execFileMock.mockImplementation((_cmd: string, _args: string[], callback: Function) => {
callIndex++;
if (callIndex === 1) callback(null, '', 'mean_volume: -20.0 dB\nmax_volume: -1.0 dB');
else if (callIndex === 2)
callback(null, '', 'silence_start: 0\nsilence_end: 2.000 | silence_duration: 2.000');
else callback(null, '', '');
});
await prepareAudio('/input.m4a', 'job-6', 'standard');
const conversionArgs: string[] = execFileMock.mock.calls.at(-1)![1];
expect(conversionArgs).toContain('-ss');
expect(conversionArgs[conversionArgs.indexOf('-ss') + 1]).toBe('2.000');
});
it('omits -ss when silence starts well into the file (not leading)', async () => {
let callIndex = 0;
execFileMock.mockImplementation((_cmd: string, _args: string[], callback: Function) => {
callIndex++;
if (callIndex === 1) callback(null, '', 'mean_volume: -20.0 dB\nmax_volume: -1.0 dB');
else if (callIndex === 2)
callback(null, '', 'silence_start: 281\nsilence_end: 283 | silence_duration: 2.0');
else callback(null, '', '');
});
await prepareAudio('/input.m4a', 'job-7', 'standard');
const conversionArgs: string[] = execFileMock.mock.calls.at(-1)![1];
expect(conversionArgs).not.toContain('-ss');
});
it('caps leading silence trim at 30 seconds', async () => {
let callIndex = 0;
execFileMock.mockImplementation((_cmd: string, _args: string[], callback: Function) => {
callIndex++;
if (callIndex === 1) callback(null, '', 'mean_volume: -20.0 dB\nmax_volume: -1.0 dB');
else if (callIndex === 2)
callback(null, '', 'silence_start: 0\nsilence_end: 45.000 | silence_duration: 45.000');
else callback(null, '', '');
});
await prepareAudio('/input.m4a', 'job-8', 'standard');
const conversionArgs: string[] = execFileMock.mock.calls.at(-1)![1];
const ssIdx = conversionArgs.indexOf('-ss');
expect(ssIdx).toBeGreaterThan(-1);
expect(parseFloat(conversionArgs[ssIdx + 1])).toBeLessThanOrEqual(30);
});
});
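The `analyzeVolume` and `prepareAudio` tests above rely on parsing ffmpeg's `volumedetect` and `silencedetect` stderr output. Below is a minimal sketch of parsers consistent with those assertions. `parseVolume`, `leadingSilenceTrim`, and the 0.5 s tolerance for what counts as "leading" are assumptions. The -99 fallback, the 30 s cap, and the three-decimal `-ss` value are taken from the tests.

```typescript
interface VolumeStats {
  meanVolume: number;
  maxVolume: number;
}

// Pull mean/max volume out of volumedetect stderr; -99 signals "not found".
function parseVolume(stderr: string): VolumeStats {
  const mean = stderr.match(/mean_volume:\s*(-?\d+(?:\.\d+)?)\s*dB/);
  const max = stderr.match(/max_volume:\s*(-?\d+(?:\.\d+)?)\s*dB/);
  return {
    meanVolume: mean ? parseFloat(mean[1]) : -99,
    maxVolume: max ? parseFloat(max[1]) : -99
  };
}

const MAX_TRIM_SECONDS = 30;
const LEADING_TOLERANCE_SECONDS = 0.5; // assumed cutoff for "silence at the very start"

// Returns the value for ffmpeg's -ss flag, or null when no leading silence is found.
function leadingSilenceTrim(stderr: string): string | null {
  const start = stderr.match(/silence_start:\s*(-?\d+(?:\.\d+)?)/);
  const end = stderr.match(/silence_end:\s*(\d+(?:\.\d+)?)/);
  if (!start || !end) return null;
  if (parseFloat(start[1]) > LEADING_TOLERANCE_SECONDS) return null;
  return Math.min(parseFloat(end[1]), MAX_TRIM_SECONDS).toFixed(3);
}
```

Only the first `silence_start` / `silence_end` pair is inspected, which is sufficient for trimming the head of the file and ignores pauses later in the recording.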

src/tests/db.test.ts Normal file

@@ -0,0 +1,195 @@
import { describe, it, expect, afterAll, vi } from 'vitest';
import { rm } from 'fs/promises';
// Static imports are hoisted above ordinary statements, so DATA_DIR is stubbed
// inside vi.hoisted() to be in place before db.js runs its module-level DB init.
const TEST_DATA_DIR = await vi.hoisted(async () => {
const { join } = await import('path');
const { tmpdir } = await import('os');
const dir = join(tmpdir(), 'whisper-pwa-db-test-' + Date.now());
vi.stubEnv('DATA_DIR', dir);
return dir;
});
import {
createJob,
getJob,
listJobs,
updateJob,
setJobStatus,
savePushSubscription,
getAllSubscriptions,
deletePushSubscription
} from '$lib/server/db.js';
afterAll(async () => {
await rm(TEST_DATA_DIR, { recursive: true, force: true });
});
// ── createJob / getJob ────────────────────────────────────────────────────────
describe('createJob', () => {
it('creates a job and returns it with a UUID', () => {
const job = createJob('https://youtu.be/abc', 'My Video', 'auto');
expect(job.id).toMatch(/^[0-9a-f-]{36}$/);
expect(job.source).toBe('https://youtu.be/abc');
expect(job.title).toBe('My Video');
expect(job.audioMode).toBe('auto');
});
it('defaults status to pending', () => {
const job = createJob('src', 'title', 'standard');
expect(job.status).toBe('pending');
});
it('defaults progress to 0', () => {
const job = createJob('src', 'title', 'none');
expect(job.progress).toBe(0);
});
it('assigns unique IDs to each job', () => {
const a = createJob('src', 'A', 'auto');
const b = createJob('src', 'B', 'auto');
expect(a.id).not.toBe(b.id);
});
});
describe('getJob', () => {
it('returns null for an unknown id', () => {
expect(getJob('00000000-0000-0000-0000-000000000000')).toBeNull();
});
it('returns the correct job by id', () => {
const created = createJob('https://youtu.be/xyz', 'Find Me', 'aggressive');
const found = getJob(created.id);
expect(found).not.toBeNull();
expect(found!.id).toBe(created.id);
expect(found!.title).toBe('Find Me');
});
});
// ── updateJob ─────────────────────────────────────────────────────────────────
describe('updateJob', () => {
it('updates only specified fields, preserving others', () => {
const job = createJob('src', 'Original Title', 'auto');
updateJob({ id: job.id, title: 'New Title' });
const updated = getJob(job.id)!;
expect(updated.title).toBe('New Title');
expect(updated.source).toBe('src'); // unchanged
expect(updated.audioMode).toBe('auto'); // unchanged
});
it('stores whisperJobId', () => {
const job = createJob('src', 'title', 'auto');
updateJob({ id: job.id, whisperJobId: 'whisper-uuid-123' });
expect(getJob(job.id)!.whisperJobId).toBe('whisper-uuid-123');
});
it('stores meanVolume', () => {
const job = createJob('src', 'title', 'auto');
updateJob({ id: job.id, meanVolume: -20.5 });
expect(getJob(job.id)!.meanVolume).toBe(-20.5);
});
it('stores segmentsJson', () => {
const job = createJob('src', 'title', 'auto');
const json = JSON.stringify([{ index: 0, start: 0, end: 5, text: 'Hello', words: [] }]);
updateJob({ id: job.id, segmentsJson: json });
expect(getJob(job.id)!.segmentsJson).toBe(json);
});
it('stores error message', () => {
const job = createJob('src', 'title', 'auto');
updateJob({ id: job.id, status: 'failed', error: 'something went wrong' });
const updated = getJob(job.id)!;
expect(updated.status).toBe('failed');
expect(updated.error).toBe('something went wrong');
});
it('does nothing for an unknown id', () => {
expect(() => updateJob({ id: 'no-such-id', title: 'Ghost' })).not.toThrow();
});
});
// ── setJobStatus ──────────────────────────────────────────────────────────────
describe('setJobStatus', () => {
it('updates status without touching other fields', () => {
const job = createJob('src', 'Status Test', 'auto');
setJobStatus(job.id, 'downloading', 5);
const updated = getJob(job.id)!;
expect(updated.status).toBe('downloading');
expect(updated.progress).toBe(5);
expect(updated.title).toBe('Status Test'); // unchanged
});
it('transitions through all valid statuses', () => {
const job = createJob('src', 'title', 'auto');
const statuses = ['downloading', 'preparing', 'transcribing', 'processing', 'done'] as const;
for (const status of statuses) {
setJobStatus(job.id, status, 50);
expect(getJob(job.id)!.status).toBe(status);
}
});
it('defaults progress to 0 if not supplied', () => {
const job = createJob('src', 'title', 'auto');
setJobStatus(job.id, 'preparing');
expect(getJob(job.id)!.progress).toBe(0);
});
});
// ── listJobs ──────────────────────────────────────────────────────────────────
describe('listJobs', () => {
it('returns jobs in descending creation order', () => {
const a = createJob('src', 'Alpha', 'auto');
const b = createJob('src', 'Beta', 'auto');
const jobs = listJobs();
const ids = jobs.map((j) => j.id);
expect(ids.indexOf(b.id)).toBeLessThan(ids.indexOf(a.id));
});
it('returns an array (possibly empty)', () => {
expect(Array.isArray(listJobs())).toBe(true);
});
});
// ── Push subscriptions ────────────────────────────────────────────────────────
describe('push subscriptions', () => {
const sub1 = { endpoint: 'https://push.example.com/abc', p256dh: 'p256dh-value-1', auth: 'auth-1' };
const sub2 = { endpoint: 'https://push.example.com/xyz', p256dh: 'p256dh-value-2', auth: 'auth-2' };
it('saves and retrieves a subscription', () => {
savePushSubscription(sub1);
const subs = getAllSubscriptions();
const found = subs.find((s) => s.endpoint === sub1.endpoint);
expect(found).toBeDefined();
expect(found!.p256dh).toBe(sub1.p256dh);
expect(found!.auth).toBe(sub1.auth);
});
it('upserts on duplicate endpoint (updates keys)', () => {
savePushSubscription(sub1);
const updated = { ...sub1, p256dh: 'new-p256dh', auth: 'new-auth' };
savePushSubscription(updated);
const subs = getAllSubscriptions().filter((s) => s.endpoint === sub1.endpoint);
expect(subs).toHaveLength(1);
expect(subs[0].p256dh).toBe('new-p256dh');
});
it('deletes by endpoint', () => {
savePushSubscription(sub2);
deletePushSubscription(sub2.endpoint);
const found = getAllSubscriptions().find((s) => s.endpoint === sub2.endpoint);
expect(found).toBeUndefined();
});
it('stores multiple subscriptions independently', () => {
savePushSubscription(sub1);
savePushSubscription(sub2);
const subs = getAllSubscriptions();
const endpoints = subs.map((s) => s.endpoint);
expect(endpoints).toContain(sub1.endpoint);
expect(endpoints).toContain(sub2.endpoint);
});
});
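
The push-subscription tests above pin down upsert-by-endpoint semantics: saving twice with the same endpoint keeps one row and replaces its keys, and deleting removes the row. A minimal in-memory sketch of that contract follows. The names `saveSub`, `deleteSub`, and `allSubs` are hypothetical stand-ins, not the functions in `$lib/server/db.js`, whose implementation is not shown in this diff.

```typescript
// Hypothetical in-memory stand-in for the push-subscription store.
interface PushSub {
  endpoint: string;
  p256dh: string;
  auth: string;
}

const subs: PushSub[] = [];

// Insert a subscription, or replace the stored keys when the endpoint already exists.
function saveSub(sub: PushSub): void {
  const copy: PushSub = { endpoint: sub.endpoint, p256dh: sub.p256dh, auth: sub.auth };
  for (let i = 0; i < subs.length; i++) {
    if (subs[i].endpoint === sub.endpoint) {
      subs[i] = copy;
      return;
    }
  }
  subs.push(copy);
}

// Remove a subscription by endpoint; unknown endpoints are silently ignored.
function deleteSub(endpoint: string): void {
  for (let i = subs.length - 1; i >= 0; i--) {
    if (subs[i].endpoint === endpoint) subs.splice(i, 1);
  }
}

// Return a shallow copy so callers cannot mutate the store directly.
function allSubs(): PushSub[] {
  return subs.slice();
}
```

Keying on `endpoint` mirrors what the tests assert; a real store would presumably enforce the same rule with a uniqueness constraint rather than a linear scan.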

src/tests/formatter.test.ts Normal file

@@ -0,0 +1,195 @@
import { describe, it, expect, beforeAll, afterAll } from 'vitest';
import { mkdtemp, readFile, readdir, rm } from 'fs/promises';
import { join } from 'path';
import { tmpdir } from 'os';
import { buildSrt, buildTxt, buildMd, buildJson, writeOutputs } from '$lib/server/formatter.js';
import type { Segment } from '$lib/types.js';
// ── helpers ──────────────────────────────────────────────────────────────────
function seg(index: number, start: number, end: number, text: string): Segment {
return { index, start, end, text, words: [] };
}
const SAMPLE_SEGS: Segment[] = [
seg(0, 0, 4.5, ' Hello and welcome to the show.'),
seg(1, 4.5, 10, " Today we're going to discuss transcription."),
seg(2, 310, 315, ' This is after five minutes.'),
seg(3, 620, 625, ' And this is after ten minutes.')
];
let tmpDir: string;
beforeAll(async () => {
tmpDir = await mkdtemp(join(tmpdir(), 'whisper-fmt-test-'));
process.env.OUTPUT_DIR = tmpDir;
});
afterAll(async () => {
await rm(tmpDir, { recursive: true, force: true });
});
// ── secToSrtTime (via buildSrt) ───────────────────────────────────────────────
describe('buildSrt — timestamp formatting', () => {
it('formats zero seconds as 00:00:00,000', () => {
const srt = buildSrt([seg(0, 0, 1, ' test')]);
expect(srt).toContain('00:00:00,000 --> 00:00:01,000');
});
it('formats fractional seconds with milliseconds', () => {
const srt = buildSrt([seg(0, 4.5, 10.123, ' test')]);
expect(srt).toContain('00:00:04,500 --> 00:00:10,123');
});
it('formats hours correctly', () => {
const srt = buildSrt([seg(0, 3661.5, 3662, ' test')]);
expect(srt).toContain('01:01:01,500 --> 01:01:02,000');
});
it('returns empty string for empty segment list', () => {
expect(buildSrt([])).toBe('');
});
});
describe('buildSrt — structure', () => {
it('numbers entries starting from 1', () => {
const srt = buildSrt(SAMPLE_SEGS);
const lines = srt.split('\n\n');
expect(lines[0]).toMatch(/^1\n/);
expect(lines[1]).toMatch(/^2\n/);
});
it('includes segment text trimmed', () => {
const srt = buildSrt([seg(0, 0, 1, ' hello ')]);
expect(srt).toContain('\nhello');
});
it('separates entries with blank lines', () => {
const srt = buildSrt(SAMPLE_SEGS);
expect(srt).toContain('\n\n');
});
});
// ── buildTxt ─────────────────────────────────────────────────────────────────
describe('buildTxt', () => {
it('returns empty string for empty input', () => {
expect(buildTxt([])).toBe('');
});
it('contains no timestamp-like strings (hh:mm:ss)', () => {
const txt = buildTxt(SAMPLE_SEGS);
expect(txt).not.toMatch(/\d{2}:\d{2}:\d{2}/);
});
it('includes all segment texts', () => {
const txt = buildTxt(SAMPLE_SEGS);
expect(txt).toContain('Hello and welcome');
expect(txt).toContain('transcription');
});
it('creates paragraph breaks at sentence endings when para is long enough', () => {
// Build 15 sentence-terminated segments whose combined text exceeds 200 characters
const longSegs: Segment[] = [];
for (let i = 0; i < 15; i++) {
longSegs.push(seg(i, i * 2, i * 2 + 2, ' This is sentence number ' + i + ' for our test.'));
}
const txt = buildTxt(longSegs);
// Should have at least one paragraph break
expect(txt).toContain('\n\n');
});
it('skips empty segments', () => {
const segs = [seg(0, 0, 1, ' '), seg(1, 1, 2, ' Hello.')];
const txt = buildTxt(segs);
expect(txt).toBe('Hello.');
});
});
// ── buildMd ───────────────────────────────────────────────────────────────────
describe('buildMd', () => {
it('starts with an H1 title', () => {
const md = buildMd(SAMPLE_SEGS, 'My Title');
expect(md).toMatch(/^# My Title/);
});
it('adds H2 timestamp headings roughly every 5 minutes', () => {
const md = buildMd(SAMPLE_SEGS, 'Test');
// First heading at 0s (00:00:00), second at 310s (00:05:10), third at 620s (00:10:20)
const h2s = [...md.matchAll(/^## /gm)];
expect(h2s.length).toBeGreaterThanOrEqual(3);
});
it('does not produce a heading for every segment', () => {
// 4 segments but only 3 should get headings (every 5min)
const md = buildMd(SAMPLE_SEGS, 'Test');
const h2s = [...md.matchAll(/^## /gm)];
expect(h2s.length).toBeLessThan(SAMPLE_SEGS.length);
});
it('includes all segment texts', () => {
const md = buildMd(SAMPLE_SEGS, 'Test');
expect(md).toContain('Hello and welcome');
expect(md).toContain('after ten minutes');
});
});
// ── buildJson ─────────────────────────────────────────────────────────────────
describe('buildJson', () => {
it('produces valid JSON', () => {
const j = buildJson(SAMPLE_SEGS, 'My Transcript');
expect(() => JSON.parse(j)).not.toThrow();
});
it('includes title and segments array', () => {
const parsed = JSON.parse(buildJson(SAMPLE_SEGS, 'My Transcript'));
expect(parsed.title).toBe('My Transcript');
expect(Array.isArray(parsed.segments)).toBe(true);
expect(parsed.segments).toHaveLength(SAMPLE_SEGS.length);
});
});
// ── writeOutputs ──────────────────────────────────────────────────────────────
describe('writeOutputs', () => {
it('creates the output directory', async () => {
await writeOutputs(SAMPLE_SEGS, 'Create Dir Test', 'job-001');
const entries = await readdir(tmpDir);
expect(entries).toContain('Create_Dir_Test');
});
it('writes .srt, .txt, .md, and .json files', async () => {
const paths = await writeOutputs(SAMPLE_SEGS, 'All Files Test', 'job-002');
const srt = await readFile(paths.srt, 'utf8');
const txt = await readFile(paths.txt, 'utf8');
const md = await readFile(paths.md, 'utf8');
const json = await readFile(paths.json, 'utf8');
expect(srt).toContain('-->');
expect(txt).not.toContain('-->');
expect(md).toContain('# All Files Test');
expect(JSON.parse(json).title).toBe('All Files Test');
});
it('sanitises the title for use as a directory/filename', async () => {
const paths = await writeOutputs(SAMPLE_SEGS, 'Test: Special/Chars!', 'job-003');
// Check only the filename, not the directory path which always contains '/'
const filename = paths.srt.split('/').pop()!;
expect(filename).not.toMatch(/[:/!]/);
});
it('writes empty outputs when given no segments', async () => {
const paths = await writeOutputs([], 'Empty Segments', 'job-004');
const srt = await readFile(paths.srt, 'utf8');
const txt = await readFile(paths.txt, 'utf8');
expect(srt).toBe('');
expect(txt).toBe('');
});
});
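
The formatter tests above fix the SRT timestamp layout (`HH:MM:SS,mmm`), the 1-based entry numbering, the blank-line separator, and the rule that titles are sanitised before being used as file names. The sketch below is one way to satisfy those assertions. All function names are hypothetical, and the sanitising regex is an assumption that happens to map `Create Dir Test` to `Create_Dir_Test`; the real `$lib/server/formatter.js` may differ.

```typescript
// Minimal segment shape for this sketch (the real Segment type has more fields).
interface SketchSegment {
  start: number;
  end: number;
  text: string;
}

// Left-pad a non-negative integer with zeros.
function pad(n: number, width: number): string {
  let s = String(n);
  while (s.length < width) s = '0' + s;
  return s;
}

// Convert seconds to the SRT timestamp form HH:MM:SS,mmm.
// Rounding to whole milliseconds first avoids floating-point drift.
function sketchSecToSrtTime(sec: number): string {
  const totalMs = Math.round(sec * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  return pad(h, 2) + ':' + pad(m, 2) + ':' + pad(s, 2) + ',' + pad(ms, 3);
}

// Number entries from 1, trim each text, and separate entries with a blank line.
function sketchBuildSrt(segments: SketchSegment[]): string {
  return segments
    .map(function (seg, i) {
      const range = sketchSecToSrtTime(seg.start) + ' --> ' + sketchSecToSrtTime(seg.end);
      return String(i + 1) + '\n' + range + '\n' + seg.text.trim();
    })
    .join('\n\n');
}

// Assumed sanitising rule: collapse every run of other characters into one underscore.
function sketchSanitiseTitle(title: string): string {
  return title.replace(/[^A-Za-z0-9_-]+/g, '_');
}
```

An empty segment list yields an empty string because joining an empty array produces `''`, which matches the "empty outputs" tests without a special case.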


@@ -0,0 +1,127 @@
import { describe, it, expect } from 'vitest';
import {
deduplicateSegments
} from '$lib/server/postprocess.js';
import type { Segment } from '$lib/types.js';
// ── helpers ──────────────────────────────────────────────────────────────────
function seg(index: number, start: number, end: number, text: string): Segment {
return { index, start, end, text, words: [] };
}
// ── collapseRepeats (tested indirectly via deduplicateSegments) ───────────────
describe('deduplicateSegments — collapseRepeats', () => {
it('leaves text without repetition unchanged', () => {
const input = [seg(0, 0, 5, ' Hello world, this is a sentence.')];
const [out] = deduplicateSegments(input);
expect(out.text).toBe('Hello world, this is a sentence.');
});
it('collapses a consecutive repeated phrase inside a segment', () => {
const input = [seg(0, 0, 5, ' the quick brown fox the quick brown fox')];
const [out] = deduplicateSegments(input);
expect(out.text).not.toMatch(/the quick brown fox.*the quick brown fox/i);
});
it('handles multiple repetitions recursively', () => {
// "welcome everyone" = 16 chars — qualifies for the ≥10-char collapse regex
const input = [seg(0, 0, 5, ' welcome everyone welcome everyone welcome everyone')];
const result = deduplicateSegments(input);
const text = result[0]?.text ?? '';
expect((text.match(/welcome everyone/gi) ?? []).length).toBeLessThan(3);
});
});
// ── mergeConsecutive ──────────────────────────────────────────────────────────
describe('deduplicateSegments — mergeConsecutive', () => {
it('merges adjacent segments with identical text', () => {
const input = [
seg(0, 0, 2, ' Hello world.'),
seg(1, 2, 4, ' Hello world.')
];
const result = deduplicateSegments(input);
expect(result).toHaveLength(1);
expect(result[0].end).toBe(4);
});
it('keeps adjacent segments with different text', () => {
const input = [
seg(0, 0, 2, ' First sentence.'),
seg(1, 2, 4, ' Second sentence.')
];
const result = deduplicateSegments(input);
expect(result).toHaveLength(2);
});
it('normalises punctuation and case for merge comparison', () => {
const input = [
seg(0, 0, 2, ' Hello, World!'),
seg(1, 2, 4, ' hello world')
];
const result = deduplicateSegments(input);
expect(result).toHaveLength(1);
});
});
// ── ngramDedup ────────────────────────────────────────────────────────────────
describe('deduplicateSegments — ngramDedup', () => {
it('passes through completely unique segments', () => {
const input = [
seg(0, 0, 5, ' The cat sat on the mat quite happily today.'),
seg(1, 5, 10, ' Later the dog ran across the yard chasing a ball.')
];
expect(deduplicateSegments(input)).toHaveLength(2);
});
it('removes a segment that is highly similar to recent context', () => {
// A verbatim repeat of a long sentence must not survive deduplication
const longText =
' This is a very specific and unique sentence about transcription quality matters greatly.';
const input = [seg(0, 0, 5, longText), seg(1, 5, 10, longText)];
// The copies are adjacent, so mergeConsecutive already collapses them before ngramDedup runs
expect(deduplicateSegments(input)).toHaveLength(1);
});
});
// ── deduplicateSegments — full pipeline ──────────────────────────────────────
describe('deduplicateSegments — full pipeline', () => {
it('returns empty array for empty input', () => {
expect(deduplicateSegments([])).toEqual([]);
});
it('removes segments whose text is empty after trimming', () => {
const input = [seg(0, 0, 1, ' '), seg(1, 1, 2, ' Hello.')];
const result = deduplicateSegments(input);
expect(result).toHaveLength(1);
expect(result[0].text).toBe('Hello.');
});
it('re-indexes output segments starting from 0', () => {
const input = [
seg(5, 0, 2, ' First unique sentence here.'),
seg(8, 2, 4, ' Second different sentence there.')
];
const result = deduplicateSegments(input);
result.forEach((s, i) => expect(s.index).toBe(i));
});
it('runs the full pipeline: trim → remove empty → merge → ngram → merge → reindex', () => {
const input = [
seg(0, 0, 2, ' Good morning everyone.'),
seg(1, 2, 3, ' '), // empty — removed
seg(2, 3, 5, ' Good morning everyone.'), // duplicate — merged
seg(3, 5, 7, ' Welcome to our presentation today.')
];
const result = deduplicateSegments(input);
expect(result).toHaveLength(2);
expect(result[0].text).toBe('Good morning everyone.');
expect(result[1].text).toBe('Welcome to our presentation today.');
expect(result[0].index).toBe(0);
expect(result[1].index).toBe(1);
});
});
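
The merge tests above imply three steps: trim and drop empty segments, compare neighbours after normalising case and punctuation, and re-index the survivors from 0. The sketch below covers only those steps; it deliberately omits the repeat-collapsing and n-gram passes, and its names are hypothetical rather than taken from `$lib/server/postprocess.js`. The ASCII-only normalisation is an assumption made for brevity.

```typescript
interface SketchSegment {
  index: number;
  start: number;
  end: number;
  text: string;
}

// Lower-case, strip punctuation, and collapse whitespace so that
// "Hello, World!" and "hello world" compare equal.
// Assumption: ASCII-only; non-Latin scripts would need a Unicode-aware pattern.
function normaliseForCompare(text: string): string {
  return text.toLowerCase().replace(/[^a-z0-9\s]/g, '').replace(/\s+/g, ' ').trim();
}

// Trim, drop empties, merge adjacent duplicates, and re-index from 0.
function sketchMergeConsecutive(segments: SketchSegment[]): SketchSegment[] {
  const out: SketchSegment[] = [];
  for (let i = 0; i < segments.length; i++) {
    const text = segments[i].text.trim();
    if (text === '') continue; // empty after trimming: removed
    const prev = out.length > 0 ? out[out.length - 1] : undefined;
    if (prev !== undefined && normaliseForCompare(prev.text) === normaliseForCompare(text)) {
      prev.end = segments[i].end; // extend the kept segment over the duplicate
      continue;
    }
    out.push({ index: out.length, start: segments[i].start, end: segments[i].end, text: text });
  }
  return out;
}
```

Dropping empty segments before comparing is what lets the two "Good morning everyone." segments in the full-pipeline test become neighbours and merge, even though a blank segment sits between them in the input.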

src/tests/push.test.ts Normal file

@@ -0,0 +1,139 @@
import { describe, it, expect, vi, afterEach } from 'vitest';
// ── Hoist mock functions so they're available inside vi.mock() factories ───────
const { mockSetVapidDetails, mockWebPushSend } = vi.hoisted(() => ({
mockSetVapidDetails: vi.fn(),
mockWebPushSend: vi.fn()
}));
// ── Set up DATA_DIR before db module is imported ──────────────────────────────
const TEST_DATA_DIR = `/tmp/whisper-push-test-${Date.now()}`;
vi.stubEnv('DATA_DIR', TEST_DATA_DIR);
vi.stubEnv('VAPID_PUBLIC_KEY', 'BFakePublicKeyForTesting1234567890ABCDEFGHIJKLMNOP=');
vi.stubEnv('VAPID_PRIVATE_KEY', 'FakePrivateKey1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ=');
// ── Mock web-push ─────────────────────────────────────────────────────────────
vi.mock('web-push', () => ({
default: {
setVapidDetails: mockSetVapidDetails,
sendNotification: mockWebPushSend
}
}));
import { sendNotification, getVapidPublicKey } from '$lib/server/push.js';
import { savePushSubscription, deletePushSubscription, getAllSubscriptions } from '$lib/server/db.js';
import { rm } from 'fs/promises';
afterEach(async () => {
mockSetVapidDetails.mockReset();
mockWebPushSend.mockReset();
// Remove all test subscriptions between tests
const subs = getAllSubscriptions();
for (const s of subs) deletePushSubscription(s.endpoint);
await rm(TEST_DATA_DIR, { recursive: true, force: true }).catch(() => {});
});
// ── getVapidPublicKey ─────────────────────────────────────────────────────────
describe('getVapidPublicKey', () => {
it('returns the VAPID_PUBLIC_KEY env var', () => {
expect(getVapidPublicKey()).toBe('BFakePublicKeyForTesting1234567890ABCDEFGHIJKLMNOP=');
});
it('returns null when VAPID_PUBLIC_KEY is not set', () => {
vi.stubEnv('VAPID_PUBLIC_KEY', '');
expect(getVapidPublicKey()).toBeNull();
vi.stubEnv('VAPID_PUBLIC_KEY', 'BFakePublicKeyForTesting1234567890ABCDEFGHIJKLMNOP=');
});
});
// ── sendNotification ──────────────────────────────────────────────────────────
describe('sendNotification', () => {
it('does nothing when there are no subscriptions', async () => {
await sendNotification('job-1', '✅ Transcript ready', 'My Video');
expect(mockWebPushSend).not.toHaveBeenCalled();
});
it('sends a push to each stored subscription', async () => {
savePushSubscription({ endpoint: 'https://fcm.example.com/push/a', p256dh: 'pk1', auth: 'auth1' });
savePushSubscription({ endpoint: 'https://fcm.example.com/push/b', p256dh: 'pk2', auth: 'auth2' });
mockWebPushSend.mockResolvedValue({});
await sendNotification('job-2', '✅ Done', 'My Video Title');
expect(mockWebPushSend).toHaveBeenCalledTimes(2);
});
it('sends payload containing jobId, title, and body', async () => {
savePushSubscription({ endpoint: 'https://fcm.example.com/push/c', p256dh: 'pk3', auth: 'auth3' });
mockWebPushSend.mockResolvedValue({});
await sendNotification('job-3', '✅ Transcript ready', 'The Video Title');
const [, payload] = mockWebPushSend.mock.calls[0];
const parsed = JSON.parse(payload);
expect(parsed.jobId).toBe('job-3');
expect(parsed.title).toBe('✅ Transcript ready');
expect(parsed.body).toBe('The Video Title');
});
it('sends to the correct push endpoint with keys', async () => {
const sub = { endpoint: 'https://fcm.example.com/push/d', p256dh: 'pk4', auth: 'auth4' };
savePushSubscription(sub);
mockWebPushSend.mockResolvedValue({});
await sendNotification('job-4', 'title', 'body');
const [pushSub] = mockWebPushSend.mock.calls[0];
expect(pushSub.endpoint).toBe(sub.endpoint);
expect(pushSub.keys.p256dh).toBe(sub.p256dh);
expect(pushSub.keys.auth).toBe(sub.auth);
});
it('removes a subscription that returns HTTP 410 Gone', async () => {
const endpoint = 'https://fcm.example.com/push/gone';
savePushSubscription({ endpoint, p256dh: 'pk', auth: 'auth' });
mockWebPushSend.mockRejectedValue({ statusCode: 410 });
await sendNotification('job-5', 'title', 'body');
// Subscription should be removed after 410
const remaining = getAllSubscriptions().find((s) => s.endpoint === endpoint);
expect(remaining).toBeUndefined();
});
it('removes a subscription that returns HTTP 404 Not Found', async () => {
const endpoint = 'https://fcm.example.com/push/notfound';
savePushSubscription({ endpoint, p256dh: 'pk', auth: 'auth' });
mockWebPushSend.mockRejectedValue({ statusCode: 404 });
await sendNotification('job-6', 'title', 'body');
const remaining = getAllSubscriptions().find((s) => s.endpoint === endpoint);
expect(remaining).toBeUndefined();
});
it('keeps subscriptions that fail with other errors', async () => {
const endpoint = 'https://fcm.example.com/push/transient';
savePushSubscription({ endpoint, p256dh: 'pk', auth: 'auth' });
mockWebPushSend.mockRejectedValue({ statusCode: 500 });
await sendNotification('job-7', 'title', 'body');
const remaining = getAllSubscriptions().find((s) => s.endpoint === endpoint);
expect(remaining).toBeDefined();
});
it('continues sending to other subscriptions if one fails', async () => {
savePushSubscription({ endpoint: 'https://fcm.example.com/push/ok1', p256dh: 'pk1', auth: 'a1' });
savePushSubscription({ endpoint: 'https://fcm.example.com/push/fail', p256dh: 'pk2', auth: 'a2' });
savePushSubscription({ endpoint: 'https://fcm.example.com/push/ok2', p256dh: 'pk3', auth: 'a3' });
mockWebPushSend
.mockResolvedValueOnce({})
.mockRejectedValueOnce({ statusCode: 500 })
.mockResolvedValueOnce({});
await sendNotification('job-8', 'title', 'body');
expect(mockWebPushSend).toHaveBeenCalledTimes(3);
});
});
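
The `sendNotification` tests above encode a pruning policy: a subscription is deleted when the push service answers 404 or 410, and kept for any other failure such as a 500. A small sketch of that decision follows, kept synchronous so it is easy to check in isolation. The `SendOutcome` shape and both function names are hypothetical; only the status-code rule comes from the tests.

```typescript
// Hypothetical record of one delivery attempt.
interface SendOutcome {
  endpoint: string;
  ok: boolean;
  statusCode?: number;
}

// 404 and 410 mean the subscription no longer exists on the push service.
function isExpiredStatus(statusCode: number | undefined): boolean {
  return statusCode === 404 || statusCode === 410;
}

// Pick the endpoints that should be deleted after a batch of sends.
// Other failures (for example 500) are treated as transient and kept.
function endpointsToPrune(outcomes: SendOutcome[]): string[] {
  return outcomes
    .filter(function (o) {
      return !o.ok && isExpiredStatus(o.statusCode);
    })
    .map(function (o) {
      return o.endpoint;
    });
}
```

Collecting outcomes first and pruning afterwards also matches the last test in the file: one failed send does not stop delivery to the remaining subscriptions.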

src/tests/webhook.test.ts Normal file

@@ -0,0 +1,300 @@
import { describe, it, expect, vi, beforeEach } from 'vitest';
import type { Segment, WhisperJob } from '$lib/types.js';
// ── Hoist all mock functions so they're available inside vi.mock() factories ──
const {
mockGetJob,
mockUpdateJob,
mockSetJobStatus,
mockDeduplicateSegments,
mockWriteOutputs,
mockSendNotification,
mockCleanupJobTmp,
mockEmitProgress
} = vi.hoisted(() => ({
mockGetJob: vi.fn(),
mockUpdateJob: vi.fn(),
mockSetJobStatus: vi.fn(),
mockDeduplicateSegments: vi.fn((segs: Segment[]) => segs),
mockWriteOutputs: vi.fn(),
mockSendNotification: vi.fn(),
mockCleanupJobTmp: vi.fn(),
mockEmitProgress: vi.fn()
}));
vi.mock('$lib/server/db.js', () => ({
getJob: mockGetJob,
updateJob: mockUpdateJob,
setJobStatus: mockSetJobStatus
}));
vi.mock('$lib/server/postprocess.js', () => ({
deduplicateSegments: mockDeduplicateSegments
}));
vi.mock('$lib/server/formatter.js', () => ({
writeOutputs: mockWriteOutputs
}));
vi.mock('$lib/server/push.js', () => ({
sendNotification: mockSendNotification
}));
vi.mock('$lib/server/downloader.js', () => ({
cleanupJobTmp: mockCleanupJobTmp
}));
vi.mock('$lib/server/pipeline.js', () => ({
emitProgress: mockEmitProgress
}));
// Import the handler AFTER mocks are in place
import { POST } from '$lib/../routes/api/webhook/[jobId]/+server.js';
// ── Test helpers ──────────────────────────────────────────────────────────────
function makeEvent(jobId: string, body: unknown) {
return {
params: { jobId },
request: new Request(`http://localhost/api/webhook/${jobId}`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(body)
})
};
}
function makeJob(id: string, title = 'Test Video') {
return { id, status: 'transcribing', title, source: 'url', audioMode: 'auto', progress: 10 };
}
function makeWhisperJob(overrides: Partial<WhisperJob> = {}): WhisperJob {
return {
id: 'whisper-id',
status: 'done',
language: 'en',
task: 'transcribe',
duration_secs: 60,
progress: 100,
segments: [],
error: null,
created_at: new Date().toISOString(),
completed_at: new Date().toISOString(),
...overrides
};
}
function makeSeg(index: number, text: string): Segment {
return { index, start: index * 5, end: index * 5 + 5, text, words: [] };
}
beforeEach(() => {
vi.clearAllMocks();
mockDeduplicateSegments.mockImplementation((segs: Segment[]) => segs);
mockWriteOutputs.mockResolvedValue({
srt: '/out/dir/title.srt',
txt: '/out/dir/title.txt',
md: '/out/dir/title.md',
json: '/out/dir/title.json'
});
mockSendNotification.mockResolvedValue(undefined);
mockCleanupJobTmp.mockResolvedValue(undefined);
});
// ── 404 for unknown job ───────────────────────────────────────────────────────
describe('POST /api/webhook/[jobId] — job not found', () => {
it('throws 404 when the job does not exist in the database', async () => {
mockGetJob.mockReturnValue(null);
await expect(POST(makeEvent('ghost-id', makeWhisperJob()) as any)).rejects.toMatchObject({
status: 404
});
});
});
// ── Whisper job failed / cancelled ───────────────────────────────────────────
describe('POST /api/webhook/[jobId] — whisper failure', () => {
it('marks the job as failed when whisper status is "failed"', async () => {
mockGetJob.mockReturnValue(makeJob('job-1'));
const payload = makeWhisperJob({ status: 'failed', error: 'GPU OOM', segments: [] });
const res = await POST(makeEvent('job-1', payload) as any);
expect(res.status).toBe(200);
expect(mockUpdateJob).toHaveBeenCalledWith(
expect.objectContaining({ id: 'job-1', status: 'failed', error: 'GPU OOM' })
);
expect(mockWriteOutputs).not.toHaveBeenCalled();
expect(mockSendNotification).not.toHaveBeenCalled();
});
it('marks the job as failed when whisper status is "cancelled"', async () => {
mockGetJob.mockReturnValue(makeJob('job-2'));
const payload = makeWhisperJob({ status: 'cancelled', segments: [] });
await POST(makeEvent('job-2', payload) as any);
expect(mockUpdateJob).toHaveBeenCalledWith(
expect.objectContaining({ status: 'failed' })
);
});
it('uses the whisper error message when provided', async () => {
mockGetJob.mockReturnValue(makeJob('job-err'));
const payload = makeWhisperJob({ status: 'failed', error: 'CUDA device error', segments: [] });
await POST(makeEvent('job-err', payload) as any);
expect(mockUpdateJob).toHaveBeenCalledWith(
expect.objectContaining({ error: 'CUDA device error' })
);
});
it('emits an error progress event', async () => {
mockGetJob.mockReturnValue(makeJob('job-ev'));
await POST(makeEvent('job-ev', makeWhisperJob({ status: 'failed', segments: [] })) as any);
expect(mockEmitProgress).toHaveBeenCalledWith('job-ev', expect.objectContaining({ type: 'error' }));
});
});
// ── Successful transcription with segments ───────────────────────────────────
describe('POST /api/webhook/[jobId] — success with segments', () => {
const segments = [makeSeg(0, 'Hello world.'), makeSeg(1, 'This is a test.')];
it('runs deduplication on received segments', async () => {
mockGetJob.mockReturnValue(makeJob('job-3'));
await POST(makeEvent('job-3', makeWhisperJob({ segments })) as any);
expect(mockDeduplicateSegments).toHaveBeenCalledWith(segments);
});
it('calls writeOutputs with the deduplicated segments and job title', async () => {
mockGetJob.mockReturnValue(makeJob('job-4', 'My Lecture'));
const deduped = [makeSeg(0, 'Hello world.')];
mockDeduplicateSegments.mockReturnValue(deduped);
await POST(makeEvent('job-4', makeWhisperJob({ segments })) as any);
expect(mockWriteOutputs).toHaveBeenCalledWith(deduped, 'My Lecture', 'job-4');
});
it('stores serialised segments_json in the database', async () => {
mockGetJob.mockReturnValue(makeJob('job-5'));
const deduped = [makeSeg(0, 'Result text.')];
mockDeduplicateSegments.mockReturnValue(deduped);
await POST(makeEvent('job-5', makeWhisperJob({ segments })) as any);
expect(mockUpdateJob).toHaveBeenCalledWith(
expect.objectContaining({
id: 'job-5',
status: 'done',
segmentsJson: JSON.stringify(deduped)
})
);
});
it('sets job status to done with progress 100', async () => {
mockGetJob.mockReturnValue(makeJob('job-6'));
await POST(makeEvent('job-6', makeWhisperJob({ segments })) as any);
expect(mockUpdateJob).toHaveBeenCalledWith(
expect.objectContaining({ status: 'done', progress: 100 })
);
});
it('sets outputDir from the paths returned by writeOutputs', async () => {
mockGetJob.mockReturnValue(makeJob('job-7'));
mockWriteOutputs.mockResolvedValue({
srt: '/home/user/transcripts/My_Title/My_Title.srt',
txt: '/home/user/transcripts/My_Title/My_Title.txt',
md: '/home/user/transcripts/My_Title/My_Title.md',
json: '/home/user/transcripts/My_Title/My_Title.json'
});
await POST(makeEvent('job-7', makeWhisperJob({ segments })) as any);
expect(mockUpdateJob).toHaveBeenCalledWith(
expect.objectContaining({ outputDir: '/home/user/transcripts/My_Title' })
);
});
it('sends a push notification with the job title', async () => {
mockGetJob.mockReturnValue(makeJob('job-8', 'Awesome Lecture'));
await POST(makeEvent('job-8', makeWhisperJob({ segments })) as any);
expect(mockSendNotification).toHaveBeenCalledWith(
'job-8',
'✅ Transcript ready',
'Awesome Lecture'
);
});
it('cleans up tmp files after completion', async () => {
mockGetJob.mockReturnValue(makeJob('job-9'));
await POST(makeEvent('job-9', makeWhisperJob({ segments })) as any);
expect(mockCleanupJobTmp).toHaveBeenCalledWith('job-9');
});
it('emits a done progress event', async () => {
mockGetJob.mockReturnValue(makeJob('job-10'));
await POST(makeEvent('job-10', makeWhisperJob({ segments })) as any);
expect(mockEmitProgress).toHaveBeenCalledWith('job-10', expect.objectContaining({ type: 'done' }));
});
it('returns { ok: true } with status 200', async () => {
mockGetJob.mockReturnValue(makeJob('job-11'));
const res = await POST(makeEvent('job-11', makeWhisperJob({ segments })) as any);
expect(res.status).toBe(200);
const body = await res.json();
expect(body).toEqual({ ok: true });
});
});
// ── Empty segments (whisper produced no output) ───────────────────────────────
describe('POST /api/webhook/[jobId] — empty segments', () => {
it('completes the job as done even with zero segments', async () => {
mockGetJob.mockReturnValue(makeJob('job-empty'));
await POST(makeEvent('job-empty', makeWhisperJob({ segments: [] })) as any);
expect(mockUpdateJob).toHaveBeenCalledWith(
expect.objectContaining({ status: 'done' })
);
});
it('writes empty outputs for an empty segment array', async () => {
mockGetJob.mockReturnValue(makeJob('job-empty-2'));
await POST(makeEvent('job-empty-2', makeWhisperJob({ segments: [] })) as any);
expect(mockWriteOutputs).toHaveBeenCalledWith([], expect.any(String), 'job-empty-2');
});
it('still sends a push notification for empty transcription', async () => {
mockGetJob.mockReturnValue(makeJob('job-empty-3'));
await POST(makeEvent('job-empty-3', makeWhisperJob({ segments: [] })) as any);
expect(mockSendNotification).toHaveBeenCalled();
});
});
// ── Internal error handling ───────────────────────────────────────────────────
describe('POST /api/webhook/[jobId] — internal errors', () => {
it('returns 500 and marks job failed when writeOutputs throws', async () => {
mockGetJob.mockReturnValue(makeJob('job-err-2'));
mockWriteOutputs.mockRejectedValue(new Error('disk full'));
const segments = [makeSeg(0, 'Hello.')];
const res = await POST(makeEvent('job-err-2', makeWhisperJob({ segments })) as any);
expect(res.status).toBe(500);
expect(mockUpdateJob).toHaveBeenCalledWith(
expect.objectContaining({ status: 'failed', error: 'disk full' })
);
});
it('emits an error progress event on internal failure', async () => {
mockGetJob.mockReturnValue(makeJob('job-err-3'));
mockWriteOutputs.mockRejectedValue(new Error('oops'));
const segments = [makeSeg(0, 'Hello.')];
await POST(makeEvent('job-err-3', makeWhisperJob({ segments })) as any);
expect(mockEmitProgress).toHaveBeenCalledWith(
'job-err-3',
expect.objectContaining({ type: 'error' })
);
});
});
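
Two small decisions recur in the webhook tests above: a whisper status of `failed` or `cancelled` maps to a failed job, and `outputDir` is the parent directory of the paths returned by `writeOutputs`. The sketch below isolates both. Function names are hypothetical, the fallback error text is an assumption (the tests only pin the case where whisper supplies a message), and the handler in `+server.js` is not shown in this diff.

```typescript
interface SketchPayload {
  status: string;
  error: string | null;
}

interface SketchOutcome {
  status: 'done' | 'failed';
  error?: string;
}

// Map the whisper payload to the job outcome.
// Assumption: the webhook only fires on terminal states, so anything
// other than failed/cancelled is treated as a completed transcription.
function sketchResolveOutcome(payload: SketchPayload): SketchOutcome {
  if (payload.status === 'failed' || payload.status === 'cancelled') {
    // Assumed fallback message; not taken from the source.
    const fallback = 'Whisper job ' + payload.status;
    return { status: 'failed', error: payload.error !== null ? payload.error : fallback };
  }
  return { status: 'done' };
}

// Parent directory of a POSIX path, using plain string handling.
function sketchOutputDir(filePath: string): string {
  const i = filePath.lastIndexOf('/');
  if (i === -1) return '.'; // no directory component
  if (i === 0) return '/'; // file directly under the root
  return filePath.slice(0, i);
}
```

In a Node handler, `path.dirname` would be the more usual choice for the second helper; the string version is used here only to keep the sketch free of imports.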

src/tests/whisper.test.ts Normal file

@@ -0,0 +1,219 @@
import { describe, it, expect, vi, afterEach } from 'vitest';
import { Readable } from 'stream';
// ── Hoist mocks so they're available inside vi.mock() factories ───────────────
const mocks = vi.hoisted(() => ({
fetch: vi.fn(),
append: vi.fn(),
getHeaders: vi.fn(() => ({ 'content-type': 'multipart/form-data; boundary=test' }))
}));
vi.mock('node-fetch', () => ({ default: mocks.fetch }));
// FormData must be a proper constructor (regular function, not arrow function)
vi.mock('form-data', () => ({
default: vi.fn(function (this: Record<string, unknown>) {
this.append = mocks.append;
this.getHeaders = mocks.getHeaders;
})
}));
vi.mock('fs', () => ({ createReadStream: vi.fn(() => 'STREAM_PLACEHOLDER') }));
import { submitJob, streamJob } from '$lib/server/whisper.js';
afterEach(() => vi.clearAllMocks());
// ── submitJob ─────────────────────────────────────────────────────────────────
describe('submitJob', () => {
it('POSTs to /jobs and returns job_id', async () => {
mocks.fetch.mockResolvedValue({
ok: true,
json: () => Promise.resolve({ job_id: 'whisper-job-abc' })
});
const id = await submitJob('/tmp/audio.wav', 'http://host/api/webhook/job-1');
expect(id).toBe('whisper-job-abc');
});
it('sends a POST request to the configured WHISPER_URL/jobs', async () => {
vi.stubEnv('WHISPER_URL', 'http://localhost:8091');
try {
mocks.fetch.mockResolvedValue({
ok: true,
json: () => Promise.resolve({ job_id: 'x' })
});
await submitJob('/tmp/audio.wav', 'http://host/api/webhook/job-1');
expect(mocks.fetch).toHaveBeenCalledWith(
'http://localhost:8091/jobs',
expect.objectContaining({ method: 'POST' })
);
} finally {
// Restore the environment even if an assertion above fails
vi.unstubAllEnvs();
}
});
it('includes task=transcribe in the form', async () => {
mocks.fetch.mockResolvedValue({
ok: true,
json: () => Promise.resolve({ job_id: 'x' })
});
await submitJob('/tmp/audio.wav', 'http://host/webhook');
expect(mocks.append).toHaveBeenCalledWith('task', 'transcribe');
});
it('includes webhook_url in the form', async () => {
mocks.fetch.mockResolvedValue({
ok: true,
json: () => Promise.resolve({ job_id: 'x' })
});
await submitJob('/tmp/audio.wav', 'http://192.168.1.10:3000/api/webhook/job-99');
expect(mocks.append).toHaveBeenCalledWith(
'webhook_url',
'http://192.168.1.10:3000/api/webhook/job-99'
);
});
it('includes language when provided', async () => {
mocks.fetch.mockResolvedValue({
ok: true,
json: () => Promise.resolve({ job_id: 'x' })
});
await submitJob('/tmp/audio.wav', 'http://host/webhook', 'en');
expect(mocks.append).toHaveBeenCalledWith('language', 'en');
});
it('omits language field when not provided', async () => {
mocks.fetch.mockResolvedValue({
ok: true,
json: () => Promise.resolve({ job_id: 'x' })
});
await submitJob('/tmp/audio.wav', 'http://host/webhook');
const languageCalls = mocks.append.mock.calls.filter(([name]: string[]) => name === 'language');
expect(languageCalls).toHaveLength(0);
});
it('throws when the server returns a non-2xx status', async () => {
mocks.fetch.mockResolvedValue({
ok: false,
status: 500,
text: () => Promise.resolve('Internal Server Error')
});
await expect(submitJob('/tmp/audio.wav', 'http://host/webhook')).rejects.toThrow('500');
});
});
// ── streamJob SSE parsing ─────────────────────────────────────────────────────
function makeSSEResponse(lines: string[]) {
  const body = Readable.from(lines.map((l) => l + '\n'));
  return { ok: true, body };
}
describe('streamJob — SSE event parsing', () => {
  it('calls onProgress for progress events with percent, chunk, total', async () => {
    const onProgress = vi.fn();
    const onDone = vi.fn();
    const onError = vi.fn();
    mocks.fetch.mockResolvedValue(
      makeSSEResponse([
        'data: {"type":"progress","percent":42,"chunk":1,"total":3}',
        '',
        'data: {"type":"done","job":{}}',
        ''
      ])
    );
    await streamJob('whisper-id', onProgress, onDone, onError);
    expect(onProgress).toHaveBeenCalledWith(42, 1, 3);
  });
  it('calls onDone when a done event is received and stops reading', async () => {
    const onProgress = vi.fn();
    const onDone = vi.fn();
    const onError = vi.fn();
    mocks.fetch.mockResolvedValue(
      makeSSEResponse([
        'data: {"type":"done","job":{}}',
        '',
        // Lines after done should not trigger more callbacks
        'data: {"type":"progress","percent":99,"chunk":3,"total":3}',
        ''
      ])
    );
    await streamJob('whisper-id', onProgress, onDone, onError);
    expect(onDone).toHaveBeenCalledOnce();
    expect(onProgress).not.toHaveBeenCalled();
  });
  it('calls onError for error events', async () => {
    const onProgress = vi.fn();
    const onDone = vi.fn();
    const onError = vi.fn();
    mocks.fetch.mockResolvedValue(
      makeSSEResponse([
        'data: {"type":"error","message":"model crashed"}',
        ''
      ])
    );
    await streamJob('whisper-id', onProgress, onDone, onError);
    expect(onError).toHaveBeenCalledWith('model crashed');
    expect(onDone).not.toHaveBeenCalled();
  });
  it('ignores malformed JSON data lines without throwing', async () => {
    const onProgress = vi.fn();
    const onDone = vi.fn();
    const onError = vi.fn();
    mocks.fetch.mockResolvedValue(
      makeSSEResponse([
        'data: not-valid-json',
        '',
        'data: {"type":"done","job":{}}',
        ''
      ])
    );
    await expect(streamJob('whisper-id', onProgress, onDone, onError)).resolves.not.toThrow();
    expect(onDone).toHaveBeenCalled();
  });
  it('handles multiple progress events in sequence', async () => {
    const onProgress = vi.fn();
    const onDone = vi.fn();
    const onError = vi.fn();
    mocks.fetch.mockResolvedValue(
      makeSSEResponse([
        'data: {"type":"progress","percent":25,"chunk":1,"total":4}',
        '',
        'data: {"type":"progress","percent":50,"chunk":2,"total":4}',
        '',
        'data: {"type":"progress","percent":75,"chunk":3,"total":4}',
        '',
        'data: {"type":"done","job":{}}',
        ''
      ])
    );
    await streamJob('whisper-id', onProgress, onDone, onError);
    expect(onProgress).toHaveBeenCalledTimes(3);
    expect(onProgress).toHaveBeenNthCalledWith(1, 25, 1, 4);
    expect(onProgress).toHaveBeenNthCalledWith(3, 75, 3, 4);
  });
  it('defaults chunk and total to 0 when missing from progress event', async () => {
    const onProgress = vi.fn();
    const onDone = vi.fn();
    mocks.fetch.mockResolvedValue(
      makeSSEResponse([
        'data: {"type":"progress","percent":60}',
        '',
        'data: {"type":"done","job":{}}',
        ''
      ])
    );
    await streamJob('whisper-id', onProgress, onDone, vi.fn());
    expect(onProgress).toHaveBeenCalledWith(60, 0, 0);
  });
});
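The streamJob tests above describe a small line-oriented protocol: each `data: ` line carries a JSON event, malformed lines are skipped, and a done event ends processing. The sketch below shows one way such a dispatcher could look. It is not the project's streamJob: the name dispatchSSELines and the handler shape are hypothetical, it works on already-split lines rather than a stream, and stopping after an error event as well as defaulting a missing percent to 0 are assumptions the tests do not cover.

```typescript
// Hypothetical dispatcher mirroring the behaviour the tests above expect.
type SSEHandlers = {
  onProgress: (percent: number, chunk: number, total: number) => void;
  onDone: (job: unknown) => void;
  onError: (message: string) => void;
};

const DATA_PREFIX = 'data: ';

// Feeds already-split SSE lines to the handlers.
// Returns true once a terminal event (done or error) has been dispatched.
function dispatchSSELines(lines: string[], handlers: SSEHandlers): boolean {
  for (const line of lines) {
    // Blank separator lines and non-data fields are skipped.
    if (!line.startsWith(DATA_PREFIX)) continue;

    let parsed: unknown;
    try {
      parsed = JSON.parse(line.slice(DATA_PREFIX.length));
    } catch {
      continue; // malformed JSON is ignored rather than thrown
    }
    if (typeof parsed !== 'object' || parsed === null) continue;
    const event = parsed as Record<string, unknown>;

    if (event.type === 'progress') {
      // Missing numeric fields default to 0.
      const num = (v: unknown): number => (typeof v === 'number' ? v : 0);
      handlers.onProgress(num(event.percent), num(event.chunk), num(event.total));
    } else if (event.type === 'done') {
      handlers.onDone(event.job);
      return true; // later lines must not trigger more callbacks
    } else if (event.type === 'error') {
      handlers.onError(typeof event.message === 'string' ? event.message : 'unknown error');
      return true; // assumption: an error is also terminal
    }
  }
  return false;
}
```

A streaming reader would call this once per decoded batch of lines and stop reading as soon as it returns true.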

19
static/favicon.svg Normal file
View File

@@ -0,0 +1,19 @@
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 32 32" width="32" height="32">
<defs>
<linearGradient id="bg" x1="0" y1="0" x2="1" y2="1">
<stop offset="0%" stop-color="#ffb18a"></stop>
<stop offset="55%" stop-color="#ff8a5c"></stop>
<stop offset="100%" stop-color="#e8612d"></stop>
</linearGradient>
<linearGradient id="sheen" x1="0" y1="0" x2="0" y2="1">
<stop offset="0%" stop-color="#fff" stop-opacity="0.35"></stop>
<stop offset="55%" stop-color="#fff" stop-opacity="0"></stop>
</linearGradient>
</defs>
<rect x="0" y="0" width="32" height="32" rx="7.2" ry="7.2" fill="url(#bg)"></rect>
<rect x="0" y="0" width="32" height="19.2" rx="7.2" ry="7.2" fill="url(#sheen)"></rect>
<rect x="8.8" y="8.64" width="3.52" height="14.72" rx="1.76" ry="1.76" fill="#0c0d10"></rect>
<rect x="14.24" y="5.4399999999999995" width="3.52" height="21.12" rx="1.76" ry="1.76" fill="#0c0d10"></rect>
<rect x="19.68" y="10.559999999999999" width="3.52" height="10.88" rx="1.76" ry="1.76" fill="#0c0d10"></rect>
<circle cx="21.439999999999998" cy="28.96" r="1.12" fill="#0c0d10"></circle>
</svg>

Binary file not shown.

Binary file not shown.

Binary file not shown.

BIN
static/icons/favicon-16.png Normal file

Binary file not shown.

BIN
static/icons/favicon-32.png Normal file

Binary file not shown.

BIN
static/icons/favicon-64.png Normal file

Binary file not shown.

View File

@@ -0,0 +1,19 @@
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 1024 1024" width="1024" height="1024">
<defs>
<linearGradient id="bg" x1="0" y1="0" x2="1" y2="1">
<stop offset="0%" stop-color="#ffb18a"></stop>
<stop offset="55%" stop-color="#ff8a5c"></stop>
<stop offset="100%" stop-color="#e8612d"></stop>
</linearGradient>
<linearGradient id="sheen" x1="0" y1="0" x2="0" y2="1">
<stop offset="0%" stop-color="#fff" stop-opacity="0.35"></stop>
<stop offset="55%" stop-color="#fff" stop-opacity="0"></stop>
</linearGradient>
</defs>
<rect x="0" y="0" width="1024" height="1024" rx="230.4" ry="230.4" fill="url(#bg)"></rect>
<rect x="0" y="0" width="1024" height="614.4" rx="230.4" ry="230.4" fill="url(#sheen)"></rect>
<rect x="281.6" y="276.48" width="112.64" height="471.04" rx="56.32" ry="56.32" fill="#0c0d10"></rect>
<rect x="455.68" y="174.07999999999998" width="112.64" height="675.84" rx="56.32" ry="56.32" fill="#0c0d10"></rect>
<rect x="629.76" y="337.91999999999996" width="112.64" height="348.16" rx="56.32" ry="56.32" fill="#0c0d10"></rect>
<circle cx="686.0799999999999" cy="926.72" r="35.84" fill="#0c0d10"></circle>
</svg>

View File

@@ -0,0 +1,19 @@
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 180 180" width="180" height="180">
<defs>
<linearGradient id="bg" x1="0" y1="0" x2="1" y2="1">
<stop offset="0%" stop-color="#ffb18a"></stop>
<stop offset="55%" stop-color="#ff8a5c"></stop>
<stop offset="100%" stop-color="#e8612d"></stop>
</linearGradient>
<linearGradient id="sheen" x1="0" y1="0" x2="0" y2="1">
<stop offset="0%" stop-color="#fff" stop-opacity="0.35"></stop>
<stop offset="55%" stop-color="#fff" stop-opacity="0"></stop>
</linearGradient>
</defs>
<rect x="0" y="0" width="180" height="180" rx="40.5" ry="40.5" fill="url(#bg)"></rect>
<rect x="0" y="0" width="180" height="108" rx="40.5" ry="40.5" fill="url(#sheen)"></rect>
<rect x="49.5" y="48.6" width="19.8" height="82.8" rx="9.9" ry="9.9" fill="#0c0d10"></rect>
<rect x="80.1" y="30.599999999999994" width="19.8" height="118.80000000000001" rx="9.9" ry="9.9" fill="#0c0d10"></rect>
<rect x="110.7" y="59.4" width="19.8" height="61.2" rx="9.9" ry="9.9" fill="#0c0d10"></rect>
<circle cx="120.6" cy="162.9" r="6.300000000000001" fill="#0c0d10"></circle>
</svg>

View File

@@ -0,0 +1,19 @@
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 32 32" width="32" height="32">
<defs>
<linearGradient id="bg" x1="0" y1="0" x2="1" y2="1">
<stop offset="0%" stop-color="#ffb18a"></stop>
<stop offset="55%" stop-color="#ff8a5c"></stop>
<stop offset="100%" stop-color="#e8612d"></stop>
</linearGradient>
<linearGradient id="sheen" x1="0" y1="0" x2="0" y2="1">
<stop offset="0%" stop-color="#fff" stop-opacity="0.35"></stop>
<stop offset="55%" stop-color="#fff" stop-opacity="0"></stop>
</linearGradient>
</defs>
<rect x="0" y="0" width="32" height="32" rx="7.2" ry="7.2" fill="url(#bg)"></rect>
<rect x="0" y="0" width="32" height="19.2" rx="7.2" ry="7.2" fill="url(#sheen)"></rect>
<rect x="8.8" y="8.64" width="3.52" height="14.72" rx="1.76" ry="1.76" fill="#0c0d10"></rect>
<rect x="14.24" y="5.4399999999999995" width="3.52" height="21.12" rx="1.76" ry="1.76" fill="#0c0d10"></rect>
<rect x="19.68" y="10.559999999999999" width="3.52" height="10.88" rx="1.76" ry="1.76" fill="#0c0d10"></rect>
<circle cx="21.439999999999998" cy="28.96" r="1.12" fill="#0c0d10"></circle>
</svg>

View File

@@ -0,0 +1,19 @@
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 512 512" width="512" height="512">
<defs>
<linearGradient id="bg" x1="0" y1="0" x2="1" y2="1">
<stop offset="0%" stop-color="#ffb18a"></stop>
<stop offset="55%" stop-color="#ff8a5c"></stop>
<stop offset="100%" stop-color="#e8612d"></stop>
</linearGradient>
<linearGradient id="sheen" x1="0" y1="0" x2="0" y2="1">
<stop offset="0%" stop-color="#fff" stop-opacity="0.35"></stop>
<stop offset="55%" stop-color="#fff" stop-opacity="0"></stop>
</linearGradient>
</defs>
<rect x="0" y="0" width="512" height="512" rx="115.2" ry="115.2" fill="url(#bg)"></rect>
<rect x="0" y="0" width="512" height="307.2" rx="115.2" ry="115.2" fill="url(#sheen)"></rect>
<rect x="140.8" y="138.24" width="56.32" height="235.52" rx="28.16" ry="28.16" fill="#0c0d10"></rect>
<rect x="227.84" y="87.03999999999999" width="56.32" height="337.92" rx="28.16" ry="28.16" fill="#0c0d10"></rect>
<rect x="314.88" y="168.95999999999998" width="56.32" height="174.08" rx="28.16" ry="28.16" fill="#0c0d10"></rect>
<circle cx="343.03999999999996" cy="463.36" r="17.92" fill="#0c0d10"></circle>
</svg>

View File

@@ -0,0 +1,19 @@
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 512 512" width="512" height="512">
<defs>
<linearGradient id="bg" x1="0" y1="0" x2="1" y2="1">
<stop offset="0%" stop-color="#ffb18a"></stop>
<stop offset="55%" stop-color="#ff8a5c"></stop>
<stop offset="100%" stop-color="#e8612d"></stop>
</linearGradient>
<linearGradient id="sheen" x1="0" y1="0" x2="0" y2="1">
<stop offset="0%" stop-color="#fff" stop-opacity="0.35"></stop>
<stop offset="55%" stop-color="#fff" stop-opacity="0"></stop>
</linearGradient>
</defs>
<rect x="0" y="0" width="512" height="512" fill="#ff8a5c"></rect>
<rect x="140.8" y="138.24" width="56.32" height="235.52" rx="28.16" ry="28.16" fill="#0c0d10"></rect>
<rect x="227.84" y="87.03999999999999" width="56.32" height="337.92" rx="28.16" ry="28.16" fill="#0c0d10"></rect>
<rect x="314.88" y="168.95999999999998" width="56.32" height="174.08" rx="28.16" ry="28.16" fill="#0c0d10"></rect>
<circle cx="343.03999999999996" cy="463.36" r="17.92" fill="#0c0d10"></circle>
</svg>

44
static/manifest.json Normal file
View File

@@ -0,0 +1,44 @@
{
  "name": "Tonemark",
  "short_name": "Tonemark",
  "description": "Fast audio and video transcription powered by Whisper",
  "start_url": "/",
  "display": "standalone",
  "background_color": "#0c0d10",
  "theme_color": "#0c0d10",
  "orientation": "any",
  "share_target": {
    "action": "/share",
    "method": "POST",
    "enctype": "multipart/form-data",
    "params": {
      "title": "title",
      "text": "text",
      "url": "url",
      "files": [
        {
          "name": "file",
          "accept": ["audio/*", "video/*"]
        }
      ]
    }
  },
  "icons": [
    {
      "src": "/icons/android-icon-192.png",
      "sizes": "192x192",
      "type": "image/png"
    },
    {
      "src": "/icons/android-icon-512.png",
      "sizes": "512x512",
      "type": "image/png"
    },
    {
      "src": "/icons/tonemark-icon-android-fg.svg",
      "sizes": "any",
      "type": "image/svg+xml",
      "purpose": "maskable"
    }
  ]
}
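The share_target entry in the manifest above means a share from another app arrives as a multipart POST to /share with the fields title, text, url and one or more file parts. The sketch below shows how those fields might be read from a FormData object. The name readSharePayload and the returned shape are hypothetical and this is not the project's /share handler; it assumes a runtime with global FormData and Blob (for example Node 18 or later).

```typescript
// Hypothetical reader for the form fields declared under share_target.params.
type SharePayload = {
  title: string | null;
  text: string | null;
  url: string | null;
  files: Blob[];
};

function readSharePayload(form: FormData): SharePayload {
  // Text fields may be absent or empty depending on the sharing app.
  const asText = (name: string): string | null => {
    const value = form.get(name);
    return typeof value === 'string' && value.length > 0 ? value : null;
  };

  // Keep only file parts whose MIME type matches the manifest's accept list.
  const files: Blob[] = [];
  for (const value of form.getAll('file')) {
    if (typeof value === 'string') continue;
    if (value.type.startsWith('audio/') || value.type.startsWith('video/')) {
      files.push(value);
    }
  }

  return { title: asText('title'), text: asText('text'), url: asText('url'), files };
}
```

Re-checking the MIME type on the server side is a defensive choice here, since the accept list in the manifest is only a hint to the sharing platform.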

3
static/robots.txt Normal file
View File

@@ -0,0 +1,3 @@
# allow crawling everything by default
User-agent: *
Disallow:

13
svelte.config.js Normal file
View File

@@ -0,0 +1,13 @@
import adapter from '@sveltejs/adapter-node';

/** @type {import('@sveltejs/kit').Config} */
const config = {
  compilerOptions: {
    runes: ({ filename }) => (filename.split(/[/\\]/).includes('node_modules') ? undefined : true)
  },
  kit: {
    adapter: adapter({ out: 'build' })
  }
};

export default config;
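The runes option in the config above is a function of the file name: files under a node_modules directory get undefined, everything else gets true. The sketch below isolates that predicate so its behaviour on POSIX and Windows paths is easy to see. The name runesFor is hypothetical, and reading undefined as "leave the compiler's default behaviour in place for dependencies" is an assumption about intent, not something the config states.

```typescript
// Hypothetical standalone copy of the predicate used in svelte.config.js.
function runesFor(filename: string): true | undefined {
  // Split on both separators so Windows paths are handled too.
  const segments = filename.split(/[/\\]/);
  // Only an exact "node_modules" path segment counts as a dependency.
  return segments.includes('node_modules') ? undefined : true;
}

const examples = [
  'src/routes/+page.svelte',
  'node_modules/some-lib/Button.svelte',
  'C:\\proj\\node_modules\\some-lib\\Button.svelte',
  'src/node_modules_backup/Card.svelte'
];
for (const file of examples) {
  console.log(file, '->', runesFor(file));
}
```

Matching on whole path segments avoids false positives for directories that merely contain the string node_modules in their name.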

20
tsconfig.json Normal file
View File

@@ -0,0 +1,20 @@
{
  "extends": "./.svelte-kit/tsconfig.json",
  "compilerOptions": {
    "rewriteRelativeImportExtensions": true,
    "allowJs": true,
    "checkJs": true,
    "esModuleInterop": true,
    "forceConsistentCasingInFileNames": true,
    "resolveJsonModule": true,
    "skipLibCheck": true,
    "sourceMap": true,
    "strict": true,
    "moduleResolution": "bundler"
  }
  // Path aliases are handled by https://svelte.dev/docs/kit/configuration#alias
  // except $lib which is handled by https://svelte.dev/docs/kit/configuration#files
  //
  // To make changes to top-level options such as include and exclude, we recommend extending
  // the generated config; see https://svelte.dev/docs/kit/configuration#typescript
}

7
vite.config.ts Normal file
View File

@@ -0,0 +1,7 @@
import { sveltekit } from '@sveltejs/kit/vite';
import tailwindcss from '@tailwindcss/vite';
import { defineConfig } from 'vite';

export default defineConfig({
  plugins: [tailwindcss(), sveltekit()]
});

20
vitest.config.ts Normal file
View File

@@ -0,0 +1,20 @@
import { defineConfig } from 'vitest/config';
import path from 'path';

export default defineConfig({
  test: {
    environment: 'node',
    globals: true,
    include: ['src/tests/**/*.test.ts'],
    coverage: {
      provider: 'v8',
      reporter: ['text', 'html'],
      include: ['src/lib/server/**/*.ts', 'src/routes/api/**/*.ts']
    }
  },
  resolve: {
    alias: {
      '$lib': path.resolve('./src/lib')
    }
  }
});