Voice Relay

Voice relay lets you dictate into a Claude Code session instead of typing. It captures audio from your microphone, transcribes it locally using a Whisper model, and injects the resulting text into the attended pane of any workspace. All audio processing happens on your machine: the audio itself never leaves it, and only the transcribed text is sent to the workspace.

Voice relay works with every workspace type: local Zellij sessions, Podman containers, Compose environments, SSH remotes, and Kubernetes deployments. The command and the TUI are the same regardless of where the session runs.

Prerequisites

You need two things before voice relay can run: the whisper.cpp server binary and a Whisper model file.

Install whisper.cpp via Homebrew:

brew install whisper-cpp

This provides whisper-server, which cc-deck starts and manages automatically. You do not need to run it yourself.

Then run the setup command to download a model and verify dependencies:

cc-deck ws voice --setup

Setup checks for whisper-server in your PATH, downloads the default model (base.en, about 141 MB), and reports whether everything is ready.

If you want a different model, pass --model during setup: cc-deck ws voice --setup --model small.en. See Whisper Models for a comparison of available models.

Quick Start

The fastest path from zero to voice dictation takes three commands.

Create a workspace (if you do not already have one):

cc-deck ws new my-project --type local

In one terminal, attach to the workspace and start a Claude Code session:

cc-deck ws attach my-project

In a second terminal, start voice relay:

cc-deck ws voice my-project

The voice TUI opens with an audio level meter and a scrollable transcript. Start speaking and your words appear in the Claude Code input of the attended pane.

To submit the prompt, say "send" as a standalone word. cc-deck recognizes this as a command word and sends a newline instead of the literal text. To cycle to the next session needing attention, say "next" as a standalone word. This triggers the same tiered attend logic as Alt+a. Both command words can be configured in ~/.config/cc-deck/config.yaml.

Press q to quit the voice TUI and shut down the whisper server.

Voice Activity Detection

Voice relay uses VAD (voice activity detection) mode. The microphone listens continuously, and cc-deck detects when you start and stop speaking based on audio energy levels. Each utterance is automatically segmented, transcribed, and delivered.

The threshold setting controls how much ambient noise is tolerated before speech is detected. Lower values make detection more sensitive (quieter speech is captured), while higher values require louder input. Use the + and - keys in the TUI to adjust the threshold in real time.

Silence of 1.5 seconds ends an utterance and triggers transcription. Utterances are capped at 30 seconds, which is the maximum window Whisper can process in a single pass.
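
The segmentation rules above can be sketched in Python. This is a minimal illustration, not cc-deck's implementation: the 20 ms frame length, the RMS energy measure, and the function names rms and segment are assumptions made for the example. Only the 1.5-second silence timeout and the 30-second cap come from the text.

```python
import math
from typing import Iterable, Iterator, List

FRAME_SECONDS = 0.02           # assumed frame length (20 ms); not taken from cc-deck
SILENCE_SECONDS = 1.5          # silence that ends an utterance (from the text above)
MAX_UTTERANCE_SECONDS = 30.0   # cap per utterance (from the text above)

Frame = List[float]            # samples in the range [-1.0, 1.0]


def rms(frame: Frame) -> float:
    """Root-mean-square energy of one frame."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def segment(frames: Iterable[Frame], energy_threshold: float) -> Iterator[List[Frame]]:
    """Yield utterances (lists of frames) split on silence or on the length cap."""
    silence_limit = round(SILENCE_SECONDS / FRAME_SECONDS)
    max_frames = round(MAX_UTTERANCE_SECONDS / FRAME_SECONDS)
    utterance: List[Frame] = []
    silent_run = 0
    for frame in frames:
        loud = rms(frame) >= energy_threshold
        if not utterance:
            # Idle: wait for the first frame at or above the threshold.
            if loud:
                utterance.append(frame)
            continue
        utterance.append(frame)
        silent_run = 0 if loud else silent_run + 1
        if silent_run >= silence_limit:
            # Enough silence: emit the utterance without the trailing silence.
            yield utterance[:-silent_run]
            utterance, silent_run = [], 0
        elif len(utterance) >= max_frames:
            # Length cap reached: emit what has been collected so far.
            yield utterance
            utterance, silent_run = [], 0
    if utterance:
        # Input ended mid-utterance: flush it, again without trailing silence.
        yield utterance[: len(utterance) - silent_run]
```

With 20 ms frames, 75 consecutive quiet frames end an utterance, and an utterance is cut after 1500 frames even if speech continues.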

Sidebar Integration

When voice relay is running, a ♫ indicator appears in the sidebar header.

Bright green ♫: Voice relay is connected and listening.
Dim ♫: Voice relay is connected but muted.
No ♫: Voice relay is not connected.

Mute Toggle

You can mute and unmute voice relay from multiple locations:

From the sidebar (any pane):

Alt+m: Toggle mute (configurable via voice_key in plugin config).
m in navigation mode: Toggle mute while navigating sessions.
Click the ♫ indicator: Toggle mute with the mouse.

From the voice TUI:

m: Toggle mute.

Mute state synchronizes bidirectionally between the sidebar and voice TUI. When you mute from the sidebar, the TUI reflects the change within about one second. When you mute from the TUI, the sidebar updates immediately.

Whisper Models

cc-deck supports four Whisper model sizes. The three .en models are English-only; medium is multilingual.

Model	Size	Speed	Accuracy	Notes
`tiny.en`	74 MB	Fastest	Low	Good for testing. Not recommended for real dictation.
`base.en`	141 MB	Fast	Decent	The default. Reasonable accuracy for clear speech in a quiet room.
`small.en`	465 MB	Moderate	Good	Best tradeoff on Apple Silicon. Noticeably better with technical terms.
`medium`	1.5 GB	Slower	High	Multilingual support. Most accurate, but higher latency per utterance.

On an Apple Silicon Mac, small.en is the recommended upgrade from the default. It runs comfortably in real time and handles code-related vocabulary (function names, library names, technical jargon) much better than base.en.

To switch models, download the new model with --setup, then pass --model when you start voice relay:

cc-deck ws voice --setup --model small.en
cc-deck ws voice my-project --model small.en

Models are cached at ~/.cache/cc-deck/models/ and only downloaded once.

The Voice TUI

When voice relay starts, it opens a terminal interface that shows the current state of the audio pipeline.

What You See

The header displays the target workspace name, the active audio device, the current mode (VAD or MUTED), and a real-time audio level meter. The level meter uses a braille-block visualization with a threshold indicator, so you can see at a glance whether your voice is loud enough to trigger detection.

The main area shows a scrollable transcript history. Each entry includes a timestamp, delivery status icon, and the transcribed text:

+ (green): Text was delivered to the attended pane.
! (red): Delivery failed.
~ (grey): Transcription is in progress.

The footer shows available keyboard shortcuts.

Keyboard Controls

Key	Action
`q`	Quit voice relay and stop the whisper server.
`m`	Toggle mute/unmute.
`+` / `-`	Raise or lower the VAD threshold by 2% per press.
`d`	Open the audio device picker.
`r`	Start, pause, or resume transcript recording.
`R` (Shift+R)	Stop transcript recording and close the file.
`PgUp` / `PgDn`	Scroll through transcript history.

Transcript Recording

Voice relay can record everything you say to a plain text file. Each transcribed utterance is written as a single line, with no timestamps or metadata.

To start recording, press r. A filename prompt appears at the bottom of the TUI with a default name based on the current timestamp (for example, transcript-2026-05-04T15-04-05.txt). Type a custom name or press Enter to accept the default. Press Escape to cancel without starting a recording.

While recording, a red REC indicator appears in the header next to the mode display. Every transcription is appended to the file as it arrives.

To pause recording, press r again. The indicator changes to a yellow pause symbol. Transcriptions continue to appear in the TUI history but are not written to the file. Press r once more to resume recording.

To stop recording and close the file, press R (Shift+R). The indicator disappears and the file is finalized. Quitting the TUI with q also closes the file properly.

Relative filenames are stored in ~/.local/share/cc-deck/transcripts/. Absolute paths are used as-is.
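
The naming and path rules above can be expressed as a short Python sketch. The function names are hypothetical and the code is not taken from cc-deck. It only mirrors the documented behavior: a timestamp-based default name, relative names stored under the transcripts directory, and absolute paths used unchanged.

```python
from datetime import datetime
from pathlib import Path

# Directory for relative transcript names (from the text above).
TRANSCRIPT_DIR = Path.home() / ".local" / "share" / "cc-deck" / "transcripts"


def default_transcript_name(now: datetime) -> str:
    """Build a default name such as transcript-2026-05-04T15-04-05.txt."""
    return now.strftime("transcript-%Y-%m-%dT%H-%M-%S.txt")


def resolve_transcript_path(name: str, base_dir: Path = TRANSCRIPT_DIR) -> Path:
    """Return absolute paths unchanged; place relative names under base_dir."""
    path = Path(name)
    return path if path.is_absolute() else base_dir / path
```

For example, resolve_transcript_path("standup.txt") points into the transcripts directory, while resolve_transcript_path("/tmp/standup.txt") is returned as given.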

Recording While Muted

When the relay is muted and a recording is active, voice relay continues to transcribe speech and write it to the transcript file. The transcribed text is not sent to the attended pane. This lets you capture notes or commentary that you do not want delivered to the active session.

When the relay is muted and no recording is active, audio is discarded without transcription.
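
The combinations of mute and recording state described in this section can be summarized as a small decision function. This is an illustrative sketch with hypothetical names (Routing, route_utterance), not cc-deck code. How a paused recording interacts with mute is not described here, so the sketch does not model it: recording means a recording that is actively writing.

```python
from typing import NamedTuple


class Routing(NamedTuple):
    transcribe: bool     # run the audio through Whisper
    deliver: bool        # send the text to the attended pane
    write_to_file: bool  # append the text to the transcript file


def route_utterance(muted: bool, recording: bool) -> Routing:
    """Decide what happens to one utterance for a given mute/recording state."""
    return Routing(
        transcribe=(not muted) or recording,  # muted audio is only transcribed for the file
        deliver=not muted,                    # muted text never reaches the pane
        write_to_file=recording,              # the file receives text whenever recording
    )
```

The muted-and-recording case is the one that enables private note taking: route_utterance(muted=True, recording=True) transcribes and writes to the file but does not deliver.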

Configuration

Config File

The VAD threshold can be set in ~/.config/cc-deck/config.yaml so you do not have to pass --threshold every time:

defaults:
  voice:
    threshold: 45

The threshold uses a logarithmic scale from 0 to 100. A value around 30 to 50 works well for most quiet indoor environments.
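
To give a feel for what a logarithmic 0 to 100 scale means, the sketch below maps a threshold value to a linear amplitude through decibels. The mapping cc-deck actually uses is not specified here. The -60 dBFS floor and the function name threshold_to_amplitude are assumptions chosen purely for illustration.

```python
ASSUMED_FLOOR_DB = -60.0  # hypothetical level for threshold 0; not a cc-deck value


def threshold_to_amplitude(threshold: float, floor_db: float = ASSUMED_FLOOR_DB) -> float:
    """Map a 0-100 threshold to a linear amplitude in (0, 1] on a decibel scale."""
    if not 0 <= threshold <= 100:
        raise ValueError("threshold must be between 0 and 100")
    # Interpolate linearly in decibels: 0 maps to floor_db, 100 maps to 0 dBFS.
    level_db = floor_db * (1 - threshold / 100)
    # Convert decibels to a linear amplitude ratio.
    return 10 ** (level_db / 20)
```

Under these assumptions, each step on the scale multiplies the required amplitude by a constant factor, so small changes at the low end of the scale correspond to very small changes in absolute loudness.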

CLI Flags

All voice settings can be overridden per invocation:

Flag	Default	Description
`--model`	`base.en`	Whisper model name.
`--port`	`8234`	Port for the local whisper-server.
`--threshold`	(from config)	VAD threshold (0-100, logarithmic). Overrides the config file value.
`--verbose`	`false`	Write detailed diagnostic logs to `~/.local/state/cc-deck/voice.log`.

Command Words

Voice relay recognizes two default command words:

Word	Action	Description
`send`	`submit`	Submits the current prompt (sends a newline to the attended pane)
`next`	`attend`	Cycles to the next session needing attention (same as `Alt+a`)

A command word only fires when the entire utterance, after stripping filler words, equals the trigger word. Saying "the next step is to refactor" is treated as normal dictation because "next" is not standalone.

You can configure additional trigger words or define new actions in the config file:

defaults:
  voice:
    commands:
      submit:
        - send
        - done
        - enter
      attend:
        - next
        - switch

Each key under commands is an action name, and the list contains trigger words for that action. When you configure an action, your word list replaces the default words for that action, so include the default word in your list if you want to keep it. Actions you do not configure keep their defaults. Filler words (such as um, uh, hmm) are stripped before matching, so "um, send" still triggers the command.
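
The matching and override rules can be sketched as follows. This is a minimal illustration rather than cc-deck's implementation. The function names, the regular expression used to drop punctuation, and the case-insensitive comparison are assumptions. The filler list is the one given under Troubleshooting.

```python
import re
from typing import Dict, List, Optional

FILLERS = {"um", "uh", "hmm", "ah", "er"}
DEFAULT_COMMANDS: Dict[str, List[str]] = {"submit": ["send"], "attend": ["next"]}


def merge_commands(configured: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Configured actions replace their default word list; others keep defaults."""
    merged = {action: list(words) for action, words in DEFAULT_COMMANDS.items()}
    for action, words in configured.items():
        merged[action] = list(words)
    return merged


def match_command(utterance: str, commands: Dict[str, List[str]]) -> Optional[str]:
    """Return the action name if the whole utterance is a trigger, else None."""
    # Lowercase, drop punctuation, and strip filler words before comparing.
    words = [w for w in re.findall(r"[a-z']+", utterance.lower()) if w not in FILLERS]
    phrase = " ".join(words)
    for action, triggers in commands.items():
        if phrase in (trigger.lower() for trigger in triggers):
            return action
    return None
```

With the defaults, match_command("Um, send.", DEFAULT_COMMANDS) returns "submit", while "the next step is to refactor" returns None because the remaining phrase is longer than the trigger. After merge_commands({"submit": ["done"]}), saying "send" no longer submits, because the configured list replaced the default.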

Diagnostic Logging

Pass --verbose to enable per-utterance latency logging and backend health information. Logs are written to ~/.local/state/cc-deck/voice.log, not to the terminal. This is useful when debugging transcription accuracy or delivery failures.

Troubleshooting

whisper-server not found: Install whisper.cpp: brew install whisper-cpp. Then run cc-deck ws voice --setup to verify the installation.
Timed out waiting for whisper-server: The server takes a few seconds to load the model into memory. Larger models (especially medium at 1.5 GB) need more time. If this persists, check whether another process is using port 8234 and try a different port with --port.
Transcription misses quiet speech: Lower the VAD threshold with the - key in the TUI, or set a lower value in the config file. A threshold of 20 to 30 captures softer speech at the cost of picking up more background noise.
Filler words appear in output: cc-deck strips common fillers (um, uh, hmm, ah, er) automatically. If you see them, the issue may be that the filler was part of a longer phrase rather than standalone.
Text does not appear in Claude Code: Verify that the workspace has an active Zellij session and that a pane is attended. Check the transcript history in the TUI for delivery errors (the ! indicator).
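
For the port conflict mentioned above, a quick way to check whether something is already listening on the whisper-server port is a local TCP connection attempt. The helper below is a generic sketch with a hypothetical name (port_in_use) and is not part of cc-deck. It assumes the server listens on 127.0.0.1.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if a process accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    if port_in_use(8234):
        print("Port 8234 is busy; try starting voice relay with a different --port.")
    else:
        print("Port 8234 is free.")
```

Run the check while voice relay is stopped. If the port is busy, another process holds it and a different --port value avoids the conflict.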

Next Steps

The Sidebar for understanding session states and the attended pane
Smart Attend for how cc-deck picks which pane receives voice input
Configuration for custom keybindings and layout variants
CLI Reference for the complete ws voice flag reference