Voice Relay
Voice relay lets you dictate into a Claude Code session instead of typing. It captures audio from your microphone, transcribes it locally using a Whisper model, and injects the resulting text into the attended pane of any workspace. All audio processing happens on your machine: no audio leaves it, and only the transcribed text is sent to the workspace.
Voice relay works with every workspace type: local Zellij sessions, Podman containers, Compose environments, SSH remotes, and Kubernetes deployments. The same command, the same TUI, regardless of where the session runs.
Prerequisites
You need two things before voice relay can run: the whisper.cpp server binary and a Whisper model file.
Install whisper.cpp via Homebrew:
brew install whisper-cpp
This provides whisper-server, which cc-deck starts and manages automatically.
You do not need to run it yourself.
Then run the setup command to download a model and verify dependencies:
cc-deck ws voice --setup
Setup checks for whisper-server in your PATH, downloads the default model (base.en, about 141 MB), and reports whether everything is ready.
If you want a different model, pass --model during setup: cc-deck ws voice --setup --model small.en.
See Whisper Models for a comparison of available models.
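The two prerequisites can also be checked by hand. The following is a minimal sketch, not cc-deck's actual setup logic: it looks for whisper-server in your PATH and lists whatever files sit in the model cache directory (~/.cache/cc-deck/models/). The function names are hypothetical, and the sketch makes no assumption about how model files are named.

```python
import shutil
from pathlib import Path
from typing import List, Optional

# Cache location taken from this page; everything else here is illustrative.
DEFAULT_CACHE = Path.home() / ".cache" / "cc-deck" / "models"


def find_whisper_server() -> Optional[str]:
    """Return the full path of whisper-server if it is on PATH, else None."""
    return shutil.which("whisper-server")


def cached_models(cache_dir: Path = DEFAULT_CACHE) -> List[str]:
    """List the file names in the model cache (empty if the directory is missing)."""
    if not cache_dir.is_dir():
        return []
    return sorted(p.name for p in cache_dir.iterdir() if p.is_file())


if __name__ == "__main__":
    print("whisper-server:", find_whisper_server() or "not found in PATH")
    print("cached models:", cached_models() or "none")
```

If the first line reports "not found in PATH", install whisper.cpp before running setup again.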
Quick Start
The fastest path from zero to voice dictation takes three commands.
Create a workspace (if you do not already have one):
cc-deck ws new my-project --type local
In one terminal, attach to the workspace and start a Claude Code session:
cc-deck ws attach my-project
In a second terminal, start voice relay:
cc-deck ws voice my-project
The voice TUI opens with an audio level meter and a scrollable transcript. Start speaking and your words appear in the Claude Code input of the attended pane.
To submit the prompt, say "send" as a standalone word.
cc-deck recognizes this as a command word and sends a newline instead of the literal text.
To cycle to the next session needing attention, say "next" as a standalone word.
This triggers the same tiered attend logic as Alt+a.
Both command words can be configured in ~/.config/cc-deck/config.yaml.
Press q to quit the voice TUI and shut down the whisper server.
Voice Activity Detection
Voice relay uses VAD (voice activity detection) mode. The microphone listens continuously, and cc-deck detects when you start and stop speaking based on audio energy levels. Each utterance is automatically segmented, transcribed, and delivered.
The threshold setting controls how much ambient noise is tolerated before speech is detected.
Lower values make detection more sensitive (quieter speech is captured), while higher values require louder input.
Use the + and - keys in the TUI to adjust the threshold in real time.
Silence of 1.5 seconds ends an utterance and triggers transcription. Utterances are capped at 30 seconds, which is the maximum window Whisper can process in a single pass.
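The segmentation rules above (start on sound above the threshold, end after 1.5 seconds of silence, cap at 30 seconds) can be sketched as follows. This is an illustration of energy-based VAD in general, not cc-deck's implementation: the per-frame level input, the frame length, and the function name are assumptions.

```python
from typing import List, Sequence, Tuple


def segment_utterances(
    levels: Sequence[float],
    threshold: float,
    frame_s: float = 0.1,    # assumed frame length in seconds
    silence_s: float = 1.5,  # silence that ends an utterance
    max_s: float = 30.0,     # hard cap per utterance
) -> List[Tuple[int, int]]:
    """Return (start, end) frame ranges, end exclusive, one per utterance."""
    silence_frames = round(silence_s / frame_s)
    max_frames = round(max_s / frame_s)
    segments: List[Tuple[int, int]] = []
    start = None  # first frame of the utterance currently being recorded
    quiet = 0     # consecutive frames below the threshold
    for i, level in enumerate(levels):
        if start is None:
            if level >= threshold:
                start, quiet = i, 0  # speech begins
            continue
        quiet = 0 if level >= threshold else quiet + 1
        if quiet >= silence_frames:
            # Enough trailing silence: close just after the last loud frame.
            segments.append((start, i - quiet + 1))
            start = None
        elif i - start + 1 >= max_frames:
            # Length cap reached: flush now, a new utterance may start next.
            segments.append((start, i + 1))
            start = None
    if start is not None:
        # Input ended mid-utterance: drop the trailing quiet frames.
        segments.append((start, len(levels) - quiet))
    return segments
```

With half-second frames and a threshold of 45, the levels `[10, 60, 70, 10, 10, 10, 10, 80, 10]` yield two utterances: frames 1 to 2 and frame 7.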
Sidebar Integration
When voice relay is running, a ♫ indicator appears in the sidebar header.
- Bright green ♫: Voice relay is connected and listening.
- Dim ♫: Voice relay is connected but muted.
- No ♫: Voice relay is not connected.
Mute Toggle
You can mute and unmute voice relay from multiple locations:
From the sidebar (any pane):
- Alt+m: Toggle mute (configurable via voice_key in the plugin config).
- m in navigation mode: Toggle mute while navigating sessions.
- Click the ♫ indicator: Toggle mute with the mouse.
From the voice TUI:
- m: Toggle mute.
Mute state synchronizes bidirectionally between the sidebar and voice TUI. When you mute from the sidebar, the TUI reflects the change within about one second. When you mute from the TUI, the sidebar updates immediately.
Whisper Models
cc-deck supports four Whisper model sizes.
All are English-optimized except medium, which also handles other languages.
| Model | Size | Speed | Accuracy | Notes |
|---|---|---|---|---|
| tiny.en | 74 MB | Fastest | Low | Good for testing. Not recommended for real dictation. |
| base.en | 141 MB | Fast | Decent | The default. Reasonable accuracy for clear speech in a quiet room. |
| small.en | 465 MB | Moderate | Good | Best tradeoff on Apple Silicon. Noticeably better with technical terms. |
| medium | 1.5 GB | Slower | High | Multilingual support. Most accurate, but higher latency per utterance. |
On an Apple Silicon Mac, small.en is the recommended upgrade from the default.
It runs comfortably in real time and handles code-related vocabulary (function names, library names, technical jargon) much better than base.en.
To switch models:
cc-deck ws voice --setup --model small.en
cc-deck ws voice my-project --model small.en
Models are cached at ~/.cache/cc-deck/models/ and only downloaded once.
The Voice TUI
When voice relay starts, it opens a terminal interface that shows the current state of the audio pipeline.
What You See
The header displays the target workspace name, the active audio device, the current mode (VAD or MUTED), and a real-time audio level meter. The level meter uses a braille-block visualization with a threshold indicator, so you can see at a glance whether your voice is loud enough to trigger detection.
The main area shows a scrollable transcript history. Each entry includes a timestamp, delivery status icon, and the transcribed text:
- + (green): Text was delivered to the attended pane.
- ! (red): Delivery failed.
- ~ (grey): Transcription is in progress.
The footer shows available keyboard shortcuts.
Configuration
Config File
The VAD threshold can be set in ~/.config/cc-deck/config.yaml so you do not have to pass --threshold every time:
defaults:
  voice:
    threshold: 45
The threshold uses a logarithmic scale from 0 to 100. A value around 30 to 50 works well for most quiet indoor environments.
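To give a feel for what a logarithmic 0 to 100 scale means, here is one plausible mapping from raw signal level to such a scale. This is purely an illustrative assumption: the page does not say how cc-deck computes the value, and the -60 dB floor and the function name are made up for the sketch.

```python
import math


def level_from_rms(rms: float, floor_db: float = -60.0) -> float:
    """Map an RMS amplitude in [0, 1] to 0-100 on a decibel (log) scale.

    Assumed mapping: floor_db maps to 0 and full scale (0 dBFS) maps to 100.
    """
    if rms <= 0.0:
        return 0.0
    db = 20.0 * math.log10(rms)            # amplitude -> dBFS
    clamped = max(floor_db, min(0.0, db))  # keep within [floor_db, 0]
    return 100.0 * (clamped - floor_db) / -floor_db
```

Under this assumed mapping, each step of 10 on the scale corresponds to the same ratio in amplitude (6 dB, roughly a factor of two), which is why small threshold changes have a similar effect in quiet and loud rooms.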
CLI Flags
All voice settings can be overridden per invocation:
| Flag | Default | Description |
|---|---|---|
| --model | base.en | Whisper model name. |
| --port | 8234 | Port for the local whisper-server. |
| --threshold | (from config) | VAD sensitivity (0-100, logarithmic). Overrides the config file value. |
| | | Write detailed diagnostic logs to |
Command Words
Voice relay recognizes two default command words:
| Word | Action | Description |
|---|---|---|
| send | submit | Submits the current prompt (sends a newline to the attended pane) |
| next | attend | Cycles to the next session needing attention (same as Alt+a) |
A command word only fires when the entire utterance, after stripping filler words, equals the trigger word. Saying "the next step is to refactor" is treated as normal dictation because "next" is not standalone.
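The standalone rule can be sketched in a few lines. This is an illustration of the described behavior, not cc-deck's code: the tokenization, the filler list (taken from the Troubleshooting section), and the function name are assumptions.

```python
import re
from typing import Dict, List, Optional

# Filler list as named on this page; the real list may differ.
FILLERS = {"um", "uh", "hmm", "ah", "er"}
DEFAULT_COMMANDS: Dict[str, List[str]] = {"submit": ["send"], "attend": ["next"]}


def match_command(
    utterance: str, commands: Dict[str, List[str]] = DEFAULT_COMMANDS
) -> Optional[str]:
    """Return the action name if the whole utterance is a trigger word, else None."""
    # Lowercase, drop punctuation, then drop filler words.
    words = [w for w in re.findall(r"[a-z']+", utterance.lower()) if w not in FILLERS]
    remainder = " ".join(words)
    for action, triggers in commands.items():
        if remainder in triggers:
            return action
    return None  # normal dictation: deliver the text as-is
```

For example, `match_command("Um, send.")` returns `"submit"`, while `match_command("the next step is to refactor")` returns `None` and the sentence is delivered as dictation.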
You can configure additional trigger words or define new actions in the config file:
defaults:
  voice:
    commands:
      submit:
        - send
        - done
        - enter
      attend:
        - next
        - switch
Each key under commands is an action name, and the list contains trigger words for that action.
When you configure an action, your word list replaces the default words for that action.
Actions you do not configure keep their defaults.
Filler words (um, uh, hmm) are stripped before matching, so "um, send" still triggers the command.
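The replace-per-action rule is easy to get wrong, so here is a small sketch of it. It assumes the defaults are send for submit and next for attend, as listed above; the function name is hypothetical and the real merge logic may differ in detail.

```python
from typing import Dict, List, Optional

DEFAULT_COMMANDS: Dict[str, List[str]] = {"submit": ["send"], "attend": ["next"]}


def resolve_commands(configured: Optional[Dict[str, List[str]]]) -> Dict[str, List[str]]:
    """Merge configured trigger words over the defaults, one action at a time."""
    resolved = {action: list(words) for action, words in DEFAULT_COMMANDS.items()}
    for action, words in (configured or {}).items():
        # A configured action replaces its default list; it does not extend it.
        resolved[action] = list(words)
    return resolved
```

One consequence: configuring `submit` with only `done` means "send" no longer submits. To keep a default word, list it explicitly alongside the new ones, as the example config does.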
Troubleshooting
- whisper-server not found: Install whisper.cpp with brew install whisper-cpp, then run cc-deck ws voice --setup to verify the installation.
- Timed out waiting for whisper-server: The server takes a few seconds to load the model into memory. Larger models (especially medium at 1.5 GB) need more time. If this persists, check whether another process is using port 8234 and try a different port with --port.
- Transcription misses quiet speech: Lower the VAD threshold with the - key in the TUI, or set a lower value in the config file. A threshold of 20 to 30 captures softer speech at the cost of picking up more background noise.
- Filler words appear in output: cc-deck strips common fillers (um, uh, hmm, ah, er) automatically. If you see them, the issue may be that the filler was part of a longer phrase rather than standalone.
- Text does not appear in Claude Code: Verify that the workspace has an active Zellij session and that a pane is attended. Check the transcript history in the TUI for delivery errors (the ! indicator).
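For the whisper-server timeout case, one way to check whether something is already listening on port 8234 is a quick connection probe. This sketch assumes the server listens on the loopback address (127.0.0.1), which this page does not state explicitly; the function name is hypothetical.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(0.5)
        # connect_ex returns 0 on success and an errno value otherwise.
        return probe.connect_ex((host, port)) == 0


if __name__ == "__main__":
    print("port 8234 in use:", port_in_use(8234))
```

Run it while voice relay is stopped. If it reports the port as in use, another process holds it, and starting voice relay with a different --port value should avoid the conflict.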
Next Steps
- The Sidebar for understanding session states and the attended pane
- Smart Attend for how cc-deck picks which pane receives voice input
- Configuration for custom keybindings and layout variants
- CLI Reference for the complete ws voice flag reference