Voice Relay

Voice relay lets you dictate into a Claude Code session instead of typing. It captures audio from your microphone, transcribes it locally using a Whisper model, and injects the resulting text into the attended pane of any workspace. All audio processing happens on your machine: no audio is sent anywhere, and only the transcribed text is forwarded to the workspace.

Voice relay works with every workspace type: local Zellij sessions, Podman containers, Compose environments, SSH remotes, and Kubernetes deployments. The command and the TUI are the same regardless of where the session runs.

Prerequisites

You need two things before voice relay can run: the whisper.cpp server binary and a Whisper model file.

Install whisper.cpp via Homebrew:

brew install whisper-cpp

This provides whisper-server, which cc-deck starts and manages automatically. You do not need to run it yourself.

Then run the setup command to download a model and verify dependencies:

cc-deck ws voice --setup

Setup checks for whisper-server in your PATH, downloads the default model (base.en, about 141 MB), and reports whether everything is ready.
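If you want to see roughly what these checks amount to, the sketch below approximates them by hand. It is an illustration, not cc-deck's implementation: the function name check_prereqs is hypothetical, and the rule that a cached model file contains the model name in its file name is an assumption about the cache layout.

```python
import shutil
from pathlib import Path


def check_prereqs(model="base.en", model_dir="~/.cache/cc-deck/models", binary="whisper-server"):
    """Report whether the server binary and a cached model look available.

    Hypothetical helper: the file-name rule (model name appears in the
    file name) is an assumption, not a documented cache layout.
    """
    server_path = shutil.which(binary)  # None if the binary is not on PATH
    cache = Path(model_dir).expanduser()
    models = []
    if cache.is_dir():
        models = [p.name for p in cache.iterdir() if p.is_file() and model in p.name]
    return {
        "server": server_path,
        "models": sorted(models),
        "ready": server_path is not None and bool(models),
    }


if __name__ == "__main__":
    print(check_prereqs())
```

The authoritative check remains cc-deck ws voice --setup; this only helps when you want to inspect the two prerequisites yourself.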

If you want a different model, pass --model during setup: cc-deck ws voice --setup --model small.en. See Whisper Models for a comparison of available models.

Quick Start

The fastest path from zero to voice dictation takes three commands.

Create a workspace (if you do not already have one):

cc-deck ws new my-project --type local

In one terminal, attach to the workspace and start a Claude Code session:

cc-deck ws attach my-project

In a second terminal, start voice relay:

cc-deck ws voice my-project

The voice TUI opens with an audio level meter and a scrollable transcript. Start speaking and your words appear in the Claude Code input of the attended pane.

To submit the prompt, say "send" as a standalone word. cc-deck recognizes this as a command word and sends a newline instead of the literal text. To cycle to the next session needing attention, say "next" as a standalone word. This triggers the same tiered attend logic as Alt+a. Both command words can be configured in ~/.config/cc-deck/config.yaml.

Press q to quit the voice TUI and shut down the whisper server.

Voice Activity Detection

Voice relay uses VAD (voice activity detection) mode. The microphone listens continuously, and cc-deck detects when you start and stop speaking based on audio energy levels. Each utterance is automatically segmented, transcribed, and delivered.

The threshold setting controls how loud the input must be before it counts as speech. Lower values make detection more sensitive (quieter speech is captured, along with more background noise), while higher values require louder input. Use the + and - keys in the TUI to raise or lower the threshold in real time.

Silence of 1.5 seconds ends an utterance and triggers transcription. Utterances are capped at 30 seconds, which is the maximum window Whisper can process in a single pass.
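The segmentation rules above can be pictured as a small state machine. The sketch below illustrates energy-based VAD in general and is not cc-deck's code: the 100 ms frame length, the function name segment_utterances, and the idea that per-frame levels share the threshold's 0 to 100 scale are all assumptions. Only the 1.5 second silence timeout and the 30 second cap come from the text.

```python
def segment_utterances(levels, threshold, frame_s=0.1, silence_s=1.5, max_s=30.0):
    """Split per-frame audio levels into half-open (start, end) frame ranges.

    levels    -- one loudness value per frame (assumed 0-100 scale)
    threshold -- frames at or above this level count as speech
    """
    silence_frames = round(silence_s / frame_s)
    max_frames = round(max_s / frame_s)
    segments = []
    start = None        # first frame of the utterance in progress
    last_voiced = None  # most recent frame at or above the threshold

    for i, level in enumerate(levels):
        voiced = level >= threshold
        if start is None:
            if voiced:
                start, last_voiced = i, i
            continue
        if voiced:
            last_voiced = i
        if i - last_voiced >= silence_frames:
            # Enough trailing silence: close the utterance at the last voiced frame.
            segments.append((start, last_voiced + 1))
            start = None
        elif i - start + 1 >= max_frames:
            # Utterance reached the maximum length: cut it here.
            segments.append((start, i + 1))
            start = None

    if start is not None:  # the stream ended mid-utterance
        segments.append((start, last_voiced + 1))
    return segments
```

With 100 ms frames, 15 consecutive quiet frames end an utterance and 300 frames is the cap, so continuous speech is delivered in chunks of at most 30 seconds.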

Sidebar Integration

When voice relay is running, a ♫ indicator appears in the sidebar header.

  • Bright green ♫: Voice relay is connected and listening.

  • Dim ♫: Voice relay is connected but muted.

  • No ♫: Voice relay is not connected.

Mute Toggle

You can mute and unmute voice relay from multiple locations:

From the sidebar (any pane):

  • Alt+m: Toggle mute (configurable via voice_key in plugin config).

  • m in navigation mode: Toggle mute while navigating sessions.

  • Click the ♫ indicator: Toggle mute with the mouse.

From the voice TUI:

  • m: Toggle mute.

Mute state synchronizes bidirectionally between the sidebar and voice TUI. When you mute from the sidebar, the TUI reflects the change within about one second. When you mute from the TUI, the sidebar updates immediately.

Whisper Models

cc-deck supports four Whisper model sizes. The .en models are English-only; medium is multilingual.

Model     Size    Speed     Accuracy  Notes
tiny.en   74 MB   Fastest   Low       Good for testing. Not recommended for real dictation.
base.en   141 MB  Fast      Decent    The default. Reasonable accuracy for clear speech in a quiet room.
small.en  465 MB  Moderate  Good      Best tradeoff on Apple Silicon. Noticeably better with technical terms.
medium    1.5 GB  Slower    High      Multilingual support. Most accurate, but higher latency per utterance.

On an Apple Silicon Mac, small.en is the recommended upgrade from the default. It runs comfortably in real time and handles code-related vocabulary (function names, library names, technical jargon) much better than base.en.

To switch models, download the new model once with --setup, then pass --model when you start voice relay:

cc-deck ws voice --setup --model small.en
cc-deck ws voice my-project --model small.en

Models are cached at ~/.cache/cc-deck/models/ and only downloaded once.

The Voice TUI

When voice relay starts, it opens a terminal interface that shows the current state of the audio pipeline.

What You See

The header displays the target workspace name, the active audio device, the current mode (VAD or MUTED), and a real-time audio level meter. The level meter uses a braille-block visualization with a threshold indicator, so you can see at a glance whether your voice is loud enough to trigger detection.

The main area shows a scrollable transcript history. Each entry includes a timestamp, delivery status icon, and the transcribed text:

  • + (green): Text was delivered to the attended pane.

  • ! (red): Delivery failed.

  • ~ (grey): Transcription is in progress.

The footer shows available keyboard shortcuts.

Keyboard Controls

Key          Action
q            Quit voice relay and stop the whisper server.
m            Toggle mute/unmute.
+ / -        Raise (+) or lower (-) the VAD threshold by 2% per press. A lower threshold makes detection more sensitive.
d            Open the audio device picker.
PgUp / PgDn  Scroll through transcript history.

Configuration

Config File

The VAD threshold can be set in ~/.config/cc-deck/config.yaml so you do not have to pass --threshold every time:

defaults:
  voice:
    threshold: 45

The threshold uses a logarithmic scale from 0 to 100. A value around 30 to 50 works well for most quiet indoor environments.
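The text does not say how the 0 to 100 value maps to signal level. One common way to build a logarithmic scale is to express the RMS amplitude in decibels relative to full scale and stretch a fixed range onto 0 to 100. The sketch below assumes a -60 dBFS floor; the floor and the function name level_to_scale are hypothetical, so read it as an illustration of why small threshold steps matter, not as cc-deck's formula.

```python
import math


def level_to_scale(rms, floor_db=-60.0):
    """Map an RMS amplitude in [0, 1] onto a 0-100 logarithmic scale.

    Assumption: 0 corresponds to floor_db (dBFS) and 100 to full scale.
    """
    if rms <= 0:
        return 0.0
    db = 20.0 * math.log10(rms)  # amplitude -> decibels (full scale = 0 dB)
    scaled = (db - floor_db) / -floor_db * 100.0
    return max(0.0, min(100.0, scaled))  # clamp to the valid range
```

Under this assumption, 10 points on the scale correspond to 6 dB, which is roughly a doubling of amplitude, so a change from 45 to 35 is a much larger jump in sensitivity than the numbers suggest.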

CLI Flags

All voice settings can be overridden per invocation:

Flag         Default        Description
--model      base.en        Whisper model name.
--port       8234           Port for the local whisper-server.
--threshold  (from config)  VAD threshold (0-100, logarithmic). Overrides the config file value.
--verbose    false          Write detailed diagnostic logs to ~/.local/state/cc-deck/voice.log.

Command Words

Voice relay recognizes two default command words:

Word  Action  Description
send  submit  Submits the current prompt (sends a newline to the attended pane)
next  attend  Cycles to the next session needing attention (same as Alt+a)

A command word only fires when the entire utterance, after stripping filler words, equals the trigger word. Saying "the next step is to refactor" is treated as normal dictation because "next" is not standalone.
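The standalone rule can be sketched in a few lines. This is a minimal illustration under stated assumptions, not cc-deck's matcher: the function name match_command is hypothetical, and lowercasing and punctuation stripping are assumed normalization steps. The filler list is the one given in the troubleshooting section.

```python
import re

FILLERS = {"um", "uh", "hmm", "ah", "er"}


def match_command(utterance, commands):
    """Return the action name if the whole utterance is a trigger word, else None.

    commands maps an action name to its trigger words,
    e.g. {"submit": ["send"], "attend": ["next"]}.
    """
    # Lowercase, drop punctuation, and split into words (assumed normalization).
    words = re.findall(r"[a-z']+", utterance.lower())
    # Strip filler words before matching.
    words = [w for w in words if w not in FILLERS]
    if len(words) != 1:
        return None  # empty, or the trigger is embedded in a longer phrase
    for action, triggers in commands.items():
        if words[0] in triggers:
            return action
    return None
```

For example, match_command("um, send", {"submit": ["send"], "attend": ["next"]}) yields "submit", while "the next step is to refactor" yields None and would be delivered as ordinary dictation.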

You can configure additional trigger words or define new actions in the config file:

defaults:
  voice:
    commands:
      submit:
        - send
        - done
        - enter
      attend:
        - next
        - switch

Each key under commands is an action name, and the list contains trigger words for that action. When you configure an action, your word list replaces the default words for that action. Actions you do not configure keep their defaults. Filler words (um, uh, hmm) are stripped before matching, so "um, send" still triggers the command.
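The override rule is a per-action replacement rather than a merge of word lists. A minimal sketch, assuming the configured commands have already been parsed into a dictionary (merge_commands and DEFAULT_COMMANDS are hypothetical names for illustration):

```python
DEFAULT_COMMANDS = {"submit": ["send"], "attend": ["next"]}


def merge_commands(configured, defaults=DEFAULT_COMMANDS):
    """Combine configured command words with the defaults, per action.

    A configured action replaces that action's default list entirely;
    actions absent from the config keep their defaults.
    """
    merged = {action: list(words) for action, words in defaults.items()}
    for action, words in (configured or {}).items():
        merged[action] = list(words)
    return merged
```

Under this reading, a configured submit list that omits send would no longer respond to "send", which is why the example above lists the default words again alongside the new ones.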

Diagnostic Logging

Pass --verbose to enable per-utterance latency logging and backend health information. Logs are written to ~/.local/state/cc-deck/voice.log, not to the terminal. This is useful when debugging transcription accuracy or delivery failures.

Troubleshooting

whisper-server not found

Install whisper.cpp: brew install whisper-cpp. Then run cc-deck ws voice --setup to verify the installation.

Timed out waiting for whisper-server

The server takes a few seconds to load the model into memory. Larger models (especially medium at 1.5 GB) need more time. If the timeout persists, check whether another process is using port 8234 and try a different port with --port.
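One quick way to test for a port conflict is to try connecting to the port while voice relay is stopped. The sketch below is a generic TCP check; the helper name port_in_use is hypothetical, and it assumes the server listens on 127.0.0.1.

```python
import socket


def port_in_use(port=8234, host="127.0.0.1", timeout=0.5):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    print("port 8234 in use:", port_in_use())
```

If this reports True while no voice relay is running, another process holds the port, and starting voice relay with a different --port value is the simplest fix.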

Transcription misses quiet speech

Lower the VAD threshold with the - key in the TUI, or set a lower value in the config file. A threshold of 20 to 30 captures softer speech at the cost of picking up more background noise.

Filler words appear in output

cc-deck strips common fillers (um, uh, hmm, ah, er) automatically. If they still appear in the output, the filler was probably transcribed as part of a longer phrase rather than as a standalone word.

Text does not appear in Claude Code

Verify that the workspace has an active Zellij session and that a pane is attended. Check the transcript history in the TUI for delivery errors (the ! indicator).

Next Steps

  • The Sidebar for understanding session states and the attended pane

  • Smart Attend for how cc-deck picks which pane receives voice input

  • Configuration for custom keybindings and layout variants

  • CLI Reference for the complete ws voice flag reference