My F12 Key

Hold to talk, release to transcribe. A 200-line Python script replaced every note-taking app I've ever tried.

linux, python, voice, whisper, tools

My F12 key doesn’t open dev tools. It opens a microphone.

I hold it down, say what I’m thinking, let go. A couple of seconds later the words appear in my Claude Code prompt. Hold F12, talk, release. That’s it.

The thing doing the work is a Python script called SoupaWhisper by Kyle Redelinghuys. It runs as a systemd user service, starts on boot, and sits in the background until I need it. When I hold F12, it starts recording through ALSA. When I release, it feeds the audio into faster-whisper (a reimplementation of OpenAI’s Whisper built on CTranslate2), copies the result to the clipboard, and types it into whatever’s focused using xdotool.
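To make that flow concrete, here is a minimal sketch of the same hold, record, transcribe, type loop. It is not SoupaWhisper's actual source. The function names are made up for illustration, and pynput for the hotkey, arecord for capture, and xclip for the clipboard are my assumptions about reasonable tools rather than confirmed details of the script.

```python
import os
import subprocess
import tempfile

SAMPLE_RATE = 16000  # Whisper models work on 16 kHz mono audio


def record_command(wav_path):
    """Build an arecord (ALSA) command that records 16-bit mono WAV until stopped."""
    return ["arecord", "-q", "-f", "S16_LE", "-r", str(SAMPLE_RATE),
            "-c", "1", "-t", "wav", wav_path]


def type_command(text):
    """Build an xdotool command that types text into the focused window."""
    return ["xdotool", "type", "--clearmodifiers", "--", text]


def join_segments(texts):
    """faster-whisper yields text segment by segment; join them into one string."""
    return " ".join(t.strip() for t in texts if t.strip())


def main():
    # Third-party imports live here so the helpers above work without them.
    from faster_whisper import WhisperModel
    from pynput import keyboard  # assumption: one possible hotkey library

    # Load the model once and keep it in memory for the life of the process.
    model = WhisperModel("base.en", device="cpu", compute_type="int8")
    state = {"proc": None, "path": None}

    def on_press(key):
        # A held key may repeat press events, so only start one recording.
        if key == keyboard.Key.f12 and state["proc"] is None:
            state["path"] = tempfile.NamedTemporaryFile(
                suffix=".wav", delete=False).name
            state["proc"] = subprocess.Popen(record_command(state["path"]))

    def on_release(key):
        if key != keyboard.Key.f12 or state["proc"] is None:
            return
        state["proc"].terminate()  # arecord should finalize the WAV header on exit
        state["proc"].wait()
        state["proc"] = None
        segments, _info = model.transcribe(state["path"])
        text = join_segments(segment.text for segment in segments)
        os.remove(state["path"])
        if text:
            # Copy to the clipboard (xclip is an assumed choice), then type it.
            subprocess.run(["xclip", "-selection", "clipboard"],
                           input=text.encode())
            subprocess.run(type_command(text))

    with keyboard.Listener(on_press=on_press, on_release=on_release) as listener:
        listener.join()
```

Calling `main()` starts the listener and blocks until the process is stopped. Transcribing inside the release callback keeps the sketch short; a sturdier version would hand that work to a separate thread so the listener stays responsive.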

About 200 lines of Python. No Electron app. No cloud API. No subscription. Audio never leaves the machine.

Why

I use it exclusively with Claude Code. That’s the only place it fires. There’s value in writing out ideas by hand, sketching them, thinking through them slowly. But sometimes you don’t want that. Sometimes you just want a brain dump. You have a half-formed idea about how something should work, or you need to describe a bug, or you want to tell Claude what to build next. Typing it out means you’ll self-edit, second-guess the phrasing, restructure the sentence before you’ve even finished the thought. Speaking doesn’t let you do that. You say the thing and move on.

The transcription is imperfect sometimes, and that’s fine, because the raw thought is usually better than the over-edited version. Claude doesn’t care about your grammar. It cares about your intent. A messy spoken paragraph with the right idea in it beats a carefully typed sentence that lost the idea somewhere in the editing.

The config

# ~/.config/soupawhisper/config.ini
[whisper]
model = base.en
device = cpu
compute_type = int8

[hotkey]
key = f12

[behavior]
auto_type = true
notifications = true

I run the base.en model on CPU. Transcription for a 10-second clip takes about a second. The model loads once at startup and stays in memory. I tried large-v3-turbo on GPU and the accuracy was better, but loading a 3 GB model in and out of VRAM on every boot wasn’t worth it. base.en on CPU loads in seconds and gets the job done.
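For anyone curious how a file like that can turn into model settings, here is a small sketch using Python's standard configparser. The `load_settings` helper and the layering of defaults under user overrides are my own illustration, not necessarily how SoupaWhisper reads its config.

```python
import configparser

# Defaults mirror the config shown above; a user file only needs to override them.
DEFAULTS = """
[whisper]
model = base.en
device = cpu
compute_type = int8

[hotkey]
key = f12

[behavior]
auto_type = true
notifications = true
"""


def load_settings(user_ini=""):
    """Parse defaults first, then layer the user's INI text on top (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(DEFAULTS)
    if user_ini.strip():
        parser.read_string(user_ini)  # later values replace earlier ones
    return {
        "model": parser.get("whisper", "model"),
        "device": parser.get("whisper", "device"),
        "compute_type": parser.get("whisper", "compute_type"),
        "key": parser.get("hotkey", "key"),
        "auto_type": parser.getboolean("behavior", "auto_type"),
        "notifications": parser.getboolean("behavior", "notifications"),
    }
```

The first three values line up with the arguments faster-whisper's `WhisperModel` takes, as in `WhisperModel(settings["model"], device=settings["device"], compute_type=settings["compute_type"])`. Reading the real file would just mean passing in the text of `~/.config/soupawhisper/config.ini`.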

The key itself

F12 is technically the browser dev tools shortcut. I don’t mind right-clicking and hitting Inspect when I need them. The key itself is out of the way, hard to hit by accident, and easy to reach without looking. The physical key on my keyboard is noticeably smoother than the others now. I’ve been using this setup for a few months and it’s become one of those tools that I forget isn’t standard. I’ll sit down at someone else’s computer, reach for F12, and nothing happens.

That’s how you know a tool is good. When its absence feels like something is broken.