CamoVoice: Private Speech to Text

January 13, 2026

For most, speaking is faster and more efficient than typing. For some, speech is the only practical method for creating long-form text.

“The biggest bottleneck to AI productivity isn’t the AI; it’s how fast humans can type. The limiting factors are how fast you can type prompts and how quickly you can review AI-generated work.” - Alex Embiricos, OpenAI Codex product lead. 1

CamoVoice is a fully offline, privacy-focused speech-to-text app. It requires no internet connection and collects zero telemetry.


CamoVoice User

Along with general privacy considerations for the content you dictate, speech-to-text has a unique privacy problem: your voice is biometric data. Cloud or remote speech-to-text services can require you to upload that biometric signal to someone else’s servers, where it can be processed, retained, or logged. This includes Microsoft, which by default uses your voice to improve its speech recognition models.

Why a Dedicated App (Instead of Inline Transcription)

Many people first encounter speech-to-text as a feature embedded inside a website, a chat interface, or a native operating system tool. While convenient, these embedded tools carry avoidable risks (beyond the third-party processing risks mentioned above) that a dedicated app can address:

  • Boundary clarity: a dedicated app makes it obvious where recording starts, where transcription happens, and where the text goes next.
  • Editing and control: dictation often requires quick edits, especially when the content involves PII or other sensitive, unique material. CamoVoice is built around an editable transcription area, not a one-shot “voice message.”
  • Website input logs: some web apps can capture keystrokes and input events as you type (even before you click “Send”). CamoVoice keeps the capture and transcription layer local until you decide to paste or export.
  • Timestamps and audit-friendly output: CamoVoice can attach unambiguous timestamps to transcription segments and export clean records.

CamoVoice in action

Privacy by Design

Like CamoText's other software, CamoVoice is built on a straightforward privacy commitment:

  • 100% Offline: transcription runs locally on your machine (no internet required after install).
  • Zero Telemetry: no analytics, no usage tracking, no audio samples sent anywhere.
  • No Accounts: no sign-ups, no authentication, no cloud dependencies.
  • Local settings: preferences are stored in a simple settings.json file on your device.
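
As a rough illustration of the local-settings idea, the sketch below reads and writes a plain JSON preferences file using only the standard library. The key names (`mode`, `font_scale`, `include_timestamps`) and the fallback behavior are assumptions made for this example, not CamoVoice's actual schema.

```python
import json
from pathlib import Path

# Hypothetical defaults: the real keys in CamoVoice's settings.json are not documented here.
DEFAULT_SETTINGS = {
    "mode": "fast",
    "font_scale": 1.0,
    "include_timestamps": False,
}


def load_settings(path: Path) -> dict:
    """Return the defaults overlaid with any valid JSON object stored at path."""
    settings = dict(DEFAULT_SETTINGS)
    try:
        stored = json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return settings  # missing or corrupt file: fall back to defaults
    if isinstance(stored, dict):
        settings.update(stored)
    return settings


def save_settings(path: Path, settings: dict) -> None:
    """Write settings as human-readable JSON to the local disk only."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(settings, indent=2), encoding="utf-8")
```

Because the file is ordinary JSON on disk, a user can inspect or delete it with any text editor, and nothing in this approach needs an account or a network call.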

If you want the technical detail, CamoVoice uses optimized versions of faster-whisper (a CTranslate2-based Whisper implementation) to run speech recognition locally, with bundled models so the app can operate offline, even on laptops without a GPU. 2
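
For readers curious what local inference with faster-whisper can look like, here is a minimal sketch. It is not CamoVoice's code: `transcribe_offline`, `join_segments`, and the `model_dir` path are hypothetical names, and the sketch assumes a CTranslate2 Whisper model folder is already on disk. Loading on the CPU with `int8` quantization is one common way to make GPU-free transcription practical.

```python
from typing import Iterable, Protocol


class SegmentLike(Protocol):
    """Shape of the segments faster-whisper yields (start/end in seconds)."""
    start: float
    end: float
    text: str


def join_segments(segments: Iterable[SegmentLike]) -> str:
    """Concatenate non-empty segment texts into one transcript string."""
    return " ".join(seg.text.strip() for seg in segments if seg.text.strip())


def transcribe_offline(audio_path: str, model_dir: str) -> str:
    """Transcribe audio on the CPU from a model directory already on disk.

    model_dir is a hypothetical path to a bundled CTranslate2 model folder.
    """
    # Imported lazily so join_segments works without the package installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(
        model_dir,
        device="cpu",           # no GPU required
        compute_type="int8",    # quantized weights keep CPU inference practical
        local_files_only=True,  # never attempt a download
    )
    segments, _info = model.transcribe(audio_path, beam_size=5)
    return join_segments(segments)  # segments is a generator; decoding happens here
```

Note that `transcribe` returns a lazy generator, so the actual decoding work happens while the segments are being consumed.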

How CamoVoice Fits Into a Privacy-First Workflow

The simplest way to think about CamoVoice is as a private input layer. You can dictate sensitive material, review it locally, then decide where it goes next (email, a case file, a patient's medical notes, an AI assistant, etc.).


  1. Record: click the mic or hold spacebar.
  2. Transcribe: runs fully offline with bundled models.
  3. Review & Edit: fix or add wording, or run more recordings/transcriptions.
  4. Export: copy or save the text (with optional timestamps) for further use.
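
To make the export step concrete, the following sketch shows one way timestamped output could be produced from transcription segments. The `Segment` type, the `HH:MM:SS.mmm` format, and the bracketed line layout are assumptions for illustration; CamoVoice's actual export format may differ.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # seconds from the start of the recording
    end: float
    text: str


def format_timestamp(seconds: float) -> str:
    """Render seconds as HH:MM:SS.mmm so ordering and duration are unambiguous."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def export_text(segments: list[Segment], with_timestamps: bool = False) -> str:
    """Return one line per segment, optionally prefixed with its time range."""
    lines = []
    for seg in segments:
        text = seg.text.strip()
        if with_timestamps:
            span = f"{format_timestamp(seg.start)} - {format_timestamp(seg.end)}"
            lines.append(f"[{span}] {text}")
        else:
            lines.append(text)
    return "\n".join(lines)
```

Zero-padded, fixed-width timestamps sort correctly as plain text, which helps when the exported record is later searched or compared.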


Designed for Everyone, Accessibility-First

CamoVoice is designed for all users, including those who rely on accessibility-first workflows:

  • Simple controls: straightforward buttons for recording, copying, saving, and clearing.
  • Scalable text size: a slider adjusts font sizing across the entire app UI.
  • Text-to-speech playback: read transcriptions aloud using voices installed on your system.
  • Keyboard shortcuts: hold spacebar to record; Ctrl+S / ⌘S to save; Ctrl+Z / ⌘Z to undo clear.

Easily stack dictations on top of each other to create a single long document, or separate transcriptions by speaker or topic.

Fast vs Thinking Mode

CamoVoice includes two transcription modes so you can choose speed vs accuracy:

  • Fast mode: optimized for quick notes and fast AI prompting.
  • Thinking mode: optimized for higher accuracy on important transcriptions (complex vocabulary, accents, background noise).

Both modes run locally. Fast mode has higher input file and recording time limits because its model is smaller, while Thinking mode is more accurate on difficult content and in noisy environments. Switching modes can take a moment because a different model must be loaded. 2
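
The mode-switching delay can be pictured with a small sketch that keeps one model in memory and reloads only when the mode changes. Everything here is hypothetical: the `ModelManager` class, the model names, and the recording limits are invented for illustration and do not reflect CamoVoice's real values.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class ModeConfig:
    model_name: str       # hypothetical bundled model identifier
    max_recording_s: int  # hypothetical limit, for illustration only


# Assumed values: the article only says Fast uses a smaller model with higher limits.
MODES = {
    "fast": ModeConfig(model_name="small-model", max_recording_s=1800),
    "thinking": ModeConfig(model_name="large-model", max_recording_s=600),
}


class ModelManager:
    """Keep one model in memory and reload only when the mode changes."""

    def __init__(self, loader: Callable[[str], object]) -> None:
        self._loader = loader
        self._mode: Optional[str] = None
        self._model: Optional[object] = None

    def get(self, mode: str) -> object:
        if mode not in MODES:
            raise ValueError(f"unknown mode: {mode!r}")
        if mode != self._mode:
            # The slow step: the previous model is replaced by a newly loaded one.
            self._model = self._loader(MODES[mode].model_name)
            self._mode = mode
        return self._model
```

Holding a single model at a time trades a short wait on each switch for lower memory use, which matters on laptops without a GPU.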

Summary

CamoVoice exists because speech is often the fastest way to create usable text, but voice data deserves a stronger privacy boundary than cloud transcription can offer. With a fully offline dictation workflow, CamoVoice keeps your recordings and transcriptions local until you decide otherwise. Planned features include private voice cloning and translation, as well as additional format support.