SantiSotto

Privacy-first voice transcription for macOS

🔒 100% Local Processing ⚡ Metal GPU Accelerated 🎧 Powered by OpenAI Whisper

Built for Speed and Privacy

Data sent to cloud: 0
AI models available: 30
Languages supported: 99
Transcription latency: <1s*

*With tiny models on Apple Silicon

Three Steps. Zero Cloud.

Hold your hotkey to record, release to transcribe. Text appears at your cursor instantly. The entire pipeline runs on your Mac.
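The hold-to-record flow can be pictured as a small state machine. This is a minimal sketch; the `State` and `Event` names and the `next_state` function are hypothetical, chosen for illustration, and are not taken from the app's source.

```rust
/// Hypothetical states for the hold-to-record flow described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Idle,
    Recording,
    Transcribing,
}

/// Hypothetical events; a real app would receive these from the OS hotkey
/// listener and from the transcription worker.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Event {
    HotkeyDown,
    HotkeyUp,
    TranscriptReady,
}

/// Pure transition function: any event that does not apply to the current
/// state (for example key-repeat `HotkeyDown` while recording) is ignored.
pub fn next_state(state: State, event: Event) -> State {
    match (state, event) {
        (State::Idle, Event::HotkeyDown) => State::Recording,
        (State::Recording, Event::HotkeyUp) => State::Transcribing,
        (State::Transcribing, Event::TranscriptReady) => State::Idle,
        (unchanged, _) => unchanged,
    }
}
```

Keeping the transition function pure makes it easy to test without touching the microphone or the keyboard.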

🎤

Hold Hotkey

Hold Option+Space to record from your microphone

🎬

Capture Audio

Audio captured at 48kHz, resampled to 16kHz in real-time

🧠

Whisper AI

Local Whisper model transcribes using Metal GPU acceleration

Post-Process

Remove fillers, clean punctuation, apply dictionary bias

📋

Paste Text

Text inserted at cursor via clipboard, original clipboard restored
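The paste step saves the clipboard, writes the transcript, triggers a paste, then puts the original contents back. The sketch below shows that ordering only. The `Clipboard` trait, `MemoryClipboard`, and `paste_and_restore` are hypothetical names used for illustration; the in-memory clipboard stands in for the real system clipboard.

```rust
/// Hypothetical abstraction over a text clipboard (not the app's real API).
pub trait Clipboard {
    fn read_text(&self) -> Option<String>;
    fn write_text(&mut self, text: &str);
    fn clear(&mut self);
}

/// In-memory stand-in for the system clipboard, for illustration and tests.
pub struct MemoryClipboard {
    pub contents: Option<String>,
}

impl Clipboard for MemoryClipboard {
    fn read_text(&self) -> Option<String> {
        self.contents.clone()
    }
    fn write_text(&mut self, text: &str) {
        self.contents = Some(text.to_string());
    }
    fn clear(&mut self) {
        self.contents = None;
    }
}

/// Save the clipboard, place `text` on it, run `send_paste` (which would
/// simulate Cmd+V in a real app), then restore what was there before.
pub fn paste_and_restore<C: Clipboard>(
    clipboard: &mut C,
    text: &str,
    send_paste: impl FnOnce(&C),
) {
    let saved = clipboard.read_text();
    clipboard.write_text(text);
    send_paste(&*clipboard);
    match saved {
        Some(previous) => clipboard.write_text(&previous),
        None => clipboard.clear(),
    }
}
```

A real implementation would likely need a short delay between the paste keystroke and the restore so the target application has time to read the clipboard; that timing is not modelled here.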

Everything You Need

Designed to be invisible until you need it. Runs in your menu bar, stays out of your way.

🔒

Complete Privacy

All audio processing happens on-device. No internet required for transcription. No data collection, no telemetry, no cloud APIs.

Metal GPU Acceleration

Uses Apple's Metal API on Apple Silicon for hardware-accelerated inference. Transcription in under a second with tiny models.

🎧

30 Whisper Models

From 31MB tiny to 2.9GB large-v3. Full precision and quantized variants (Q5, Q8). English-only models for extra speed.

📜

Transcription History

Every transcription saved locally. Browse, copy, or delete individual entries. Never lose text if a paste fails.

Custom Hotkeys

Change your recording shortcut to any modifier+key combo. Capture UI with live key detection. Two configurable shortcuts.

🗣

Filler Word Removal

Automatically strips "um", "uh", "er", "you know", "basically" and more. Smart pattern matching avoids false positives.
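One simple way to avoid false positives is to compare whole words rather than substrings, so "um" is removed but "umbrella" is not. The sketch below assumes that approach. The filler list and the `remove_fillers` function are illustrative, not the app's actual rules, and multi-word fillers such as "you know" are left out because they need context to remove safely.

```rust
/// Illustrative single-word filler list (an assumption, not the app's list).
const FILLERS: [&str; 3] = ["um", "uh", "er"];

/// Lowercase a token and strip surrounding punctuation for comparison.
fn normalize(token: &str) -> String {
    token
        .trim_matches(|c: char| !c.is_alphanumeric())
        .to_lowercase()
}

/// Drop tokens that are exactly a filler word once punctuation is ignored.
/// Matching whole tokens keeps words like "umbrella" or "her" intact.
pub fn remove_fillers(text: &str) -> String {
    text.split_whitespace()
        .filter(|token| !FILLERS.contains(&normalize(token).as_str()))
        .collect::<Vec<&str>>()
        .join(" ")
}
```

For example, `remove_fillers("Um, I think the umbrella is uh over there")` yields `"I think the umbrella is over there"`. Punctuation that was attached to a removed filler disappears with it, so any remaining cleanup would fall to a separate punctuation pass.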

📚

Personal Dictionary

Add names, technical terms, and jargon that Whisper might not recognize. Biases the model toward your vocabulary.
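The page does not say how the bias is applied. A common technique with Whisper is to pass vocabulary as an initial prompt (whisper.cpp exposes an `initial_prompt` parameter), and the sketch below assumes that approach. The `build_initial_prompt` function and its byte budget are hypothetical.

```rust
/// Join dictionary terms into a comma-separated prompt string.
/// Blank entries and case-insensitive duplicates are skipped, and terms are
/// added only while the prompt stays within `max_bytes` (a hypothetical
/// budget; the real limit depends on the model's prompt token window).
pub fn build_initial_prompt(terms: &[&str], max_bytes: usize) -> String {
    let mut seen: Vec<String> = Vec::new();
    let mut prompt = String::new();
    for term in terms {
        let term = term.trim();
        let key = term.to_lowercase();
        if term.is_empty() || seen.contains(&key) {
            continue;
        }
        // Account for the ", " separator on every term after the first.
        let extra = if prompt.is_empty() { term.len() } else { term.len() + 2 };
        if prompt.len() + extra > max_bytes {
            break;
        }
        if !prompt.is_empty() {
            prompt.push_str(", ");
        }
        prompt.push_str(term);
        seen.push(key);
    }
    prompt
}
```

The resulting string would then be handed to the transcription call as its initial prompt.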

🌌

99 Languages

Multilingual models auto-detect language. Supports English, Spanish, French, German, Japanese, Arabic, and 93 more.

💻

Menu Bar App

Runs as a macOS accessory — no dock icon, no window clutter. Launches at login. Always ready when you are.

Choose Your Trade-off

Larger models are more accurate but slower. Quantized variants (Q5, Q8) reduce size with minimal quality loss. English-only models are faster for English.

Model | Size | Rating
Tiny | 31 – 75 MB | Speed ★★★★★
Base | 57 – 142 MB | Speed ★★★★
Small | 181 – 466 MB | Recommended
Medium | 514 MB – 1.5 GB | Accuracy ★★★★
Large v3 | 547 MB – 2.9 GB | Accuracy ★★★★★

Compared to Alternatives

Most transcription tools send your audio to the cloud. SantiSotto keeps everything on your device.

Feature | SantiSotto | Otter.ai | macOS Dictation | Whisper (CLI)
100% Local Processing | ✓ Yes | ✗ Cloud | ● Hybrid | ✓ Yes
No Account Required | ✓ Yes | ✗ Required | ✓ Yes | ✓ Yes
Works Offline | ✓ Yes | ✗ No | ● Limited | ✓ Yes
GPU Accelerated | ✓ Metal | N/A (cloud) | ✓ Yes | ● Manual setup
Hotkey Recording | ✓ Customizable | ✗ No | ● Fixed key | ✗ No
Auto-Paste at Cursor | ✓ Yes | ✗ Copy only | ✓ Yes | ✗ File output
Model Selection | ✓ 30 models | N/A | ✗ Fixed | ✓ Manual
Transcription History | ✓ Local | ✓ Yes | ✗ No | ✗ No
Filler Word Removal | ✓ Automatic | ● Basic | ✗ No | ✗ No
Free | ✓ Open source | ✗ Subscription | ✓ Yes | ✓ Open source
GUI / Easy Setup | ✓ Native app | ✓ Web app | ✓ Built-in | ✗ Terminal only

Technology Stack

A modern, performant architecture built with native technologies.

Frontend

TypeScript + Vite for fast, type-safe UI development. Lucide icon library for consistent, beautiful iconography. Pure CSS with light and dark mode support.

Backend

Rust for memory-safe, high-performance audio processing and transcription. Tauri v2 framework for native macOS integration with minimal overhead (~5MB binary).

AI Engine

whisper-rs (whisper.cpp bindings) with Metal feature flag for Apple Silicon GPU acceleration. Greedy sampling for lowest latency. Auto language detection for multilingual models.
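Greedy sampling means taking the single highest-scoring token at each decoding step instead of exploring several candidates, which is why it is the lowest-latency option. The sketch below shows only that selection step on a plain slice of scores; `greedy_pick` is a hypothetical helper, not a whisper-rs function.

```rust
/// Return the index of the largest score, or `None` for an empty slice.
/// NaN scores are skipped, and ties keep the earliest index.
pub fn greedy_pick(logits: &[f32]) -> Option<usize> {
    let mut best: Option<(usize, f32)> = None;
    for (index, &score) in logits.iter().enumerate() {
        if score.is_nan() {
            continue;
        }
        match best {
            // Keep the current best unless the new score is strictly higher.
            Some((_, best_score)) if score <= best_score => {}
            _ => best = Some((index, score)),
        }
    }
    best.map(|(index, _)| index)
}
```

Beam search, by contrast, would track several partial transcripts per step, trading speed for a chance at a better overall result.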

Audio Pipeline

cpal for cross-platform audio capture. Real-time 48kHz to 16kHz downsampling via sinc-based resampling with rubato. Zero-copy buffer management with privacy-first memory clearing.
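Because 48 kHz is exactly three times 16 kHz, the rate change itself is a 3:1 reduction. The sketch below illustrates that ratio with a deliberately naive method, averaging each group of three samples. It is a stand-in for illustration only, not the sinc-based resampling that rubato provides, and `downsample_48k_to_16k` is a hypothetical name.

```rust
/// Downsample mono f32 audio from 48 kHz to 16 kHz by averaging every three
/// consecutive samples. A trailing group of fewer than three samples is
/// dropped. Averaging is a crude low-pass filter; a sinc resampler gives
/// much better anti-aliasing.
pub fn downsample_48k_to_16k(input: &[f32]) -> Vec<f32> {
    input
        .chunks_exact(3)
        .map(|frame| frame.iter().sum::<f32>() / 3.0)
        .collect()
}
```

One second of 48 kHz input (48,000 samples) produces 16,000 output samples, which matches the sample rate Whisper models expect.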