DropVox 1.0: Privacy-First Voice Transcription for Mac
24 days. That's how long it took to go from the first Swift commit to a signed, notarized, commercially available macOS application. DropVox 1.0 is live.
Here's what it is, why it exists, and what the journey looked like.
The Problem
Every cloud transcription service asks you to upload your audio to their servers. Your voice messages from family. Your meeting recordings. Your medical notes. All of it goes to a third party that processes it, may store it, and in some cases uses it to train its models.
The alternative used to be "don't transcribe at all." That's not good enough anymore.
AI models like Whisper can now run entirely on your laptop. The accuracy rivals cloud services. On Apple Silicon, transcription is often faster than a cloud API because there's no upload and no network round trip. The only thing missing was an app that made this accessible to people who don't live in a terminal.
That's DropVox.
What DropVox Does
DropVox lives in your macOS menu bar. It transcribes audio files using AI that runs entirely on your machine. No internet required. No data leaves your Mac. Ever.
Drag and drop. Press Cmd+D for a floating drop zone. Drag any audio file onto it from Finder, WhatsApp desktop, Telegram, or anywhere else. Transcription starts immediately.
Clipboard paste. Copy an audio file in Finder, switch to DropVox, press Cmd+V. Done.
Multiple formats. WhatsApp's .opus files, .mp3, .m4a, .wav. The formats people actually encounter in daily life.
13 languages. English, Portuguese, Spanish, French, German, Italian, Dutch, Japanese, Korean, Chinese, Russian, Arabic, and Hindi. No configuration needed -- the model detects the language automatically.
5 model sizes. From Tiny (75 MB, fastest) to Large (3 GB, most accurate). Choose the balance that fits your hardware and your needs.
Searchable history. Every transcription is saved locally. Search by filename, text content, or language. No more re-transcribing files you already processed last week.
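To make the search behavior concrete, here is a minimal sketch of how a history search across filename, text content, and language could work. It is not DropVox's actual code: the `TranscriptionRecord` type, its field names, and the `searchHistory` function are assumptions made up for illustration.

```swift
import Foundation

// Hypothetical record shape; the type and field names are assumptions.
struct TranscriptionRecord {
    let filename: String
    let text: String
    let language: String   // e.g. "en", "pt"
}

// Returns the records whose filename, text, or language contains the query,
// ignoring case and diacritics. A blank query returns everything.
func searchHistory(_ records: [TranscriptionRecord], query: String) -> [TranscriptionRecord] {
    let trimmed = query.trimmingCharacters(in: .whitespacesAndNewlines)
    guard !trimmed.isEmpty else { return records }
    let options: String.CompareOptions = [.caseInsensitive, .diacriticInsensitive]
    return records.filter { record in
        [record.filename, record.text, record.language].contains { field in
            field.range(of: trimmed, options: options) != nil
        }
    }
}
```

Ignoring diacritics is a deliberate choice in this sketch: with Portuguese, Spanish, and French in the language list, a query typed without accents should probably still find the accented word.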
Why Local AI Matters
This isn't a philosophical position. It's a practical one.
Speed. WhisperKit uses Apple Silicon's Neural Engine. A two-minute voice message transcribes in under 10 seconds on an M1 MacBook, which is more than 12 times faster than real time. No upload time. No queue. No waiting for a server on another continent.
Cost. Cloud transcription services typically charge $15-30 per month. DropVox Pro is a one-time purchase. Your Mac is already doing the work -- there's nothing to bill monthly for.
Reliability. No internet? No problem. DropVox works on airplanes, in basements, in rural areas with spotty connectivity. The model is on your machine. It doesn't need permission from a server to function.
Privacy. Zero network requests. Zero telemetry. Zero data collection. The app cannot send your audio anywhere because it contains no code to do so. For journalists, healthcare professionals, lawyers, or anyone handling sensitive communications, this isn't a nice-to-have. It's a requirement.
Free Tier vs Pro
DropVox has a genuine free tier. Not a 7-day trial. Not a crippled version that nags you into paying. A real free tier:
Free: 3 transcriptions per day, 60-second maximum duration. Enough to handle daily voice messages and decide if the app works for you.
Pro ($9.99 USD / R$49.90 BRL): Unlimited transcriptions, unlimited duration, all 5 model sizes, all 13 languages. One-time purchase. Pay once, use it forever.
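The two tiers boil down to a small gate that runs before a transcription starts. The sketch below uses the limits stated above (3 per day, 60 seconds) but is otherwise hypothetical: the `Tier` and `GateResult` types and the `checkAllowance` function are illustrative names, not DropVox's real licensing code, and how the daily count is stored is left open.

```swift
import Foundation

// Limits taken from the free tier description above.
let freeDailyLimit = 3
let freeMaxDuration: TimeInterval = 60

enum Tier { case free, pro }

enum GateResult: Equatable {
    case allowed
    case dailyLimitReached
    case tooLong
}

// Decides whether a transcription may start.
// `usedToday` is the number of transcriptions already done today (assumed to
// be tracked elsewhere); `duration` is the audio length in seconds.
func checkAllowance(tier: Tier, usedToday: Int, duration: TimeInterval) -> GateResult {
    guard tier == .free else { return .allowed }   // Pro has no limits
    if usedToday >= freeDailyLimit { return .dailyLimitReached }
    if duration > freeMaxDuration { return .tooLong }
    return .allowed
}
```

In this sketch a free-tier file of exactly 60 seconds is allowed, and the daily limit is checked before duration, so a user who has run out for the day sees that message first.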
Why One-Time Pricing
DropVox uses your hardware. There are no servers processing your audio. No API costs that scale with usage. No cloud infrastructure to maintain per user. Charging a monthly subscription for software that runs entirely on your machine would be wrong.
When you buy DropVox Pro, you own it. No renewal anxiety. No "am I getting enough value this month" calculations. It's yours.
Why Multi-Currency
The BRL price is not a direct USD conversion. Purchasing power in Brazil is different, and pricing should reflect that. R$49.90 is a fair price for Brazilian users, and I'd rather have a larger community of users at an honest local price than squeeze a few extra reais from a smaller audience.
The 24-Day Build
DropVox started as a Python prototype in January. A weekend project using rumps and OpenAI's Whisper that proved the concept worked. I used it every day.
But Python had hard limits. No drag-and-drop without fragile PyObjC bindings. Distribution required bundling a Python runtime. Performance wasn't where it needed to be. The menu bar app worked, but it wasn't a product.
On January 22 I started over in Swift.
The rewrite used SwiftUI for the interface, WhisperKit for transcription, and Swift's actor model for thread-safe concurrency. Code signing and notarization went through Apple's Developer ID program, and builds were automated with GitHub Actions.
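For readers who haven't used Swift actors, here is a minimal sketch of the idea, assuming a transcript cache shared between the UI and background tasks. The `Transcriber` protocol is a hypothetical stand-in for the transcription engine and is not WhisperKit's API; `TranscriptionStore` is likewise an illustrative name, not a type from DropVox.

```swift
import Foundation

// Hypothetical protocol standing in for the transcription engine.
// This is NOT WhisperKit's API.
protocol Transcriber: Sendable {
    func transcribe(fileAt url: URL) async throws -> String
}

// An actor isolates its mutable state: only one task at a time can read or
// write `results`, so no locks or dispatch queues are needed.
actor TranscriptionStore {
    private var results: [URL: String] = [:]
    private let engine: any Transcriber

    init(engine: any Transcriber) {
        self.engine = engine
    }

    // Returns a cached transcript, or produces and stores a new one.
    func transcript(for url: URL) async throws -> String {
        if let cached = results[url] { return cached }
        let text = try await engine.transcribe(fileAt: url)
        results[url] = text
        return text
    }

    // Number of transcripts stored so far.
    var count: Int { results.count }
}
```

One caveat: actors are reentrant, so two calls for the same file that overlap at the `await` can both run the engine. The dictionary stays consistent either way, but a real app that wants to avoid duplicate work would also need to track in-flight jobs.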
The timeline:
- Week 1: Core transcription, model selection, language support
- Week 2: Drop zone, clipboard integration, transcription history
- Week 3: Licensing, payment, code signing, notarization, CI/CD
- Final days: Polish, edge cases, .opus format handling, launch prep
No weekends off. No scope creep. Every feature either shipped or got cut.
The lesson for other indie builders: a working prototype is not wasted work even if you throw away all the code. The Python version validated the idea, revealed the UX priorities, and proved demand existed. The Swift version built on all of that knowledge. The rewrite took 24 days. Without the prototype, it would likely have taken months of guessing.
What's Next
DropVox 1.0 is the foundation. The roadmap includes:
- Share Extension -- Right-click any audio file in any app, share to DropVox, get the transcription. System-level integration without switching apps.
- Speaker diarization -- Identifying who said what in multi-person recordings.
- Real-time transcription -- Live microphone input with streaming output.
- App Store availability -- Under consideration, pending v1.0 traction data.
Try It
DropVox is available now at dropvox.app.
The free tier costs nothing and requires no account, no email, and no credit card. Download it, transcribe a voice message, see for yourself.
If it becomes part of your daily workflow -- and based on my own usage, it will -- Pro is $9.99 once. That's less than one month of a typical cloud transcription subscription, and it works forever.
Your voice messages are private. Your transcription tool should be too.
Questions about DropVox? Feature requests? Reach out on GitHub or visit dropvox.app.