Your personal voice assistant that types for you, no internet required.
Press a hotkey, speak, and watch your words appear in any app. It's like magic, but it's just really good technology.
- Two Hotkeys, Two Superpowers:
  - Cmd+Shift+V: Quick voice-to-text (super fast)
  - Cmd+Shift+O: Voice + screenshot OCR (context-aware; catches UI text, app names, buttons, etc.)
- Smart Text Fixer: Auto-corrects grammar & punctuation using a tiny local AI model (0.5s, not 15s)
- Native OCR: Apple Vision Framework reads your screen (zero extra RAM, zero cloud)
- Offline & Private: Everything happens on your Mac. Your voice never leaves your computer.
- Auto-Paste: Types the transcribed text directly into whatever app you're using
- Floating UI: Minimal recording window that plays nice with fullscreen apps
- Go to Releases and grab:
  - Apple Silicon Mac (M1/M2/M3/M4): `ASTRA-x.x.x-arm64.dmg`
  - Intel Mac: `ASTRA-x.x.x.dmg`
- Drag ASTRA to the Applications folder
- First time? macOS may block the app because this open-source project's builds are not Apple-notarized. To open it:
  - Right-click ASTRA in Applications and choose Open
  - If macOS still blocks it, go to System Settings → Privacy & Security
  - Scroll to Security and click Open Anyway for ASTRA
  - Launch ASTRA again
On first run, the app will:
- Ask for microphone permission
- Ask for Accessibility permission so it can paste text and read selected text
- Download a speech model (~400MB, one-time)
- Optionally install Ollama for text polishing (we'll cover this below)
ASTRA's downloadable builds are currently ad-hoc signed, not Apple-notarized. This means macOS may show a warning on first launch. The source code is public, and technical users can build it themselves if they prefer.
To verify a downloaded DMG, compare its SHA256 checksum with the checksums published in RELEASE_CHECKSUMS.txt:
    shasum -a 256 ASTRA-2.0.0-arm64.dmg

| What You Want | Press | What Happens |
|---|---|---|
| Quick voice note | Cmd+Shift+V | Speak → Stop → Text appears in your app |
| Voice + screen context | Cmd+Shift+O | Speak → Screenshot → OCR context → Text appears |

Tip: Press Escape during recording to cancel.
Want your transcribed text to be grammar-perfect? Install Ollama, a local model runner that hosts a tiny AI model on your Mac.
    # Install (if you haven't)
    brew install ollama
    # Start the server (it keeps this terminal busy, so run the next command in a new tab)
    ollama serve
    # Pull our recommended model (397MB, ~0.5s per polish)
    ollama pull qwen2.5:0.5b

That's it! The app automatically connects to Ollama and polishes your text.
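For readers curious what a polish call could look like, here is a minimal sketch of building a request for Ollama's `/api/generate` endpoint and reading the reply. The helper names and the prompt wording are hypothetical and not taken from ASTRA's source; only the endpoint path and the `model`, `prompt`, `stream`, and `response` fields come from Ollama's REST API.

```typescript
// Hypothetical helpers: ASTRA's real implementation may look different.
interface PolishRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Build a non-streaming request for Ollama's /api/generate endpoint.
function buildPolishRequest(
  rawText: string,
  model = "qwen2.5:0.5b",
  baseUrl = "http://localhost:11434"
): PolishRequest {
  // Assumed prompt wording, for illustration only.
  const prompt =
    "Fix grammar and punctuation. Return only the corrected text.\n\n" + rawText;
  return {
    // Strip trailing slashes so the path is joined cleanly.
    url: baseUrl.replace(/\/+$/, "") + "/api/generate",
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model, prompt, stream: false }),
    },
  };
}

// Read the "response" field of a parsed reply; fall back to the raw
// transcript when the reply is missing, empty, or an error object.
function extractPolishedText(payload: unknown, fallback: string): string {
  if (typeof payload === "object" && payload !== null && "response" in payload) {
    const value = (payload as { response: unknown }).response;
    if (typeof value === "string" && value.trim() !== "") {
      return value.trim();
    }
  }
  return fallback;
}
```

The returned `url` and `init` can be passed to `fetch(url, init)`. Falling back to the raw transcript keeps dictation usable when Ollama is not running, which fits its optional role here.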
| Model | Size | Speed | Verdict |
|---|---|---|---|
| qwen2.5:0.5b | 397MB | ~0.5s | Perfect |
| qwen2.5:1.5b | 1GB | ~1s | Good |
| phi4-mini | 2.5GB | ~9s | Too slow |
| gemma4 | 7GB | ~15s | Thinking mode = unusable |
If you want to try other models, just run `ollama pull <model-name>` and change it in Settings.
Click the tray icon (in the menu bar) → Settings to customize:
| Option | What It Does |
|---|---|
| Hotkeys | Change Cmd+Shift+V / Cmd+Shift+O to whatever you like |
| Auto-Paste | Toggle automatic typing into your active app |
| Polish Mode | Turn AI text fixing on/off |
| Ollama URL | Usually http://localhost:11434 (don't change unless you know what you're doing) |
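As a rough illustration of how these options fit together, the sketch below models them as a single settings object with two sanity checks. The interface name, field names, and the default values for Auto-Paste and Polish Mode are assumptions made for this example; ASTRA's real settings format may differ.

```typescript
// Hypothetical shape: field names are illustrative, not ASTRA's actual schema.
interface AstraSettings {
  quickHotkey: string;
  visionHotkey: string;
  autoPaste: boolean;
  polishMode: boolean;
  ollamaUrl: string;
}

// Hotkeys and URL mirror the table above; the two booleans are assumed defaults.
const DEFAULT_SETTINGS: AstraSettings = {
  quickHotkey: "Cmd+Shift+V",
  visionHotkey: "Cmd+Shift+O",
  autoPaste: true,
  polishMode: true,
  ollamaUrl: "http://localhost:11434",
};

// Apply user overrides on top of the defaults and reject obvious mistakes.
function withOverrides(overrides: Partial<AstraSettings>): AstraSettings {
  const merged: AstraSettings = { ...DEFAULT_SETTINGS, ...overrides };
  if (merged.quickHotkey === merged.visionHotkey) {
    throw new Error("The two hotkeys must differ");
  }
  if (!/^https?:\/\//.test(merged.ollamaUrl)) {
    throw new Error("Ollama URL must start with http:// or https://");
  }
  return merged;
}
```

For example, `withOverrides({ polishMode: false })` turns off text fixing while leaving the hotkeys and Ollama URL at their defaults.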
- Your Voice: Recorded via macOS microphone
- Whisper.cpp: Transcribes locally (no cloud, complete privacy)
- Ollama (optional): Fixes grammar/punctuation in ~0.5s
- Auto-Type: Uses macOS accessibility APIs to type into your active app
For Vision Mode (Cmd+Shift+O):
- Screenshot: Captured via `screencapture`
- Apple Vision Framework: Extracts text from screen (native, zero extra RAM)
- LLM Context: Uses extracted text to correctly handle UI elements, button names, etc.
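The last Vision Mode step can be pictured as folding the OCR output into the prompt sent to the local model. The sketch below is one way that might be done; the function name, the prompt wording, and the 2000-character context budget are all assumptions for illustration, not ASTRA's actual prompt.

```typescript
// Hypothetical prompt builder for Vision Mode: combines the dictated text
// with on-screen text so the model can match UI names and button labels.
function buildVisionPrompt(
  transcript: string,
  ocrText: string,
  maxContextChars = 2000
): string {
  // Collapse whitespace so OCR line breaks do not waste the context budget,
  // then cap the length to keep the prompt small for a tiny model.
  const context = ocrText.replace(/\s+/g, " ").trim().slice(0, maxContextChars);
  const lines: string[] = [
    "Fix grammar and punctuation in the dictated text.",
    "Return only the corrected text.",
  ];
  // Only mention screen context when OCR actually found something.
  if (context !== "") {
    lines.push("When a word sounds like a name visible on screen, use the on-screen spelling.");
    lines.push("On-screen text: " + context);
  }
  lines.push("Dictated text: " + transcript.trim());
  return lines.join("\n");
}
```

Capping the OCR text is a design choice worth keeping in some form: a full screen of text can be much longer than the dictated sentence, and a small model may be slower or less reliable with a very long prompt.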
- Make sure you have `whisper-cli` in your PATH, or use the pre-built DMG, which includes it.
- Run `ollama serve` in Terminal
- Check that Settings → Ollama URL is `http://localhost:11434`
- Run `ollama list` to see available models
- Go to System Settings → Privacy & Security → Accessibility and enable ASTRA
- Open an issue and we'll help!
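For the Ollama checks above, it can help to confirm that the configured model is actually installed. Ollama's `GET /api/tags` endpoint returns the same list that `ollama list` prints; the sketch below checks an already parsed response for a model name. The function name is hypothetical and is not part of ASTRA or Ollama.

```typescript
// Hypothetical check: given the parsed JSON from GET /api/tags,
// report whether a model with the exact given name is installed.
function hasModel(tagsPayload: unknown, wanted: string): boolean {
  if (typeof tagsPayload !== "object" || tagsPayload === null) {
    return false;
  }
  const models = (tagsPayload as { models?: unknown }).models;
  if (!Array.isArray(models)) {
    return false;
  }
  // Each entry is expected to carry a "name" such as "qwen2.5:0.5b".
  return models.some(
    (m) =>
      typeof m === "object" &&
      m !== null &&
      (m as { name?: unknown }).name === wanted
  );
}
```

If this returns `false` for `qwen2.5:0.5b`, running `ollama pull qwen2.5:0.5b` again is a reasonable next step.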
    git clone https://github.com/amateur-dev/local-hotkey-voice-mac-app.git
    cd local-hotkey-voice-mac-app
    npm install
    npm start

Requirements:
- Node.js 18+
- FFmpeg (`brew install ffmpeg`)
- `whisper-cli` in PATH
- Electron: Desktop app framework
- Whisper.cpp: Local speech-to-text (a C/C++ port of OpenAI's Whisper, built for fast local inference)
- Ollama: Local LLM runtime with MLX optimization for Apple Silicon (2-4x faster on M1/M2/M3/M4)
- Apple Vision Framework: Native OCR (zero extra RAM)
- Node.js: Backend magic
- Star the repo if it made your life easier
- Report bugs at Issues
- Suggest features. We're all ears!
Dipesh Sukhani
- dipeshsukhani.dev
- me@dipeshsukhani.dev
- LinkedIn
- @dipesh_sukhani
- @amateur-dev
MIT License. Use it, break it, improve it. That's the spirit.

