amateur-dev/Astra

ASTRA 🎤✨

Your personal voice assistant that types for you, no internet required.

Press a hotkey, speak, and watch your words appear in any app. It's like magic, but it's just really good technology.

⬇️ Download for macOS

What's This Thing Do?

  • Two Hotkeys, Two Superpowers:
    • Cmd+Shift+V → Quick voice-to-text (super fast ⚡)
    • Cmd+Shift+O → Voice + screenshot OCR (context-aware, catches UI text, app names, buttons, etc.)
  • Smart Text Fixer → Auto-corrects grammar & punctuation using a tiny local AI model (about 0.5s per polish, versus about 15s with larger models)
  • Native OCR → Apple Vision Framework reads your screen (no extra model to keep in RAM, no cloud)
  • Offline & Private → Everything happens on your Mac. Your voice never leaves your computer.
  • Auto-Paste → Types the transcribed text directly into whatever app you're using
  • Floating UI → Minimal recording window that plays nice with fullscreen apps

Quick Start

1️⃣ Download & Install

  1. Go to Releases and grab:
    • Apple Silicon Mac (M1/M2/M3/M4): ASTRA-x.x.x-arm64.dmg
    • Intel Mac: ASTRA-x.x.x.dmg
  2. Drag to Applications folder
  3. First time? macOS may block the app because this project is open source and not Apple-notarized. To open it:
    • Try right-clicking ASTRA in Applications and choose Open
    • If macOS still blocks it, go to System Settings → Privacy & Security
    • Scroll to Security and click Open Anyway for ASTRA
    • Launch ASTRA again

2️⃣ First Launch

On first run, the app will:

  • Ask permission for microphone 🎙️
  • Ask permission for Accessibility so it can paste text and read selected text
  • Download a speech model (~400MB, one-time)
  • Optionally install Ollama for text polishing (we'll cover this below)

About macOS Warnings

ASTRA's downloadable builds are currently ad-hoc signed, not Apple-notarized. This means macOS may show a warning on first launch. The source code is public, and technical users can build it themselves if they prefer.

To verify a downloaded DMG, compare its SHA256 checksum with the checksums published in RELEASE_CHECKSUMS.txt:

shasum -a 256 ASTRA-2.0.0-arm64.dmg

3️⃣ Start Using

| What You Want | Press | What Happens |
|---|---|---|
| Quick voice note | Cmd+Shift+V | Speak → Stop → Text appears in your app |
| Voice + screen context | Cmd+Shift+O | Speak → Screenshot → OCR context → Text appears |

Tip: Press Escape during recording to cancel.


The "Smart Polish" Feature (Optional but 🔥)

Want cleaner grammar and punctuation in your transcribed text? Install Ollama, a local LLM runner, and let a tiny model polish your text right on your Mac.

Setup Ollama

# Install (if you haven't)
brew install ollama

# Start the server (stays in the foreground; use `brew services start ollama` to run it in the background instead)
ollama serve

# Pull our recommended model (397MB, ~0.5s per polish)
ollama pull qwen2.5:0.5b

That's it! The app automatically connects to Ollama and polishes your text.
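
Curious what a polish request might look like? Here's a rough sketch against Ollama's /api/generate endpoint. The prompt wording and the helper names (json_escape, build_polish_request, polish) are made up for illustration; ASTRA's actual prompt and request code may differ.

```shell
#!/usr/bin/env bash
# Illustrative only: the prompt text below is an assumption, not ASTRA's real prompt.

json_escape() {
  # Escape backslashes and double quotes; assumes single-line input.
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

build_polish_request() {
  # Build the JSON body for Ollama's /api/generate endpoint.
  local model="$1" text="$2"
  printf '{"model":"%s","prompt":"Fix grammar and punctuation. Return only the corrected text: %s","stream":false}' \
    "$model" "$(json_escape "$text")"
}

polish() {
  # POST the request to the local Ollama server and print the raw JSON reply.
  local text="$1" model="${2:-qwen2.5:0.5b}"
  curl -s "${OLLAMA_URL:-http://localhost:11434}/api/generate" \
    -H 'Content-Type: application/json' \
    -d "$(build_polish_request "$model" "$text")"
}
```

With ollama serve running, polish "i think its working now" prints a JSON reply. The corrected text is in its response field.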

Why qwen2.5:0.5b?

| Model | Size | Speed | Verdict |
|---|---|---|---|
| qwen2.5:0.5b | 397MB | ~0.5s | ⭐ Perfect |
| qwen2.5:1.5b | 1GB | ~1s | Good |
| phi4-mini | 2.5GB | ~9s | Too slow |
| gemma4 | 7GB | ~15s | Thinking mode = unusable |

If you want to try other models, just run ollama pull <model-name> and change it in Settings.


Settings & Config

Click the Tray Icon (in menu bar) → Settings to customize:

| Option | What It Does |
|---|---|
| Hotkeys | Change Cmd+Shift+V / Cmd+Shift+O to whatever you like |
| Auto-Paste | Toggle automatic typing into your active app |
| Polish Mode | Turn AI text fixing on/off |
| Ollama URL | Usually http://localhost:11434 (don't change unless you know what you're doing) |

Under the Hood

  1. Your Voice → Recorded via macOS microphone
  2. Whisper.cpp → Transcribes locally (no cloud, complete privacy)
  3. Ollama (optional) → Fixes grammar/punctuation in ~0.5s
  4. Auto-Type → Uses Mac's accessibility APIs to type into your active app
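
You can try the transcription step by hand with the same tools. The sketch below is a guess at the general shape, not ASTRA's actual code: it converts a recording to the 16 kHz mono WAV that whisper.cpp expects, then runs whisper-cli on it. The model path, the file names, and the run and transcribe helpers are all assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the transcription step; paths and helper names are assumptions.

run() {
  # Print the command when DRY_RUN=1, otherwise execute it.
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

transcribe() {
  local input="$1" model="$2" wav
  wav="${input%.*}.16k.wav"
  # whisper.cpp expects 16 kHz, mono, 16-bit PCM WAV input.
  run ffmpeg -y -loglevel error -i "$input" -ar 16000 -ac 1 -c:a pcm_s16le "$wav" &&
    # -m selects the model file, -f the audio file, -nt drops timestamps from the output.
    run whisper-cli -m "$model" -f "$wav" -nt
}
```

For example, DRY_RUN=1 transcribe note.m4a models/ggml-base.en.bin prints the two commands without running them. Drop DRY_RUN to transcribe for real (needs FFmpeg and whisper-cli in your PATH, plus a downloaded model file).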

For Vision Mode (Cmd+Shift+O):

  1. Screenshot → Captured via screencapture
  2. Apple Vision Framework → Extracts text from screen (native, no extra model to keep in RAM)
  3. LLM Context → Uses extracted text to correctly handle UI elements, button names, etc.

Troubleshooting

"Whisper not found" or "Library not loaded"

  • Make sure you have whisper-cli in your PATH, or use the pre-built DMG which includes it.

Ollama not responding

  • Run ollama serve in Terminal
  • Check Settings → Ollama URL is http://localhost:11434
  • Run ollama list to see available models
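
The checks above can be rolled into one small script. This is a sketch, not something ASTRA ships: it assumes the default URL and the recommended model, and the function names are made up. It pings Ollama's /api/tags endpoint, then looks for the model in the first column of ollama list.

```shell
#!/usr/bin/env bash
# Hypothetical diagnostic helpers; defaults below are assumptions taken from this README.

ollama_reachable() {
  # Succeeds when the Ollama server answers on its model-listing endpoint.
  curl -sf --max-time 3 "${1:-http://localhost:11434}/api/tags" >/dev/null
}

model_installed() {
  # Reads `ollama list` output on stdin: row 1 is the header, column 1 is the model name.
  awk 'NR > 1 {print $1}' | grep -Fqx -- "$1"
}

diagnose_ollama() {
  local url="${1:-http://localhost:11434}" model="${2:-qwen2.5:0.5b}"
  if ! ollama_reachable "$url"; then
    echo "Ollama is not answering at $url; try: ollama serve"
    return 1
  fi
  if ! ollama list | model_installed "$model"; then
    echo "Model $model is missing; try: ollama pull $model"
    return 1
  fi
  echo "Ollama is up and $model is installed"
}
```

Run diagnose_ollama with no arguments to check the defaults, or pass a URL and model name if you changed them in Settings.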

Hotkeys not working?

  • Go to System Settings → Privacy & Security → Accessibility and enable ASTRA

Still stuck?


Build from Source (For Developers)

git clone https://github.com/amateur-dev/local-hotkey-voice-mac-app.git
cd local-hotkey-voice-mac-app
npm install
npm start

Requirements:

  • Node.js 18+
  • FFmpeg (brew install ffmpeg)
  • whisper-cli in PATH

Tech Stack

  • Electron: Desktop app framework
  • Whisper.cpp: Local speech-to-text (OpenAI's Whisper, but faster)
  • Ollama: Local LLM with MLX optimization for Apple Silicon (2-4x faster on M1/M2/M3/M4)
  • Apple Vision Framework: Native OCR (no extra model to keep in RAM)
  • Node.js: Backend magic

Like This? ❤️

  • ⭐ Star the repo if it made your life easier
  • 🐛 Report bugs at Issues
  • 💡 Suggest features, we're all ears!

Author

Dipesh Sukhani


MIT License: Use it, break it, improve it. That's the spirit. 🚀

About

A local-first, global hotkey voice dictation app for macOS powered by OpenAI Whisper. Fast, private, and open-source.
