You can run NVIDIA Parakeet locally on a Mac today, at about 80ms latency, with no Python and no cloud. It dictates faster than Apple’s own built-in dictation on the same hardware. Here’s how to get it running, and why it’s so fast on Apple Silicon.
What is NVIDIA Parakeet?
Parakeet is a family of speech recognition models published by NVIDIA. The version most Mac users care about is Parakeet-TDT (Token-and-Duration Transducer), a streaming-friendly architecture that predicts each token together with its duration, so the decoder can skip ahead instead of stepping through the audio one frame at a time. The result is a model that hits real-time speech-to-text with a fraction of the latency of an encoder-decoder model like Whisper.
The current production release at the time of writing is Parakeet v3. It supports 25 languages and the model file weighs around 2.3 GB. NVIDIA released it under an open license, which is why it’s free to embed in third-party apps.
Why Parakeet runs fast on Apple Silicon
Parakeet is a Conformer-Transducer model. Two architectural choices make it land well on the Apple Neural Engine:
- It’s streaming-native. It consumes audio in chunks and emits tokens as it goes. Whisper, by contrast, was designed for whole-utterance transcription: it expects a fixed audio window and processes it in one pass. Streaming lets text appear word-by-word as you speak.
- It skips ahead while decoding. The TDT head predicts token + duration jointly, so the decoder can jump forward by the predicted duration instead of evaluating every audio frame. The model can be quantized and converted to Core ML, then dispatched to the Neural Engine for the heavy parts.
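The token-plus-duration idea can be sketched as a toy greedy decoding loop. This is only an illustration under simplifying assumptions: `Prediction` and `toyPredict` are hypothetical stand-ins for the model's joint output, not FluidAudio or NVIDIA code, and the loop always advances at least one frame.

```swift
// Toy illustration of token-and-duration decoding.
// `Prediction` and `toyPredict` are hypothetical stand-ins, not a real API.

struct Prediction {
    let token: String?   // nil means "blank": nothing emitted at this frame
    let duration: Int    // how many encoder frames to advance
}

// Hypothetical model output at frame `t`: pretend every word spans 4 frames.
func toyPredict(frame t: Int, words: [String]) -> Prediction {
    let framesPerWord = 4
    let index = t / framesPerWord
    if t % framesPerWord == 0, index < words.count {
        return Prediction(token: words[index], duration: framesPerWord)
    }
    return Prediction(token: nil, duration: 1)
}

// Greedy loop: emit the token, then jump ahead by the predicted duration.
func decode(frameCount: Int, words: [String]) -> (tokens: [String], steps: Int) {
    var tokens: [String] = []
    var t = 0
    var steps = 0
    while t < frameCount {
        let p = toyPredict(frame: t, words: words)
        if let token = p.token { tokens.append(token) }
        t += max(1, p.duration)   // simplification: always move forward
        steps += 1
    }
    return (tokens, steps)
}

let result = decode(frameCount: 12, words: ["run", "it", "locally"])
print(result.tokens.joined(separator: " "), "in", result.steps, "steps")
// prints "run it locally in 3 steps"
```

A loop that advanced one frame at a time would need 12 steps for the same 12 frames; with durations it needs 3. That frame skipping is where the decoding savings come from.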
The Mac runtime that makes this work is FluidAudio, an open-source Swift port of Parakeet optimized for the Apple Neural Engine. It’s what Dictato ships internally.
The end-to-end number on M-series chips: about 80ms from microphone to typed text. For context, that’s less than a typical network round-trip to a cloud STT API, and well below the threshold where users perceive lag.
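To put 80ms in audio terms, here is a small arithmetic sketch. It assumes 16 kHz mono input, which is an assumption for illustration and not a figure from this article, and it says nothing about the chunk size FluidAudio or Dictato actually use.

```swift
// Assumption: 16 kHz mono audio. The sample rate is not stated in this article.
let sampleRate = 16_000

// Number of samples covered by a time window given in milliseconds.
func samples(forMilliseconds ms: Int, sampleRate: Int) -> Int {
    return sampleRate * ms / 1000
}

// Split a buffer of `totalSamples` into fixed-size chunks, returned as ranges.
// The final chunk may be shorter than `chunkSize`.
func chunkRanges(totalSamples: Int, chunkSize: Int) -> [Range<Int>] {
    precondition(chunkSize > 0, "chunkSize must be positive")
    return stride(from: 0, to: totalSamples, by: chunkSize).map { start in
        start..<min(start + chunkSize, totalSamples)
    }
}

let window = samples(forMilliseconds: 80, sampleRate: sampleRate)
let oneSecond = chunkRanges(totalSamples: sampleRate, chunkSize: window)
print(window, oneSecond.count)   // prints "1280 13"
```

At that rate, 80ms is 1,280 samples, and one second of audio splits into 12 full windows plus a shorter 13th.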
How to run NVIDIA Parakeet on your Mac
There are two paths, depending on what you’re trying to do.
Path 1: Just dictate (recommended)
Install Dictato. It bundles Parakeet, the FluidAudio runtime, and a system-wide dictation overlay that types into any Mac app: Mail, Slack, Xcode, VS Code, Notion, a browser tab, anything that accepts text input.
- Download from dicta.to/download
- Open the app, grant microphone and accessibility permissions
- Pick Parakeet in Engine Settings (or leave it on Auto)
- Press your hotkey, speak, release
That’s the whole flow. There’s no Python to install and no model files to manage by hand. 7-day free trial, then 9.99€ for 2 years, no subscription.
Path 2: Build it yourself
If you want to integrate Parakeet directly into your own Swift code, FluidAudio is the package to look at. It’s MIT-licensed and exposes a clean ParakeetEngine API. You’ll handle audio capture, the engine session, and text output yourself. The model isn’t the hard part; audio buffering and text injection through Apple’s Accessibility and CGEvent APIs are. Plan on a weekend.
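For the text-injection half, one common approach is to post synthetic keyboard events that carry the text as a Unicode payload. The sketch below is a minimal version of that idea, not necessarily how Dictato does it. The helper names `utf16Chunks` and `typeText` are hypothetical, and the 20-unit chunk size is a conservative assumption rather than a documented limit. Posting events requires the Accessibility permission.

```swift
import Foundation
#if canImport(CoreGraphics)
import CoreGraphics
#endif

// Split text into pieces of at most `maxUnits` UTF-16 code units without
// cutting a Character (an emoji, an accented letter) in half. A single
// Character longer than `maxUnits` is kept whole in its own chunk.
func utf16Chunks(of text: String, maxUnits: Int) -> [[UInt16]] {
    var chunks: [[UInt16]] = []
    var current: [UInt16] = []
    for character in text {
        let units = Array(String(character).utf16)
        if !current.isEmpty && current.count + units.count > maxUnits {
            chunks.append(current)
            current = []
        }
        current.append(contentsOf: units)
    }
    if !current.isEmpty { chunks.append(current) }
    return chunks
}

#if canImport(CoreGraphics)
// Post key-down/key-up pairs whose Unicode payload is the text to type.
// Returns false if an event could not be created.
func typeText(_ text: String, maxUnits: Int = 20) -> Bool {
    for chunk in utf16Chunks(of: text, maxUnits: maxUnits) {
        guard
            let down = CGEvent(keyboardEventSource: nil, virtualKey: 0, keyDown: true),
            let up = CGEvent(keyboardEventSource: nil, virtualKey: 0, keyDown: false)
        else { return false }
        down.keyboardSetUnicodeString(stringLength: chunk.count, unicodeString: chunk)
        up.keyboardSetUnicodeString(stringLength: chunk.count, unicodeString: chunk)
        down.post(tap: .cghidEventTap)
        up.post(tap: .cghidEventTap)
    }
    return true
}
#endif
```

Keeping the chunking separate from the event posting means the string handling can be checked without touching the system event stream.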
Path 1: 30 seconds. Path 2: a weekend. If you landed here looking for a “Parakeet Mac app,” path 1 is the right one.
Parakeet vs the alternatives on Mac
A quick orientation if you’re choosing between Mac dictation engines:
| | Parakeet | WhisperKit | Apple SpeechAnalyzer | Qwen3-ASR |
|---|---|---|---|---|
| Latency on Apple Silicon | ~80ms | 150-300ms | 150-400ms | 200-400ms |
| Languages | 25 | 99 | 20 | 30 |
| Best for | Live English dictation | Multilingual / rare languages | macOS 26 built-in flow | 30-language native hints |
| Model size | ~2.3 GB | ~600 MB | system | ~600 MB |
| Runs offline | yes | yes | yes (macOS 26) | yes |
No single engine wins everywhere. Parakeet wins live English dictation on M-series Macs. The deep-dive comparison with numbers is in Parakeet vs Whisper vs Apple Speech on Mac.
Hardware requirements
You need an Apple Silicon Mac (M1, M2, M3, or M4). Intel Macs aren’t supported, because Parakeet relies on the Neural Engine for its speed. macOS 14 (Sonoma) is the minimum; on macOS 26 you also get the Apple Intelligence proofread layer as a bonus.
Disk-wise, plan for about 2.5 GB free for the model and runtime. 8 GB of RAM is enough for dictation: the model runs on the Neural Engine rather than the CPU, so background apps don’t compete with Parakeet for cycles. The built-in mic is fine; AirPods and USB mics work too.
No discrete GPU, no CUDA, no internet connection at runtime.
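If you’re building your own integration, these requirements can be checked at launch. The following is a rough sketch using Foundation only; the function names are hypothetical and the thresholds simply mirror the numbers above.

```swift
import Foundation

// Thresholds taken from the requirements described above.
let minimumMacOSMajor = 14
let requiredFreeGB = 2.5

// Pure check, so it can be exercised with made-up inputs.
// Returns a list of unmet requirements; empty means the machine qualifies.
func requirementProblems(isAppleSilicon: Bool, macOSMajor: Int, freeDiskGB: Double) -> [String] {
    var problems: [String] = []
    if !isAppleSilicon { problems.append("Apple Silicon required") }
    if macOSMajor < minimumMacOSMajor { problems.append("macOS 14 or later required") }
    if freeDiskGB < requiredFreeGB { problems.append("about 2.5 GB of free disk needed") }
    return problems
}

// Gather the real values for the machine this runs on.
func currentMachineProblems() -> [String] {
    // On macOS, an arm64 build means Apple Silicon. A binary running under
    // Rosetta is compiled for x86_64 and would report false here.
    #if arch(arm64)
    let appleSilicon = true
    #else
    let appleSilicon = false
    #endif
    let os = ProcessInfo.processInfo.operatingSystemVersion
    let attributes = try? FileManager.default.attributesOfFileSystem(forPath: NSHomeDirectory())
    let freeBytes = (attributes?[.systemFreeSize] as? NSNumber)?.doubleValue ?? 0
    return requirementProblems(isAppleSilicon: appleSilicon,
                               macOSMajor: os.majorVersion,
                               freeDiskGB: freeBytes / 1_000_000_000)
}
```

Calling `currentMachineProblems()` at startup and showing the returned strings gives users a clear reason when Parakeet can’t be enabled.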
Privacy
Running Parakeet on your Mac means no audio leaves your machine. The microphone signal goes through the local model and lands in the active text field. That’s the whole path: no cloud round-trip, no transcript sitting on a server you don’t control, no API key tying usage back to an account.
This matters for healthcare, legal, and any work with regulated data. It also matters for everyday speed: the 80ms latency is only possible because there’s no network in the loop.
Common questions before you install
Will Parakeet drain my battery? Less than you’d expect. The Neural Engine is the most power-efficient compute path on Apple Silicon. A full hour of continuous dictation typically costs a few percent of battery on an M2 MacBook Air.
Can I switch engines per task? Yes. Dictato lets you pick a default and switch in one click: Parakeet for English drafts, WhisperKit for a Polish email, Apple SpeechAnalyzer for macOS 26-only flows. You can also leave it on Auto and Dictato picks a sensible default based on language and context.
Does it work in any Mac app? Yes. Dictato types via the Accessibility API, so it lands text in Mail, Messages, Notes, Slack, Discord, Xcode, VS Code, Cursor, Notion, Linear, Figma, browsers, anywhere the cursor blinks.
What if I’m on an Intel Mac? Parakeet won’t run with acceptable latency on Intel, so Dictato falls back to WhisperKit there. The real bottleneck is the missing Neural Engine, which only Apple Silicon has. If you’re serious about local dictation, upgrading the Mac is the right move, not switching apps.
Ready to dictate with NVIDIA Parakeet on your Mac? Dictato ships Parakeet (plus WhisperKit, Apple SpeechAnalyzer, and Qwen3-ASR), runs 100% offline at 80ms latency, and types into any app. 7-day free trial, then 9.99€ for 2 years. No subscription. Try it free →