Building a Virtual Microphone on macOS for STT + Video Conferencing
Ever wanted to pipe your microphone audio to a speech-to-text agent while simultaneously sending it to Zoom or Google Meet? You can — with a virtual microphone.
The Core Concept
macOS allows user-space audio drivers via Core Audio HAL (Hardware Abstraction Layer) plugins. These register as real audio devices that appear in System Preferences and any app's audio input picker. This is how apps like Loopback, BlackHole, and Soundflower work.
Architecture
The idea is simple: capture the real mic, fork the audio stream, and send one copy to your STT service and the other to a virtual mic device that your conferencing app reads from.
Real Mic → Your App → [fork]
├── STT Agent (websocket/API)
└── Virtual Mic Device → Zoom/Meet/etc
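In code, the fork is just fan-out: one captured chunk of samples handed to each downstream consumer. A toy sketch of that shape (`AudioSink` and `AudioFork` are illustrative names, not any real API):

```swift
// One captured chunk of samples goes to every registered consumer.
typealias AudioSink = ([Float]) -> Void

struct AudioFork {
    var sinks: [AudioSink] = []

    // Deliver the same samples to each branch
    // (e.g. the STT sender and the virtual mic writer).
    func consume(_ samples: [Float]) {
        for sink in sinks { sink(samples) }
    }
}
```

In the real pipeline, the two sinks are the websocket sender and the virtual device writer described below.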
The Audio HAL Plugin
The key piece is a .driver bundle installed in /Library/Audio/Plug-Ins/HAL/. You implement Apple's AudioServerPlugIn protocol — the modern replacement for the deprecated AudioHardwarePlugIn. This is C-based and runs inside the coreaudiod process space.
A few things to know:
- The HAL plugin's IO callback runs on a real-time thread — no allocations, no locks, no blocking calls.
- Your app and the plugin typically communicate through a ring buffer in shared memory.
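To make the ring-buffer idea concrete, here is a minimal single-producer/single-consumer sketch. It only illustrates the index arithmetic: a production HAL plugin would use a lock-free implementation over shared memory (TPCircularBuffer is a popular C option), and the `AudioRingBuffer` name here is my own, not an Apple API.

```swift
import Foundation

// Minimal single-producer/single-consumer ring buffer for Float32 samples.
// One slot is always left empty to distinguish "full" from "empty".
// Not realtime-safe; for illustration only.
final class AudioRingBuffer {
    private var storage: [Float]
    private var writeIndex = 0   // only touched by the producer
    private var readIndex = 0    // only touched by the consumer
    let capacity: Int

    init(capacity: Int) {
        self.capacity = capacity
        self.storage = [Float](repeating: 0, count: capacity)
    }

    // Producer side: returns how many samples were actually written.
    func write(_ samples: [Float]) -> Int {
        var written = 0
        for sample in samples {
            let next = (writeIndex + 1) % capacity
            if next == readIndex { break }   // buffer full, drop the rest
            storage[writeIndex] = sample
            writeIndex = next
            written += 1
        }
        return written
    }

    // Consumer side: pops up to `count` samples.
    func read(count: Int) -> [Float] {
        var out: [Float] = []
        while out.count < count, readIndex != writeIndex {
            out.append(storage[readIndex])
            readIndex = (readIndex + 1) % capacity
        }
        return out
    }
}
```

The producer (your app's tap callback) and consumer (the plugin's IO callback) each touch only their own index, which is what makes the lock-free variant possible on the real-time thread.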
- On macOS 12 and later, AudioDriverKit is Apple's preferred approach. It runs in userspace (no kext), is more tightly sandboxed, and is more future-proof.
The best open source reference is BlackHole — a minimal virtual audio device you can fork and modify.
The Fastest Path to a Prototype
You don't need to write a HAL plugin from scratch to get started. Here's the pragmatic approach:
- Install BlackHole (or fork it) as your virtual mic device
- Write a Swift app that captures the real mic via AVAudioEngine
- Route audio to both BlackHole's input and your STT service
- In Zoom/Meet, select "BlackHole" as the microphone
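Once BlackHole is installed, your app needs a handle on the device. A hedged sketch using the Core Audio property API (the `deviceID(named:)` helper and the partial-name match are my own; `kAudioObjectPropertyElementMain` requires macOS 12+):

```swift
import CoreAudio
import Foundation

// Look up a HAL audio device by (partial) name, e.g. "BlackHole".
func deviceID(named target: String) -> AudioDeviceID? {
    var address = AudioObjectPropertyAddress(
        mSelector: kAudioHardwarePropertyDevices,
        mScope: kAudioObjectPropertyScopeGlobal,
        mElement: kAudioObjectPropertyElementMain)

    // Ask the system object for the size of the device list, then fetch the IDs.
    var size: UInt32 = 0
    guard AudioObjectGetPropertyDataSize(AudioObjectID(kAudioObjectSystemObject),
                                         &address, 0, nil, &size) == noErr else { return nil }
    var ids = [AudioDeviceID](repeating: 0,
                              count: Int(size) / MemoryLayout<AudioDeviceID>.size)
    guard AudioObjectGetPropertyData(AudioObjectID(kAudioObjectSystemObject),
                                     &address, 0, nil, &size, &ids) == noErr else { return nil }

    // Compare each device's name against the target.
    for id in ids {
        var nameAddress = AudioObjectPropertyAddress(
            mSelector: kAudioObjectPropertyName,
            mScope: kAudioObjectPropertyScopeGlobal,
            mElement: kAudioObjectPropertyElementMain)
        var name: CFString = "" as CFString
        var nameSize = UInt32(MemoryLayout<CFString>.size)
        if AudioObjectGetPropertyData(id, &nameAddress, 0, nil, &nameSize, &name) == noErr,
           (name as String).contains(target) {
            return id
        }
    }
    return nil
}
```

With the `AudioDeviceID` in hand, you can open an output stream to the device and play your forked samples into it.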
Swift Audio Routing
Your app captures the real microphone and forks the stream:
import AVFoundation

let engine = AVAudioEngine()
let inputNode = engine.inputNode
let format = inputNode.outputFormat(forBus: 0)

// Tap the real mic. Note: AVAudioEngine delivers tap buffers on a
// background thread, not the real-time render thread, so light work
// (copying, enqueueing) is acceptable here.
inputNode.installTap(onBus: 0, bufferSize: 1024, format: format) { buffer, time in
    // Fork 1: send to your STT agent
    sendToSTTAgent(buffer: buffer)
    // Fork 2: write to the virtual device's ring buffer
    writeToVirtualDevice(buffer: buffer)
}

engine.prepare()
try engine.start()
For the STT fork, you'd typically stream PCM chunks over a websocket to Whisper, Deepgram, or your own agent. For the virtual device fork, you play the samples out to BlackHole's output device; BlackHole loops them back to its input side, where the conferencing app reads them.
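For the STT leg, most streaming providers accept 16-bit little-endian PCM, so the Float32 buffers need a conversion step. A minimal sketch (`pcm16Data` is an illustrative name; check your provider's required sample rate and encoding):

```swift
import Foundation

// Convert Float32 samples in [-1, 1] to 16-bit little-endian PCM,
// the wire format many STT streaming APIs expect (often called "linear16").
func pcm16Data(from samples: [Float]) -> Data {
    var data = Data(capacity: samples.count * MemoryLayout<Int16>.size)
    for sample in samples {
        let clamped = max(-1.0, min(1.0, sample))          // guard against clipping
        let value = Int16(clamped * Float(Int16.max))      // scale to Int16 range
        withUnsafeBytes(of: value.littleEndian) { data.append(contentsOf: $0) }
    }
    return data
}
```

Each chunk can then go out via `URLSessionWebSocketTask`'s `send(_:completionHandler:)` or a websocket library of your choice; many services also expect an initial configuration message, so check the provider's docs.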
When to Build a Custom Driver
The BlackHole approach works great for prototyping, but you'd want a custom HAL plugin or AudioDriverKit driver if you need:
- A single self-contained app (no separate driver install)
- Custom device naming and configuration
- Tighter control over latency and buffer sizes
- Distribution through the Mac App Store (AudioDriverKit only)
Summary
macOS makes this possible through its Core Audio HAL plugin system. The virtual microphone pattern — capture, fork, route — is well-established and used by professional audio tools. Start with BlackHole to validate the concept, then graduate to a custom driver if needed.