WhisperLiveKit: Real-Time, Local Speech-to-Text

Real-time speech-to-text with speaker diarization. 100% offline, open-source Python library for privacy-focused transcription.

Sep 26, 2025



GitHub - QuentinFuxa/WhisperLiveKit: Real-time & local speech-to-text, translation, and speaker diarization. With server & web UI.

Have you ever wished you could convert speech to text in real time during meetings, lectures, or interviews while accurately distinguishing between speakers? All of this can now be done entirely offline.

WhisperLiveKit is an open-source Python toolkit designed for real-time Speech-to-Text (STT) and Speaker Diarization. It combines speech recognition with speaker diarization to provide low-latency, high-accuracy transcription, and all processing is completed locally without relying on cloud services.

It uses streaming algorithms that transcribe audio as it arrives, producing text in real time while distinguishing between different speakers.
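To make the streaming idea concrete, here is a minimal sketch of the general pattern: audio arrives in small chunks, and a partial transcript is emitted after each one instead of waiting for the recording to finish. This is not WhisperLiveKit's API. The names `stream_transcribe`, `transcribe`, and `window_bytes` are hypothetical, and the transcriber is a stand-in callable you would replace with a real model.

```python
from typing import Callable, Iterable, Iterator


def stream_transcribe(
    chunks: Iterable[bytes],
    transcribe: Callable[[bytes], str],
    window_bytes: int = 32000,
) -> Iterator[str]:
    """Yield a partial transcript after every incoming audio chunk.

    `transcribe` is a hypothetical stand-in for a speech model: it takes
    raw audio bytes and returns text. Only the most recent `window_bytes`
    of audio are kept, so memory stays bounded on long recordings.
    """
    buffer = b""
    for chunk in chunks:
        # Append the new audio and trim the buffer to the sliding window.
        buffer = (buffer + chunk)[-window_bytes:]
        # Emit a result immediately rather than waiting for end of input.
        yield transcribe(buffer)


if __name__ == "__main__":
    # Dummy transcriber that only reports how much audio it received.
    def fake_model(audio: bytes) -> str:
        return f"{len(audio)} bytes heard"

    for partial in stream_transcribe([b"\x00" * 4, b"\x00" * 4], fake_model):
        print(partial)
```

Because the function is a generator, the caller receives each partial result as soon as it is ready, which is what keeps perceived latency low in a streaming design.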

Its most prominent feature is fully local processing. Users’ speech data never needs to be uploaded to the cloud, which removes network round trips from the latency path and keeps recordings on the user’s own machine, making it well suited to handling sensitive information.

