Top Python Libraries

Microsoft Open-Sources Production-Grade Speech AI

Microsoft open-sources VibeVoice, a 7B-parameter speech AI with 60-minute audio processing, speaker diarization, and support for 50+ languages. ASR is available; TTS was removed due to abuse risks.

Meng Li
Mar 31, 2026

VibeVoice is Microsoft’s open-source speech AI model. It was released with two core capabilities: Automatic Speech Recognition (ASR) and Text-to-Speech (TTS). The TTS code has since been removed because of potential abuse risks, so speech recognition is the only part that is currently usable.

I’ll put the project link at the end. First, let me explain what this project can actually do.
