Microsoft Open-Sources Production-Grade Speech AI

Microsoft open-sources VibeVoice, a 7B parameter speech AI with 60-minute audio processing, speaker diarization, and 50+ language support. ASR available, TTS removed due to abuse risks.

Mar 31, 2026



VibeVoice is Microsoft’s open-source speech AI model. It was built around two core capabilities: Automatic Speech Recognition (ASR) and Text-to-Speech (TTS). However, the TTS code has been removed because of potential abuse risks, so only the speech recognition part is currently usable.

I’ll put the project link at the end. First, let me explain what this project can actually do.

Continue reading this post for free, courtesy of Meng Li.


Top Python Libraries
