FFmpeg 8.0 adds Whisper speech recognition
FFmpeg 8.0 integrates Whisper for AI speech-to-text. Multilingual, open-source, and efficient for transcription, subtitles, and voice apps.
Whisper Integration in FFmpeg: AI-Powered Speech-to-Text, Available Starting with Version 8.0
Whisper is an automatic speech recognition (ASR) model developed by OpenAI. It supports multilingual speech-to-text transcription as well as speech translation, for example converting speech in another language directly into English text.
It suits voice assistants, meeting transcription, subtitle generation, multilingual communication, and similar scenarios. Because it is open source and easy to use, developers and enterprises have widely adopted it in speech processing projects.
More precisely, what FFmpeg 8.0 integrates is the open-source whisper.cpp library: