Top Python Libraries

Real-Time Offline Translation for Live Streaming

Stream live translation across 200 languages with NLLW SimulMT, with a verification step of about 0.15s in early tests

Meng Li
Jan 19, 2026

In live streaming, real-time translation, and simultaneous interpretation, a traditional offline translation model has to wait until the entire sentence is finished before it starts translating, which adds noticeable latency.

Recently, I discovered an open-source project on GitHub called NoLanguageLeftWaiting (NLLW), which turns Meta’s NLLB offline translation model into a real-time simultaneous translation model.

It translates while listening, without waiting for complete sentences, and it addresses problems that traditional models run into on partial input, such as inconsistent punctuation insertion and messy prefix handling.
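
To make "translate while listening" concrete, here is a minimal sketch of one common streaming policy: re-translate the growing source prefix and only emit target tokens that two consecutive hypotheses agree on. This is a generic illustration, not NLLW's actual algorithm or API. The names `StreamingTranslator`, `common_prefix`, and `toy_translate` are hypothetical, and the toy translator stands in for a real model.

```python
from typing import Callable, List


def common_prefix(a: List[str], b: List[str]) -> List[str]:
    """Return the longest shared prefix of two token lists."""
    out: List[str] = []
    for x, y in zip(a, b):
        if x != y:
            break
        out.append(x)
    return out


class StreamingTranslator:
    """Hypothetical wrapper: emits only tokens that stayed stable across two updates."""

    def __init__(self, translate: Callable[[List[str]], List[str]]):
        self.translate = translate        # any function: source words -> target tokens
        self.source: List[str] = []       # source words received so far
        self.previous: List[str] = []     # hypothesis from the previous update
        self.committed: List[str] = []    # target tokens already shown to the viewer

    def push(self, word: str) -> List[str]:
        """Add one source word and return the newly committed target tokens."""
        self.source.append(word)
        hypothesis = self.translate(self.source)
        stable = common_prefix(self.previous, hypothesis)
        self.previous = hypothesis
        # Only extend the committed output; never retract what was already shown.
        # (A real system would also force the decoder to keep the committed prefix.)
        new_tokens = stable[len(self.committed):]
        self.committed.extend(new_tokens)
        return new_tokens


def toy_translate(words: List[str]) -> List[str]:
    """Stand-in for a real model: 'translates' by upper-casing each word."""
    return [w.upper() for w in words]


if __name__ == "__main__":
    translator = StreamingTranslator(toy_translate)
    for w in ["hello", "world", "again"]:
        print(w, "->", translator.push(w))
```

The output lags the input by one update, which is the price of only committing tokens the model has confirmed twice. A real simultaneous system tunes this trade-off between latency and stability.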

It supports translation between any of 200 languages, offers two backend options (Hugging Face and CTranslate2), and ships with two built-in model sizes, 600M and 1.3B parameters.
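
For comparison, this is roughly what plain offline NLLB translation looks like with the Hugging Face `transformers` library, which is the baseline that waits for the full sentence. It does not use NLLW itself. The checkpoint names are the public NLLB-200 distilled models on the Hugging Face Hub, and I am assuming (not confirming) that they correspond to the 600M and 1.3B sizes mentioned above. The helper name `translate_offline` is my own.

```python
# Public NLLB-200 checkpoints on the Hugging Face Hub.
# Assumption: these match the "600M" and "1.3B" sizes mentioned in the post.
NLLB_CHECKPOINTS = {
    "600M": "facebook/nllb-200-distilled-600M",
    "1.3B": "facebook/nllb-200-distilled-1.3B",
}


def translate_offline(text: str,
                      src_lang: str = "eng_Latn",
                      tgt_lang: str = "fra_Latn",
                      size: str = "600M") -> str:
    """Translate one complete sentence with NLLB (offline, whole-sentence mode)."""
    # Imported lazily so this module loads even without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    model_name = NLLB_CHECKPOINTS[size]
    tokenizer = AutoTokenizer.from_pretrained(model_name, src_lang=src_lang)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        # NLLB selects the target language via the first generated token.
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(tgt_lang),
        max_new_tokens=64,
    )
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
```

Calling `translate_offline("The stream starts in five minutes.")` downloads the model on first use and returns a French translation. Language codes follow the FLORES-200 style used by NLLB, such as `eng_Latn` and `fra_Latn`.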

The project is also developing speculative decoding, which uses a partial verification mechanism to speed up translation further. Early tests show the verification step taking only about 0.15 seconds.
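
The general idea behind speculative decoding can be sketched in a few lines: a cheap draft proposes several tokens, and the main model verifies them, keeping the longest prefix it agrees with. The code below is a greedy toy version with hypothetical names (`speculative_step`, `draft_next`, `target_next`). I don't know the details of the project's partial verification mechanism, so treat this as the textbook technique rather than NLLW's implementation.

```python
from typing import Callable, List, Sequence

# A "model" here is just a function: context tokens -> next token.
NextToken = Callable[[Sequence[str]], str]


def speculative_step(prefix: Sequence[str],
                     draft_next: NextToken,
                     target_next: NextToken,
                     k: int = 4) -> List[str]:
    """Draft k tokens cheaply, then keep the prefix the target model agrees with."""
    # 1) The draft model proposes k tokens autoregressively.
    draft: List[str] = []
    for _ in range(k):
        draft.append(draft_next(list(prefix) + draft))

    # 2) The target model verifies the proposals in order.
    #    (Real systems do this in one batched forward pass; this loop is for clarity.)
    accepted: List[str] = []
    for token in draft:
        expected = target_next(list(prefix) + accepted)
        if token != expected:
            # First mismatch: take the target's token and discard the rest of the draft.
            accepted.append(expected)
            return accepted
        accepted.append(token)
    return accepted


if __name__ == "__main__":
    TARGET = ["a", "b", "c", "d", "e"]   # what the main model would produce
    DRAFT = ["a", "b", "x", "d", "e"]    # the draft gets position 2 wrong

    out = speculative_step([], lambda ctx: DRAFT[len(ctx)],
                           lambda ctx: TARGET[len(ctx)], k=4)
    print(out)
```

The output always matches what the target model would have generated on its own. The speed-up comes from verifying several drafted tokens at once instead of generating them one by one.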

If you’re working on low-latency scenarios such as speech translation, live subtitles, or cross-language meetings, this project is definitely worth trying.
