Top Python Libraries

DeepSeek V4 Flash Quantized Version Goes Viral

DeepSeek V4 Flash quantized GGUF model by Redis creator antirez, with ds4 inference engine for Mac & CUDA.

Meng Li
May 18, 2026

A quantized version of DeepSeek V4 Flash recently went viral on Hugging Face, and what’s even wilder is that its author is antirez (Salvatore Sanfilippo, the creator of Redis).

I checked the Hugging Face model repo, and the download count has already exceeded 260,000.

Long-time Redis users might do a double-take at this name: why did he suddenly start working on a large language model inference engine?

Background

Here’s what happened. antirez open-sourced two tightly integrated projects at the same time:

  • A custom quantized GGUF version of DeepSeek V4 Flash, hosted at: huggingface.co/antirez/deepseek-v4-gguf

  • DwarfStar 4 (ds4 for short), an inference engine built specifically for DeepSeek V4 Flash, hosted at: github.com/antirez/ds4
