DeepSeek V4 Flash Quantized Version Goes Viral
DeepSeek V4 Flash quantized GGUF model by Redis creator antirez, with ds4 inference engine for Mac & CUDA.
Recently, a quantized version of DeepSeek V4 Flash went viral on Hugging Face, and what’s even wilder is that the author is antirez (Salvatore Sanfilippo, the creator of Redis).
I checked the Hugging Face model repo, and the download count has already exceeded 260,000.
Long-time Redis users might do a double-take at this name: why did he suddenly start working on a large language model inference engine?
Background
Here’s what happened. At the same time, antirez open-sourced two tightly integrated projects:
A custom quantized GGUF version of DeepSeek V4 Flash, hosted at: huggingface.co/antirez/deepseek-v4-gguf
DwarfStar 4 (ds4 for short), an inference engine built specifically for DeepSeek V4 Flash, hosted at: github.com/antirez/ds4



