ByteDance's Dolphin: 1.1K Stars for Doc-Parsing AI!

Dolphin by ByteDance: open-source AI for fast, accurate document parsing. A two-stage vision-language model (VLM) with Hugging Face support.

Jun 01, 2025


Dolphin (Document Image Parsing via Heterogeneous Anchor Prompting) is an open-source document image parsing model from ByteDance. It aims for efficient, accurate parsing through a two-stage analyze-then-parse paradigm.

Core Mechanism

Dolphin's core is its two-stage approach. The first stage runs page-level layout analysis and produces a sequence of elements in natural reading order. The second stage uses those elements as heterogeneous anchors, pairs each one with a task-specific prompt, and parses them in parallel. Because each element is parsed independently once the layout is known, the second stage does not have to decode the whole page as one long sequence, which is where the efficiency gain comes from. Matching each element type to its own prompt is also intended to improve accuracy.
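The control flow of the two stages can be sketched in a few lines of Python. This is only an illustration of the analyze-then-parse pattern, not Dolphin's actual code or API: `analyze_layout`, `parse_element`, `parse_page`, the `Element` class, and the prompt strings are all hypothetical names invented here, and both stages are stubbed out where a real model would run.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

# Hypothetical prompt table: one task-specific prompt per element type.
PROMPTS = {
    "text": "Read the text in this region.",
    "table": "Parse the table in this region.",
    "formula": "Transcribe the formula in this region.",
}


@dataclass
class Element:
    kind: str    # element type, e.g. "text" or "table"
    bbox: tuple  # (x0, y0, x1, y1) anchor region on the page


def analyze_layout(page):
    """Stage 1 stand-in: return page elements in reading order."""
    # A real model would predict these; here they are hard-coded.
    return [
        Element("text", (0, 0, 100, 20)),
        Element("table", (0, 25, 100, 60)),
        Element("formula", (0, 65, 100, 80)),
    ]


def parse_element(page, element):
    """Stage 2 stand-in: pair the anchor region with its prompt."""
    prompt = PROMPTS[element.kind]
    # A real model would decode content from the cropped region and prompt.
    return f"[{element.kind} @ {element.bbox}] <- {prompt}"


def parse_page(page, max_workers=4):
    elements = analyze_layout(page)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so reading order survives parallelism.
        return list(pool.map(lambda e: parse_element(page, e), elements))


results = parse_page(page="page-1.png")
for line in results:
    print(line)
```

The thread pool here only stands in for parallel execution; a model-backed version would more likely batch the element crops through a single forward pass. The point of the sketch is that stage two needs nothing from stage one except the ordered element list.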

Top Python Libraries
