LightOnOCR-2-1B: 1B Model at <$0.01/1k Pages

1B OCR model beats 9B rivals at <$0.01 per 1k pages, outputs clean Markdown

Jan 23, 2026



The OCR (Optical Character Recognition) field has recently gained a small but remarkably capable new model.

For a long time, getting top-tier OCR performance (particularly on complex mathematical formulas, multi-column layouts, and tables) has often meant using massive multimodal models. The results were good, but inference cost and latency rose significantly as well.

LightOnAI’s newly released end-to-end OCR model, LightOnOCR-2-1B, challenges this trade-off.

It is presented as the flagship and best-performing model of the LightOnOCR series.

Despite having only 1B parameters, it reportedly outperforms models 9 times its size on a range of benchmarks. Processing costs less than $0.01 per thousand pages, and inference is fast.
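To put the quoted price in perspective, here is a minimal arithmetic sketch. It assumes the cost scales linearly with page count at the quoted ceiling of $0.01 per thousand pages; the function name `max_cost_usd` is made up for illustration, and real costs will depend on hardware and batch settings.

```python
def max_cost_usd(pages: int, usd_per_1k_pages: float = 0.01) -> float:
    """Upper bound on cost, assuming the quoted per-1k-page rate scales linearly."""
    return pages / 1000 * usd_per_1k_pages


# Print the cost ceiling for a few corpus sizes.
for pages in (1_000, 100_000, 1_000_000):
    print(f"{pages:>9,} pages: under ${max_cost_usd(pages):,.2f}")
```

Under that assumption, a one-million-page archive would come in at under $10.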

Its core logic is straightforward: give it a PDF or an image, and it directly outputs clean, well-formatted Markdown text.
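The input-to-Markdown flow can be sketched roughly as follows. This is not the model's actual API: `document_to_markdown`, `PageToMarkdown`, and `fake_ocr` are hypothetical names, the pages are assumed to be already rasterized into image bytes, and the OCR backend is passed in as a plain function so that any real model call could be swapped in.

```python
from typing import Callable, Iterable

# Hypothetical type: any function that turns one page image (raw bytes) into Markdown.
PageToMarkdown = Callable[[bytes], str]


def document_to_markdown(pages: Iterable[bytes], ocr_page: PageToMarkdown) -> str:
    """Run the OCR backend on each page and join the results into one Markdown document."""
    parts = [ocr_page(page).strip() for page in pages]
    # Skip pages that produced no text and separate the rest with a blank line.
    return "\n\n".join(part for part in parts if part) + "\n"


def fake_ocr(page: bytes) -> str:
    """Stand-in backend for demonstration only; a real one would call the OCR model."""
    return f"# Page\n\n{len(page)} bytes of image data"


markdown = document_to_markdown([b"abc", b"de"], fake_ocr)
print(markdown)
```

Keeping the backend behind a single function makes it easy to compare a 1B model against a larger one on the same documents without touching the rest of the pipeline.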

Continue reading this post for free, courtesy of Meng Li.
