Tag: Vision-Language Model
GLM-OCR is a 0.9B-parameter multimodal OCR model for complex document understanding. Learn how it achieves SOTA performance on OmniDocBench V1.5 (94.62) with structure-first outputs (Markdown, JSON, LaTeX), handling tables, formulas, and handwriting across 100+ languages. Its weights are openly released under the Apache-2.0 license, making it suitable for research, finance, legal, and developer workflows.
CurateClick Team
February 4, 2026