Tag: Vision-Language Model
GLM-OCR is a 0.9B-parameter multimodal OCR model for complex document understanding. Learn how it achieves SOTA performance on OmniDocBench V1.5 (94.62) with structure-first outputs (Markdown, JSON, LaTeX), handling tables, formulas, and handwriting across 100+ languages. Its weights are openly released under the Apache-2.0 license, making it suitable for research, finance, legal, and developer workflows.
CurateClick Team
February 4, 2026