Original article excerpt
Server-side extracted preview paragraphs from the original source.
A Blog post by IBM Granite on Hugging Face
The model ships as a LoRA adapter on top of Granite 4.0 Micro, our dense language model, keeping vision and language modular for text-only fallbacks and seamless integration into mixed pipelines. It continues to support vision-language tasks such as producing detailed natural-language descriptions from images (e.g., “Describe this image in detail”). The model can be used standalone or in tandem with Docling to enhance document processing pipelines with deep visual understanding capabilities.