NVIDIA Launches Nemotron 3 Nano Omni Model, Unifying Vision, Audio and Language for up to 9x More Efficient AI Agents

NVIDIA has launched Nemotron 3 Nano Omni, an open AI model that unifies vision, audio, and language processing. According to NVIDIA, the multimodal model makes AI agents up to 9 times more efficient than systems that chain separate models. Because it handles video, audio, image, and text in one system, agents can respond faster without losing context when passing data between models.

Archive, Launch

Signal trust: Single source, Early signal

Market reaction: NVDA up 1.72% by next close (before $209.87, after $213.49)

Published: Tuesday, April 28, 2026 at 6:00 PM

Freshness: Archive

Story ID: #1613


Original article excerpt

Server-side extracted preview paragraphs from the original source.

Best-in-class open omni-modal reasoning model delivers the highest efficiency and accuracy to power agentic workflows such as computer use, document intelligence and audio-video reasoning.

AI agent systems today juggle separate models for vision, speech and language — losing time and context as they pass data from one model to the other.

Unveiled today, NVIDIA Nemotron 3 Nano Omni is an open multimodal model that brings these capabilities together into one system, enabling agents to deliver faster, smarter responses with advanced reasoning across video, audio, image and text. This best-in-class model gives enterprises and developers a production path for more efficient and accurate multimodal AI agents with full deployment flexibility and control.
