How Databricks and NVIDIA are powering the next generation of enterprise AI and agentic applications
NVIDIA accelerated computing powers some of the most demanding AI workloads on Databricks, from large-scale training, fine-tuning, and inference to industry-specific AI solutions. Today at Data + AI Summit, we're highlighting how NVIDIA AI infrastructure lies at the center of new announcements from Databricks AI Runtime, Model Serving, and Industry AI solutions, including a look at how the new NVIDIA Vera CPU will power the next generation of agentic infrastructure.
Here's how Databricks and NVIDIA are building an AI platform together, from GPUs for training and inference, to purpose-built CPUs for the agentic era.
Databricks AI Runtime (AIR) brings NVIDIA GPU acceleration directly to data and AI teams, so they can train and fine-tune models on governed enterprise data without managing separate GPU infrastructure. With AIR, customers get advanced NVIDIA hardware and networking right where their governed data lives on Databricks:
Databricks Model Serving powers production inference for thousands of Databricks customers. At the core of Model Serving, NVIDIA hardware and software deliver the low-latency, high-throughput inference our customers need at scale, across frontier models like Qwen and GPT-OSS as well as the custom neural networks our customers build. Additional serving capabilities include NVIDIA hardware and Triton Inference Server: Model Serving supports leading inference-optimized GPUs, with Triton's advanced dynamic batching and optimized performance coming soon. With Model Serving, customers can serve the models they train on NVIDIA hardware directly on managed Databricks infrastructure.
The rise of autonomous agents introduces a new infrastructure challenge. While GPUs excel at model inference, the agent harness, tool calls, CPU-based analytics, and the management of multi-step reasoning all run on CPUs. Today's CPUs are often the bottleneck: latency in tool calling, communication overhead between agent steps, and inconsistent performance under load all degrade the agentic experience.
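To make the bottleneck concrete, here is a minimal sketch of how CPU-side work can add up across an agent run. It assumes a simple model in which each step has a GPU inference time, a CPU tool-call time, and a CPU harness overhead. The `AgentStep` type, the `cpu_share` helper, and every timing value are hypothetical and chosen for illustration only; they are not measurements from Databricks or NVIDIA.

```python
from dataclasses import dataclass


@dataclass
class AgentStep:
    """Timings for one agent step, in milliseconds (illustrative values only)."""
    inference_ms: float  # model forward pass, typically on a GPU
    tool_ms: float       # tool call executed by the harness on a CPU
    overhead_ms: float   # harness work between steps (parsing, routing, serialization)


def cpu_share(steps: list[AgentStep]) -> float:
    """Return the fraction of end-to-end latency spent on CPU-side work."""
    cpu_ms = sum(s.tool_ms + s.overhead_ms for s in steps)
    total_ms = cpu_ms + sum(s.inference_ms for s in steps)
    return cpu_ms / total_ms if total_ms else 0.0


# Hypothetical three-step trace; the numbers are made up for illustration.
trace = [
    AgentStep(inference_ms=200.0, tool_ms=150.0, overhead_ms=50.0),
    AgentStep(inference_ms=200.0, tool_ms=250.0, overhead_ms=50.0),
    AgentStep(inference_ms=200.0, tool_ms=50.0, overhead_ms=50.0),
]
print(f"CPU share of latency: {cpu_share(trace):.0%}")
```

In this made-up trace, half of the end-to-end latency comes from CPU-side work, so faster GPU inference alone would leave that half untouched.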
NVIDIA Vera is a next-generation CPU designed specifically for this workload. Engineered for three core use cases (agentic workloads, reinforcement learning, and CPU-based data analytics), Vera delivers:
