Building reliable LLM inference infrastructure for our enterprise customers requires innovations in load balancing, inference resilience, and performance optimization.
At Databricks, we’ve built a unique inference platform that serves every frontier model, from open-source models like Kimi and Qwen to proprietary models like OpenAI, Gemini, and Claude. We power inference for some of the largest agentic applications in the world, including Superhuman, Yipit Data, and Fox Sports. Today, we serve more than 120 trillion tokens per month.
