Introducing Mellum2: A 12B Mixture-of-Experts Model by JetBrains

JetBrains has released Mellum2, an open 12-billion-parameter Mixture-of-Experts (MoE) model optimized for low-latency text-and-code workloads. Mellum2 is now available on Hugging Face for developers and researchers to use and experiment with.

Published: Monday, June 1, 2026 at 5:45 PM

Original article excerpt

A blog post by JetBrains on Hugging Face

Today we’re releasing Mellum2, an open Mixture-of-Experts model optimized for low-latency text-and-code workloads. Mellum originally started as a code completion model. With Mellum2, we extend that foundation to a broader set of natural language and software engineering tasks while keeping the model focused on efficient inference and deployability. Modern AI systems increasingly rely on multiple model calls: routing, retrieval, summarization, planning, validation, and tool use. Many of these operations are latency-sensitive and do not require the largest available model. Mellum2 targets these workloads.
