Original article excerpt
Server-side extracted preview paragraphs from the original source.
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
📉 The gap is real and it is large: across all submitted models, far-field WER at low SNR is consistently several times higher than near-field WER on the same speech content
🔬 Methodology you can trust: hybrid wave-based simulation, sim-to-real validation, moving-source splits in beta, held-out audio, and standardized evaluation hardware across all submissions
⚡ Accuracy and speed together: the Pareto front plots average WER against RTFx so you can evaluate the tradeoff that is right for your deployment
👀 More is coming: multi-talker scenarios, microphone array support, and echo cancellation are on the roadmap
The gap between benchmark performance and real-world deployment is one of the more persistent frustrations in ASR development. Models that score well on standard evaluations often behave differently once real room acoustics are involved: reverberation, background noise, microphone distance. The complex interactions between these factors affect performance in ways that clean-speech benchmarks do not capture. The FFASR Leaderboard is our attempt to quantify that gap.
Treble Technologies and Hugging Face are launching the Far-Field ASR (FFASR) Leaderboard, the first open, community-driven benchmark designed to evaluate ASR models under realistic far-field acoustic conditions. It is live now, and we are inviting the community to submit models, explore the results, and help shape what comes next.