Autor Technologies developed and deployed vox-bench, a TypeScript CLI tool that benchmarks and monitors latency at each stage of the voice AI pipeline behind its healthcare-related live phone calls (dental appointment booking, patient intake, after-hours triage). The tool measures per-stage latency percentiles (p50, p95, p99), simulates full conversational round-trips, compares multiple providers, detects regressions, and models realistic conversation patterns. Percentile tracking let the team identify latency spikes, such as TTS provider delays, that average-latency metrics had missed; catching them improved the caller experience, because delayed AI responses throw a live conversation into chaos. The team runs vox-bench in production on a schedule with alerts, and reports a mature, streaming-enabled pipeline whose latency stays mostly below the thresholds at which delay affects the caller experience.
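To make the point about averages concrete, here is a minimal sketch of per-stage percentile reporting. It is not vox-bench's actual code: the function names (`percentile`, `summarize`), the nearest-rank method, and the sample latencies are all assumptions chosen for illustration.

```typescript
// Hypothetical helpers for illustration; not taken from vox-bench.

interface StageSummary {
  mean: number;
  p50: number;
  p95: number;
  p99: number;
}

// Nearest-rank percentile: the smallest sample such that at least
// p percent of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) {
    throw new Error("percentile() needs at least one sample");
  }
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

function summarize(samples: number[]): StageSummary {
  const total = samples.reduce((sum, x) => sum + x, 0);
  return {
    mean: total / samples.length,
    p50: percentile(samples, 50),
    p95: percentile(samples, 95),
    p99: percentile(samples, 99),
  };
}

// Made-up TTS latencies in milliseconds: 90 fast responses, 10 slow spikes.
const ttsLatenciesMs: number[] = [
  ...new Array<number>(90).fill(200),
  ...new Array<number>(10).fill(2000),
];

const ttsSummary = summarize(ttsLatenciesMs);
console.log(ttsSummary);
```

With these invented samples the mean is 380 ms and the median is 200 ms, both of which look healthy, while p95 and p99 are 2000 ms. One call in ten hits a two-second pause that only the tail percentiles reveal.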
