Original article excerpt
Server-side extracted preview paragraphs from the original source.
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
The hf CLI has been primarily built for our users over the years. But it's now increasingly used by coding agents: Claude Code, Codex, Cursor and more. So we rebuilt it to make it work for both audiences at once. This blog post summarizes what we did, and how we benchmarked it. We found that on complex, multi-step tasks the no-CLI baseline (an agent hand-rolling curl or the Python SDK) uses up to 6× as many tokens as the hf CLI.
We started tracking agent usage of the Hub in April 2026. The hf CLI (and the huggingface_hub Python SDK it's built on) detects when a coding agent is driving it by reading the environment variables agents set: CLAUDECODE/CLAUDE_CODE for Claude Code, CODEX_SANDBOX for Codex, plus Cursor, Gemini, Pi, and the universal AI_AGENT. That single signal does two jobs: it shapes the CLI's output (more on that below) and it tags each Hub request with an agent/<name> user-agent, so we can attribute traffic to the agent driving it. The two largest by distinct users are Claude Code and Codex, well ahead of everything else, and they're the two agents we benchmark later in this article.
The bars count distinct users per agent; request volume is the sub-label. Claude Code alone is ~40k users and nearly 49M requests, with Codex close behind. These are early numbers (we only began attributing agent traffic in April 2026), but the scale is already significant, and we expect it to keep growing as coding agents become a standard way to work with the Hub.
Humans and coding agents expect different outputs for the same hf commands. A human wants rich terminal output: ANSI color, padded tables truncated to fit the screen, a green ✅ on success, ✔ for booleans, progress bars, prose hints. An agent wants the inverse: no ANSI, nothing truncated, every value in full since an agent can handle far denser output than a human, kept compact and structured to stay light on tokens. It also can't answer a CLI prompt and will happily re-run a command after a timeout. The rest of this section is how hf gives each side what it needs. We introduced agent-mode output in hf v1.9.0 and have been migrating the rest of the CLI to it gradually in the following releases.
When hf auto-detects agent use (via the environment variables mentioned above), it renders the same command differently. It optimizes output format for humans or agents without passing a flag:
A human gets an aligned table, truncated to fit the terminal, plus a hint on how to see more, with color cues for status (a green ✓ on success, red on error). An agent gets the complete record as TSV: full repo ids, full ISO timestamps, every tag, no ANSI codes, nothing truncated, clean to parse and light on tokens.