Reasoning models struggle to control their chains of thought, and that’s good

Reasoning models have difficulty controlling their chains of thought during problem solving. According to the original report, this limitation is good news for safety: a model that cannot readily shape or conceal its reasoning leaves a trace that monitoring systems can rely on. The finding supports chain-of-thought monitoring as an oversight safeguard for AI agents.

Published Thursday, March 5, 2026 at 11:00 AM

Story ID: #60

Original article excerpt

Server-side extracted preview paragraphs from the original source.

OpenAI introduces CoT-Control and finds reasoning models struggle to control their chains of thought, reinforcing monitorability as an AI safety safeguard.

As AI agents become capable of carrying out increasingly complex and autonomous tasks, maintaining reliable oversight of their behavior becomes more important. Consistent with our principle of iterative deployment, we study how systems behave in real-world settings and continuously refine safeguards as capabilities advance. To support this, our safety approach uses defense-in-depth, with multiple complementary layers of defense such as safety training, behavioral testing, agentic code review, and chain-of-thought (CoT) monitoring. CoT monitoring analyzes the reasoning steps agents generate while pursuing tasks. These reasoning traces can provide valuable signals during both training and deployment, helping monitoring systems identify when an agent’s behavior may be unsafe or inconsistent with the user’s intended goals.
