OpenAI o1 System Card

OpenAI released the o1 System Card, which details the capabilities and safety measures of its o1 and o1-mini models. The card documents system design, performance, and risk mitigation strategies, helping users understand the models' strengths and limitations and supporting responsible use.

Published: Thursday, December 5, 2024 at 11:00 AM

Story ID: #484

Original article excerpt

Server-side extracted preview paragraphs from the original source.

This report outlines the safety work carried out prior to releasing OpenAI o1 and o1-mini, including external red teaming and frontier risk evaluations according to our Preparedness Framework.

Only models with a post-mitigation score of "medium" or below can be deployed. Only models with a post-mitigation score of "high" or below can be developed further.
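The two thresholds above amount to a simple ordered comparison. The following is a minimal sketch of that rule, not OpenAI's implementation: the excerpt names only "medium" and "high", so the four-level scale (low, medium, high, critical) and the names `RiskLevel`, `can_deploy`, and `can_develop_further` are assumptions made for illustration.

```python
from enum import IntEnum


class RiskLevel(IntEnum):
    """Hypothetical ordered risk scale; only MEDIUM and HIGH appear in the excerpt."""
    LOW = 0
    MEDIUM = 1
    HIGH = 2
    CRITICAL = 3


def can_deploy(post_mitigation: RiskLevel) -> bool:
    # Deployment requires a post-mitigation score of "medium" or below.
    return post_mitigation <= RiskLevel.MEDIUM


def can_develop_further(post_mitigation: RiskLevel) -> bool:
    # Further development requires a post-mitigation score of "high" or below.
    return post_mitigation <= RiskLevel.HIGH


if __name__ == "__main__":
    for level in RiskLevel:
        print(level.name, can_deploy(level), can_develop_further(level))
```

Under these assumptions, a model scored "high" after mitigations could be developed further but not deployed.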

The o1 model series is trained with large-scale reinforcement learning to reason using chain-of-thought. These advanced reasoning capabilities provide new avenues for improving the safety and robustness of our models. In particular, our models can reason about our safety policies in context when responding to potentially unsafe prompts, through deliberative alignment. This leads to state-of-the-art performance on certain benchmarks for risks such as generating illicit advice, choosing stereotyped responses, and succumbing to known jailbreaks. Training models to incorporate a chain-of-thought before answering has the potential to unlock substantial benefits, while also increasing potential risks that stem from heightened intelligence. Our results underscore the need for building robust alignment methods, extensively stress-testing their efficacy, and maintaining meticulous risk management protocols. This report outlines the safety work carried out for the OpenAI o1 and OpenAI o1-mini models, including safety evaluations, external red teaming, and Preparedness Framework evaluations.
