Response to NIST Executive Order on AI | AI BriefWire
OpenAI responded to NIST's request for information (RFI) issued under the Executive Order on AI, emphasizing safety and transparency in AI development. It supports collaborative efforts to create standards for trustworthy AI systems. The response reflects the growing focus on regulation and responsible AI governance.
Original article excerpt
The National Institute of Standards and Technology (NIST) request for information related to its assignments under sections 4.1, 4.5, and 11 of the Executive Order Concerning Artificial Intelligence
OpenAI was created as a nonprofit in 2015 to ensure that artificial general intelligence—in short, AI that’s at least as smart as a person—benefits all of humanity. We research, develop, and release cutting-edge AI technology as well as tools and best practices for the safety, alignment, and governance of AI. We welcome this opportunity to inform NIST’s ongoing and critical work on AI.
Here, we focus on three topics raised in the RFI: (1) evaluating and auditing AI capabilities, (2) conducting red teaming tests to enable deployment of safe, secure, and trustworthy systems, and (3) synthetic media and provenance.
We applaud NIST’s focus on “creating guidance and benchmarks for evaluating capabilities... through which AI could cause harm.” OpenAI has committed to a Preparedness Framework, a comprehensive approach to evaluate, track, and mitigate catastrophically dangerous risks from current and future AI models. The Preparedness Framework currently tracks four initial areas of risk: cybersecurity; chemical, biological, nuclear, and radiological threats (CBRN); persuasion; and model autonomy. The Framework also commits us to ongoing vigilance for “unknown unknown” risks that have not yet been identified.
As part of this work, OpenAI recently shared one large-scale evaluation for CBRN: assessing GPT‑4’s ability to meaningfully increase malicious actors’ access to dangerous information about biological threat creation, compared to the baseline of existing resources (i.e., the internet). In the largest-of-its-kind evaluation involving both biology experts and students, we found that GPT‑4 provides at most a mild uplift in biological threat creation information. While not a large enough uplift to be conclusive, we hope this finding serves as a starting point for continued research and community deliberation, which we hope will be driven by NIST and the new AI Safety Institute.
This work increased our confidence in several key principles for evaluating risks from AI systems:
Additional information is available in our blog post on the recent biorisk study: Building an early warning system for LLM-aided biological threat creation.
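To make the idea of "uplift compared to the baseline" concrete, here is a minimal sketch of how such a comparison could be computed. It is not the method used in the study: the function name `mean_uplift`, the 0-10 scoring scale, and all scores are hypothetical and made up for illustration. It assumes each participant receives a single numeric score and that uplift is measured as the difference in group means.

```python
from statistics import mean


def mean_uplift(control_scores, treatment_scores):
    """Return mean(treatment) - mean(control).

    control_scores: scores from participants using existing resources only.
    treatment_scores: scores from participants who also had model access.
    """
    if not control_scores or not treatment_scores:
        raise ValueError("both groups need at least one score")
    return mean(treatment_scores) - mean(control_scores)


# Hypothetical, made-up scores on an assumed 0-10 scale (illustration only).
control = [3.0, 4.0, 5.0, 4.0]
treatment = [4.0, 4.0, 6.0, 4.0]

uplift = mean_uplift(control, treatment)
print(f"mean uplift: {uplift:.2f}")
```

A raw difference in means says nothing by itself about whether the effect is conclusive; a real evaluation would also need a significance test and a sample large enough to support it.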
OpenAI defines red teaming as “a structured process for probing AI systems and products for the identification of harmful capabilities, outputs, or infrastructural threats.” There are various possible methods emerging under the umbrella term of red teaming, including internal red teaming (done by internal, dedicated teams at a lab or company), external red teaming (done by external stakeholders in collaboration with a lab or company), or automated red teaming (using AI models to generate automated attacks and classifying outputs). In the context of this document, we are primarily referring to external red teaming efforts which involve OpenAI working with external domain experts to assess the capabilities and risks of an AI model or system.
OpenAI’s approach to red teaming does not consider adversarial attacks or model outputs in isolation. Rather, it is a method for eliciting risks in a contextualized, holistic manner in collaboration with domain experts. In addition to malicious use and methods to circumvent safety mitigations, red teaming also considers other risks: benign or expected inputs leading to harmful or risky outputs, novel capabilities improvements that may alter the risk landscape, and how factors outside of the system itself may interact with model outputs to cause risks or harms. Assessments of these areas often benefit from having humans in the loop to generate potential examples, and to validate the resulting outputs in the context of a given red teamer’s expertise.