Original article excerpt
Server-side extracted preview paragraphs from the original source.
We've developed and applied a new method leveraging Rule-Based Rewards (RBRs) that aligns models to behave safely without extensive human data collection.
We've developed and applied a new method leveraging Rule-Based Rewards (RBRs) that aligns models to behave safely without extensive human data collection.
Our research shows that Rule-Based Rewards (RBRs) significantly enhance the safety of our AI systems, making them safer and more reliable for people and developers to use every day. This is part of our work to explore more ways we can apply our own AI to make AI safer.