OpenAI and Anthropic share findings from a joint safety evaluation

August 27, 2025 Steve

OpenAI and Anthropic share findings from a first-of-its-kind joint safety evaluation, testing each other’s models for misalignment, instruction following, hallucinations, jailbreaking, and more. The results highlight progress, challenges, and the value of cross-lab collaboration.
