OpenAI and Anthropic share findings from a joint safety evaluation
OpenAI and Anthropic share findings from a first-of-its-kind joint safety evaluation, testing each other’s models for misalignment, instruction following, hallucinations, jailbreaking, and more. The results highlight progress, challenges, and the value of cross-lab collaboration.
