PaperBench: Evaluating AI’s Ability to Replicate AI Research
We introduce PaperBench, a benchmark evaluating the ability of AI agents to replicate state-of-the-art AI research.