
Published on November 7

FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI

cs.AI

Release Date: November 7, 2024

Authors: Elliot Glazer¹, Ege Erdil¹, Tamay Besiroglu¹, Diego Chicharro², Evan Chen³, Alex Gunning⁴, Caroline Falkman Olsson¹, Jean-Stanislas Denain¹, Anson Ho¹, Emily de Oliveira Santos⁵, Olli Järviniemi, Matthew Barnett¹, Robert Sandler¹, Jaime Sevilla¹, Qiuyu Ren⁶, Elizabeth Pratt⁶, Lionel Levine⁷, Grant Barkley⁸, Natalie Stewart⁸, Bogdan Grechuk⁹, Tetiana Grechuk⁹, Shreepranav Varma Enugandla⁶

Affiliations: ¹Epoch AI; ²King's College London; ³MIT; ⁴University of Siegen; ⁵ICMC, USP; ⁶UC Berkeley; ⁷Cornell University; ⁸Harvard University; ⁹University of Leicester

arXiv: http://arxiv.org/abs/2411.04872v1