notesum.ai

Published on November 7

FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI

cs.AI

Release Date: November 7, 2024

Authors: Elliot Glazer¹, Ege Erdil¹, Tamay Besiroglu¹, Diego Chicharro², Evan Chen³, Alex Gunning⁴, Caroline Falkman Olsson¹, Jean-Stanislas Denain¹, Anson Ho¹, Emily de Oliveira Santos⁵, Olli Järviniemi, Matthew Barnett¹, Robert Sandler¹, Jaime Sevilla¹, Qiuyu Ren⁶, Elizabeth Pratt⁶, Lionel Levine⁷, Grant Barkley⁸, Natalie Stewart⁸, Bogdan Grechuk⁹, Tetiana Grechuk⁹, Shreepranav Varma Enugandla⁶

Affiliations: ¹Epoch AI; ²King's College London; ³MIT; ⁴University of Siegen; ⁵ICMC, USP; ⁶UC Berkeley; ⁷Cornell University; ⁸Harvard University; ⁹University of Leicester

arXiv: http://arxiv.org/abs/2411.04872v1

MSC Classification	Percentage	MSC Classification	Percentage
11 Number theory	17.8%	57 Manifolds and cell complexes	2.1%
05 Combinatorics	15.8%	13 Commutative algebra	2.1%
20 Group theory and generalizations	8.9%	54 General topology	1.4%
60 Probability theory and stochastic processes	5.1%	35 Partial differential equations	1.4%
15 Linear and multilinear algebra; matrix theory	4.8%	53 Differential geometry	1.4%
14 Algebraic geometry	4.8%	42 Harmonic analysis on Euclidean spaces	1.4%
33 Special functions	4.8%	41 Approximations and expansions	1.4%
55 Algebraic topology	3.1%	52 Convex and discrete geometry	1.4%
12 Field theory and polynomials	2.4%	82 Statistical mechanics, structure of matter	1.0%
30 Functions of a complex variable	2.4%	44 Integral transforms, operational calculus	1.0%
68 Computer science	2.4%	17 Nonassociative rings and algebras	1.0%
18 Category theory; homological algebra	2.4%	Other (< 3 problems each)	9.9%