notesum.ai

Published on December 4

U-MATH: A University-Level Benchmark for Evaluating Mathematical Skills in LLMs

cs.CL
cs.AI

Release Date: December 4, 2024

Authors: Konstantin Chernyshev¹, Vitaliy Polshkov¹, Ekaterina Artemova¹, Alex Myasnikov², Vlad Stepanov², Alexei Miasnikov³, Sergei Tilga¹

Affiliations: ¹ Toloka AI; ² Gradarius; ³ Gradarius, Stevens Institute of Technology

arXiv: http://arxiv.org/pdf/2412.03205v1