Term

MATH

Also known as: MATH benchmark

Overview

Last updated: July 9, 2026

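MATH is a dataset of challenging mathematics problems at the middle- and high-school level, including problems from mathematics olympiads and the AMC (American Mathematics Competitions). It covers a wide range of areas, such as algebra, geometry, and number theory, and has been widely used as a standard measure of AI mathematical reasoning ability. However, it has been argued that benchmark problems leaking into training data (data contamination) may be producing scores higher than the models' actual ability.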
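One common heuristic for probing the kind of data contamination mentioned above is to measure how many word n-grams of a benchmark problem also appear verbatim in a training corpus. The following is a minimal sketch under stated assumptions, not the method of any particular study: the function names `word_ngrams` and `overlap_ratio`, the default window size of 8, and the toy strings are all hypothetical and chosen only for illustration.

```python
def word_ngrams(text, n):
    """Return the set of lowercase word n-grams in text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap_ratio(problem, corpus_docs, n=8):
    """Fraction of the problem's n-grams that also occur in any corpus document.

    Hypothetical helper: a high ratio suggests the problem text may have
    leaked into the corpus; a low ratio does not prove the absence of leakage.
    """
    problem_grams = word_ngrams(problem, n)
    if not problem_grams:
        # Problem is shorter than n words, so there is nothing to compare.
        return 0.0
    corpus_grams = set()
    for doc in corpus_docs:
        corpus_grams |= word_ngrams(doc, n)
    return len(problem_grams & corpus_grams) / len(problem_grams)


if __name__ == "__main__":
    # Toy data, invented for this example.
    problem = "find all real x such that x squared equals four"
    corpus = ["homework: find all real x such that x squared equals four, show work"]
    print(overlap_ratio(problem, corpus, n=3))
```

Exact n-gram matching only catches verbatim copies; paraphrased or translated problems would pass unnoticed, which is one reason contamination is hard to rule out.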

Mentioned Articles

1 item

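Technology
Is cutting-edge AI's actual math ability not that high? On the new FrontierMath benchmark the solve rate is under 2%, bringing the challenges on the road to AGI into sharp focus
As artificial intelligence (AI) advances at an accelerating pace, producing results in image generation and natural language processing that approach human ability, a new benchmark has appeared that starkly exposes its limits. The AI research organization Epoch AI has developed an advanced mathematics benchmark test, "Fron […]
November 12, 2024 · about 6 min read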

External Mentions

10 items

arXiv: A Probabilistic Sign Rule for Quotients of Positive Series and Integral Transforms
Zakaria Derbazi, July 2, 2026
arXiv: The structure of FAC posets and the Aharoni--Korman conjecture
Lawrence Hollom, July 2, 2026
arXiv: Stability of global self-similar solutions to the cubic wave equation and the wave maps equation
Akansha Sanwal, July 2, 2026
arXiv: Cut-off Jastrow Factors and Spectral Barron Regularity of Coulombic Electronic Wave Functions
Virginie Ehrlacher, July 2, 2026
arXiv: Almost Supermartingale Extensions of Olivier's Theorem
Patrick L. Combettes, July 2, 2026
arXiv: $\mathrm{W}^*$-algebraic Integration Theory
Jan Głowacki, June 25, 2026
arXiv: The Effect of Topological Defects and Magnetic Flux on Fully-Heavy Tetraquarks and Mass Spectra of Heavy Quarkonia Using the Analytical Exact Iteration Method
N. H. Gerish, June 25, 2026
arXiv: Hall Geometry and Auslander-Reiten Quiver
Aayush Verma, June 25, 2026
arXiv: Typical distances in high-genus triangulations
Tanguy Lions, June 25, 2026
arXiv: Error-Conditioned Neural Solvers
Haina Jiang, June 25, 2026
