Term

AIME

Also known as: American Invitational Mathematics Examination, AIME

Overview

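AIME (American Invitational Mathematics Examination) is a difficult mathematics exam that also serves as a qualifying round for the USA Mathematical Olympiad (USAMO). In AI model evaluation, it is widely used as a benchmark for measuring a model's ability to solve math problems that require logical thinking and multi-step reasoning.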

Mentioned Articles

2 articles

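Technology
OpenAI announces GPT-5.5 Instant: ChatGPT's new default model with 52.5% fewer incorrect answers

OpenAI has updated ChatGPT's default model to GPT-5.5 Instant and announced that it cuts hallucinations by 52.5% on high-risk questions in medicine, law, and finance. The model also improves scores on the AIME math test and gives more concise answers, supporting wider use in business settings.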

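May 6, 2026 (6 min read)
Technology
Anthropic announces Claude Sonnet 4.5, "the world's best coding model": the impact of 30 hours of autonomous work and 82% on SWE-bench

AI company Anthropic announced its latest model, Claude Sonnet 4.5, on September 30. The company explicitly called it "the world's best coding model" and positioned it as the strongest model for building complex agents and operating computers […]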

September 30, 2025

External Mentions

10 items

arXiv The formation radius of HCN in O-rich asymptotic giant branch stars
L. Marinho, June 17, 2026
arXiv Nanoscale memristive devices: Threats and solutions
Amir M. Hajisadeghi, June 17, 2026
arXiv The Array Control and Data Acquisition software of the Cherenkov Telescope Array Observatory
I. Oya, June 17, 2026
arXiv PAHSPECS: Polycyclic aromatic hydrocarbon properties at cosmic noon with JWST/MIRI MRS
Cristina Maria Lofaro, June 16, 2026
arXiv Conformal Prediction Intervals with Tail-Specific Guarantees
Simone Cuonzo, June 16, 2026
arXiv Reconstruction for an inverse scattering problem with a Kerr type nonlinearity
Khaoula El Maddah, June 11, 2026
arXiv Connecting Polarization to Exoplanet Yield Calculations for HWO
Jaren N. Ashcraft, June 11, 2026
arXiv CAPOS: The bulge Cluster APOgee Survey XII. Abundances for 98 PIGS metal-poor Bulge field giants
Carolina Salgado, June 10, 2026
arXiv Magnetic fields at the dawn of structure formation I. The CARLA J1510+5958 proto-cluster
A. Pagliotta, June 10, 2026
arXiv Scaling-optimal purification of noisy qubit unitary channels
Ryotaro Niwa, June 10, 2026