Company

METR

Overview

METR (formerly known as ARC Evals) is a non-profit organization dedicated to assessing whether frontier AI models possess "autonomous capabilities" that could pose a threat to society. It develops benchmarks that measure how well AI systems can perform complex, multi-step tasks in the real world.

Mentioned Articles

5 items

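Technology
Anthropic's large-scale study exposes where AI agents stand today: growing autonomy and a heavy concentration in software development

The evolution of generative AI has entered a new phase: a shift from simple conversational interfaces to "autonomous agents." On agent autonomy, which until now has mostly been discussed in terms of theory and benchmark scores, Anthropic, in its own […]

February 20, 2026, 10 min read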
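Technology
"Something big is happening," warns an AI company CEO: GPT-5.3 has built itself, and the intelligence-explosion loop has finally started turning

Do you remember the world of February 2020? News had begun to circulate that a strange virus was spreading in Wuhan, China, but most people were still enjoying meals at restaurants, planning business trips, and taking everyday life for granted. "Toilet paper […]

February 12, 2026, 8 min read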
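Technology
Deloitte survey also finds that returns from AI adoption fall short of expectations: what is the "valley of death" lying between 74% expectation and 20% actual results?

In 2026, corporate AI investment moved from a phase of "dreams" to one of "cold, hard reality." Deloitte's latest report, "State of AI in the Enterprise 2026," […] the world's business leaders […]

January 24, 2026, 9 min read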
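Technology
OpenAI's "Codex" enters "recursive development," self-evolving with GPT-5: the impact and true worth of the AI agent that built the Sora app in 18 days

AI development appears to have already reached a decisive turning point. According to what OpenAI employees told Ars Technica, the company's AI coding tool "Codex" is now "mostly built by Codex itself" […]

December 13, 2025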
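Technology
A shocking result: AI use slowed development by 19%. The "illusion of perceived speed" that traps experienced developers most

AI writes code and assists developers. A study has been published that pours cold water on this "productivity revolution" scenario, which everyone has believed in for the past several years. According to a rigorous study conducted by METR, a non-profit research organization specializing in evaluating AI capabilities, experienced […]

July 13, 2025, 14 min read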

External Mentions

10 items

arXiv An SKA-Low RM Grid for constraining the origin of cosmic magnetism
Shane P. O'Sullivan, June 23, 2026
arXiv Improving Radio Source Count Estimation Using Kernel Density Estimation
Luozhenhan Liu, June 23, 2026
arXiv Micron-Scale Technosignatures: How a Cubic Metre of Lunar Regolith May Begin to Constrain the Number of Past Technological Civilisations in the Galaxy
Lewis J. Pinault, June 23, 2026
arXiv FAST Pulsar Database IV. Spike subpulses and quasi-periodic subpulses of 25 pulsars observed by FAST
Tao Wang, June 22, 2026
arXiv Discovering Crystal Structure Prediction Algorithms with an AI Co-Scientist
Kiyoung Seong, June 22, 2026
arXiv Discovery of a 24-millisecond pulsar in a very long orbit with the Murchison Widefield Array
Chia Min Tan, June 17, 2026
arXiv Polarisation and Faraday rotation measure imaging at metre wavelengths with sub-arcsecond resolution: a foundational calibration strategy
R. J. van Weeren, June 16, 2026
arXiv Energy-Efficient Arm Reaching for a Humanoid Robot via Deep Reinforcement Learning with Identified Power Models
Nestor N. Deniz, June 14, 2026
arXiv Direct nanoscale observation of melting and solute redistribution in a hypoeutectic Al-Cu alloy with in situ STEM
Martin Hasenburger, June 10, 2026
arXiv Surface Crevasse Evolution Observed Using Matched Field Processing and Source Relocation at Hansbreen, Svalbard
Wojciech Gajek, June 9, 2026
