AI Research
arXiv 論文トレンド
cs.AI / cs.LG / 宇宙 / エネルギーなど 8 分野の直近投稿
カテゴリ分布
- AI (9)
- Computer Vision (8)
- Quantum Physics (5)
- cs.CL (4)
- Machine Learning (4)
- Robotics (3)
- Materials (1)
- Energy Systems (1)
- stat.ME (1)
- cs.SD (1)
- physics.optics (1)
- stat.ML (1)
- math.OC (1)
論文一覧
- cs.CL Learning to Reason by Analogy via Retrieval-Augmented Reinforcement Fine-Tuning
Retrieval-augmented generation (RAG) has become a standard mechanism for grounding language models in external knowledge, yet conventional retrieval based on lexical or semantic similarity is poorly s
- Computer Vision InterleaveThinker: Reinforcing Agentic Interleaved Generation
Recent image generators have demonstrated impressive photorealism and instruction-following capabilities in single-image generation and editing. However, constrained by their architectures, they canno
- Robotics Mana: Dexterous Manipulation of Articulated Tools
Articulated tool manipulation remains a major challenge in dexterous robotics due to the need to coordinate internal degrees of freedom and contact-rich interactions. While prior work has largely focu
- Robotics Improving Robotic Generalist Policies via Flow Reversal Steering
Generalist policies can learn a wide range of skills from diverse robot datasets. In order to solve or improve on challenging news tasks, we need a way to infer and invoke the appropriate actions from
- Computer Vision Modality Forcing for Scalable Spatial Generation
Text-to-image (T2I) models contain rich spatial priors. Synthesizing photorealistic, cluttered scenes requires an understanding of geometry, including perspective and relative scale. Prior works adapt
- Computer Vision RepWAM: World Action Modeling with Representation Visual-Action Tokenizers
This work presents RepWAM, a representation-centric world action model (WAM) built on representation visual-action tokenizers. Existing WAMs typically inherit reconstruction-oriented video tokenizers
- Computer Vision SpatialClaw: Rethinking Action Interface for Agentic Spatial Reasoning
Spatial reasoning, the ability to determine where objects are, how they relate, and how they move in 3D, remains a fundamental challenge for vision-language models (VLMs). Tool-augmented agents attemp
- Robotics $\texttt{WEAVER}$, Better, Faster, Longer: An Effective World Model for Robotic Manipulation
The potential impacts of world models (WMs, i.e., learned simulators) on robotics are far-reaching -- policy evaluation, policy improvement, and test-time planning -- all with limited real-world inter
- Machine Learning Understanding Truncated Positional Encodings for Graph Neural Networks
Positional encodings (PEs) enhance the power of graph neural networks (GNNs), both theoretically and empirically. Two of the most popular families of PEs - spectral (e.g., Laplacian eigenspaces, effec
- AI Automated reproducibility assessments in the social and behavioral sciences using large language models
Reproducibility in the social and behavioral sciences is typically evaluated by independent researchers who reanalyze the original data to assess whether the published findings can be recovered. Howev
- AI Agents-K1: Towards Agent-native Knowledge Orchestration
Current LLM-based research agents have advanced through agent orchestration, yet largely overlook scientific knowledge orchestration. Existing works often reduce papers to abstracts, surface mentions,
- Quantum Physics Semi-Device-Independent Certification for Nonlocality without Entanglement
In this work, we investigate maximum-confidence discrimination, which encompasses minimum-error and unambiguous discrimination, for ensembles of separable states by considering global and separable me
- Materials Morphology control and low-temperature magnetotransport in chiral 2D perovskite R-(MBA)$_2$PbI$_4$
Two-dimensional chiral hybrid perovskites, such as R/S-(MBA)\textsubscript{2}PbI\textsubscript{4}, are leading candidates for realizing and studying chirality-dependent charge and spin transport. Howe
- AI EurekAgent: Agent Environment Engineering is All You Need For Autonomous Scientific Discovery
LLM-based agents have shown increasing potential in automating scientific discovery. Given an optimizable metric and an execution environment, they can propose, validate, and iterate scientific soluti
- AI Before You Think: System 0, AI-Mediated Cognition and Cognitive Colonization
This paper examines three recent frameworks for understanding the cognitive and epistemic consequences of artificial intelligence: Tri-System Theory, Thinkframes, and System 0. It argues that while th
- Machine Learning Dense Supervision, Sparse Updates: On the Sparsity and Geometry of On-Policy Distillation
On-policy distillation (\textsc{OPD}) has recently become a prominent post-training recipe as it combines two desirable ingredients: on-policy student trajectories and dense teacher supervision, yet h
- Computer Vision Flex4DHuman: Flexible Multi-view Video Diffusion for 4D Human Reconstruction
We present Flex4DHuman, a multi-view video diffusion model that transforms a monocular or sparse multi-view video of a dynamic subject into synchronized dense multi-view videos using only relative cam
- Computer Vision World Tracing: Generative Pixel-Aligned Geometry Beyond the Visible
Image-to-3D methods often trade off faithfulness and completeness: depth estimators are anchored to input pixels but stop at the visible surface, while image-to-3D models generate complete shapes that
- Quantum Physics To Cool, or Not to Cool? Displacement Sensing with Hot Quantum States
Quantum-enhanced displacement sensing with bosonic systems is typically formulated assuming that the oscillator is cooled close to its ground state before nonclassical probe preparation. We investigat
- cs.CL Operadic consistency: a label-free signal for compositional reasoning failures in LLMs
Detecting LLM reasoning failures at inference time without ground-truth labels has motivated a wide range of confidence baselines, including self-consistency, semantic entropy, and P(True), built on w
- cs.CL SkMTEB: Slovak Massive Text Embedding Benchmark and Model Adaptation
We introduce SkMTEB, the first comprehensive MTEB-style text embedding benchmark for Slovak, a low-resource West Slavic language, comprising 31 datasets across 7 task types -- nearly 4$\times$ the dep
- Computer Vision Surflo: Consistent 3D Surface Flow Model with Global State
Geometry is invariant to viewpoint, which makes any collection of images a redundant encoding of a single 3D state. Existing feed-forward reconstruction models fail to exploit this: per-view methods e
- Quantum Physics Generalized two-qubit Hamiltonian for Projective Quantum Feature Maps
Projected quantum feature maps provide a strategy for using quantum processors as feature generators for classical machine-learning models. Building on counterdiabatic Ising-glass and one-dimensional
- Quantum Physics Optimal classical shadow estimation of unitary channels at Heisenberg limit
Full tomography of an unknown quantum evolution is resource-intensive and often unnecessary when the goal is only to predict selected properties. This motivates the study of classical shadow estimatio
- Machine Learning The Stable Recovery Manifold: Geometric Principles Governing Recoverability in Continual Learning
Catastrophic forgetting is often viewed as the destruction of previously learned knowledge during sequential learning. Building on the Accessibility Collapse framework, we investigate the geometric st
- Energy Systems Aerial Wildfire Suppression Planning with a Hybrid CNN-Cellular Automata Fire Model
Aerial wildfire suppression requires not only predicting fire spread, but also designing effective intervention strategies under operational and environmental uncertainty. We present a modeling and op
- stat.ME Valid Inference with Synthetic Data via Task Exchangeability
There is a proliferation of work arguing for the use of synthetic data in scientific research. For example, social scientists are arguing for the use of LLM-generated "silicon samples" in pilot studie
- cs.SD Generative Modeling of Bach-Style Symbolic Music: A Comparative Study of Autoregressive, Latent-Variable, and Adversarial Approaches
We study generative modeling of Bach-style symbolic piano music using a shared MIDI corpus and three model families: autoregressive LSTMs with attention, latent-variable models including recurrent VAE
- Computer Vision Revisiting Vehicle Color Recognition in Long-Tailed Surveillance Scenarios
Vehicle color recognition is an important cue for vehicle identification in surveillance systems, especially when license plates are illegible due to low resolution, occlusion, motion blur, or poor il
- physics.optics Electric-Field Mapping of Optically Perturbed CdTe Radiation Detectors
In radiation detectors, the spatial distribution of the electric field plays a fundamental role in their operation. Access to this field distribution is of strategic importance, especially when invest
- AI Beyond Runtime Enforcement: Shield Synthesis as Defensibility Analysis for Adversarial Networks
Shielded reinforcement learning is typically presented as a runtime safety mechanism that compiles temporal-logic specifications into automata restricting an agent's actions. We argue this is the wron
- stat.ML Majority-of-Three is Optimal
We give a short proof that the majority vote of three independent consistent classifiers is an optimal learner in the realizable PAC setting. This proves optimality for the simplest voting scheme, whi
- cs.CL One Polluted Page Is Enough: Evaluating Web Content Pollution in Generative Recommenders
Search-augmented LLMs increasingly mediate everyday consumer recommendations by retrieving live web content. This creates a new risk: generative recommenders may consume polluted web content, such as
- AI AgentBeats: Agentifying Agent Assessment for Openness, Standardization, and Reproducibility
Agent systems are advancing quickly across domains, but their evaluation remains fragmented. Most benchmarks rely on fixed, LLM-centric harnesses that require heavy integration, create test-production
- AI Reasoning as Pattern Matching: Shared Mechanisms in Human and LLM Everyday Reasoning
When large language models (LLMs) fail to generalize or make haphazard errors in reasoning, it is often taken as evidence that LLMs are not truly reasoning, but rather performing a kind of pattern mat
- Quantum Physics Diffusive Dynamics of Nonstabilizerness
Symmetries shape the quantum-information dynamics of many-body systems, but their effect on nonstabilizerness, the resource complementary to entanglement, is less understood. We compute the stabilizer
- math.OC Distribution-Agnostic Robust Trajectory Optimization via Chance-Constrained Reinforcement Learning
This paper presents a distribution-agnostic robust trajectory-optimization framework based on chance-constrained reinforcement learning. The uncertainty is represented here through initial conditions
- AI Multi-Agent Reinforcement Learning from Delayed Marketplace Feedback for Objective-Weight Adaptation in Three-Sided Dispatch
Dispatch in three-sided marketplaces provides a natural setting for reinforcement learning from world feedback: decisions are evaluated by delayed operational outcomes such as delivery speed, courier
- Machine Learning Beyond the Commitment Boundary: Probing Epiphenomenal Chain-of-Thought in Large Reasoning Models
Chain-of-thought (CoT) reasoning is the dominant paradigm for inference-time scaling in language models, yet the causal influence of individual steps on the final answer poorly understood. We estimate
- AI EpiBench: Verifiable Evaluation of AI Agents on Epigenomics Analysis
We introduce EpiBench, a verifiable benchmark for short-horizon epigenomics analysis. EpiBench evaluates whether agents can make well-defined analysis decisions from realistic workflow states and retu