Updated on 2026.06.11

Sim2Real

Publish Date Title & Abstract Authors Links
2026-06-09 A Practical Recipe Towards Improving Sim-and-Real Correlation for VLA Evaluation VLA Sim2Real
Simulation has become an essential tool for evaluating and improving vision-language-action (VLA) policies, offering scalable, reproducible, and controllable alternatives to costly real-world robot evaluation. Recent simulation benchmarks have made substantial progress on realism and diversity, yet these platforms have not been widely adopted as reliable proxies for real-world policy evaluation….
Yang Gao Team ArXiv
2026-06-09 Nonequilibrium Green Functions Simulations for Large Correlated Systems Sim2Real
Correlated real-time dynamics in large, spatially inhomogeneous quantum systems remain difficult to access with nonequilibrium many-body methods. Two-time nonequilibrium Green functions (NEGF) retain dynamical correlations but their computational runtime grows cubically with the number of time steps $N_\mathrm{t}$. This scaling bottleneck could recently be overcome by introducing the G1–G2…
Jan-Philip Joost Team ArXiv
2026-06-09 Updating the PATH framework with FRB host galaxy models Sim2Real
Over a hundred fast radio burst (FRB) host galaxies have now been identified, enabling both comparisons of host redshift with FRB dispersion measure to study the cosmological distribution of ionised gas, and analyses of host properties in order to identify FRB progenitors. The standard method for determining the most likely FRB host galaxy in an optical image is the Bayesian framework…
S. D. Ryder Team ArXiv
2026-06-09 A Comprehensive Inference-Time Augmentation Framework in Physiological Signals: Application to PPG-Based AF Detection Sim2Real
Objective: Accurate classification of physiological signals in real-world deployments is challenged by sensor noise, motion artifacts, and distribution shifts between training and deployment data. Inference-time augmentation (ITA), which applies augmentations during inference rather than retraining, offers a simple, model-agnostic mechanism to improve robustness. However, ITA application to…
Xiao Hu Team ArXiv
2026-06-08 ReCoVLA: VLM-Guided Reward Compilation for Failure Recovery in Vision-Language-Action Policies Dexterous Manipulation VLA Sim2Real
Vision-language-action (VLA) policies provide strong priors for language-conditioned manipulation, but remain brittle in off-nominal states requiring targeted recovery. We propose ReCoVLA – a failure-conditioned residual recovery framework that keeps a pretrained VLA policy frozen, uses an external vision-language model (VLM) to infer the failure mode and recovery stage, and compiles a…
Toshiaki Koike-Akino Team ArXiv
2026-06-08 Autonomous Obstacle Removal for Excavators through Policy Learning with Particle Simulation Sim2Real
Autonomous obstacle removal from the ground is an important earthwork task, but this is difficult to automate because an excavator must adapt its excavation trajectories over repeated cycles as soil-obstacle conditions change. Learning such state-dependent behavior requires a training environment that reproduces accumulated soil-obstacle interactions, including contact states, terrain…
Takamitsu Matsubara Team ArXiv
2026-06-08 Bridged SBI: Correcting Biased Low-Fidelity Posteriors for Cost-Efficient High-Fidelity Inference Sim2Real
Accurate calibration of particle-based simulators is crucial for robotic earthwork simulation, but analytical calibration is challenging due to this task’s highly nonlinear particle dynamics and the black-box nature of conventional simulators. Although simulation-based inference (SBI) can estimate posterior distributions over simulation parameters solely from forward simulations, applying SBI…
Takamitsu Matsubara Team ArXiv
2026-06-08 RealMath-Eval: Why SOTA Judges Struggle with Real Human Reasoning Sim2Real
While Large Language Models (LLMs) have achieved near-perfect performance in solving high-school mathematics, their ability to evaluate the diverse reasoning processes of real human students remains under-examined. To bridge this gap, we introduce RealMath-Eval, a rigorously annotated benchmark of 224 real-world exam responses from high schools. Our initial evaluation…
Xiangfeng Wang Team ArXiv / Web
2026-06-08 ABot-Earth 0.5: Generative 3D Earth Model Sim2Real
We present ABot-Earth 0.5, a generative 3D framework designed to synthesize vast, seamless 3D environments from ubiquitous, geospatially referenced satellite imagery. To achieve this, we propose a novel generative model formulated directly with the 3D Gaussian Splatting (3DGS) representation. The model is trained on a diverse corpus of existing real-world urban reconstructions, learning to…
Hang Zhang Team ArXiv / Web
2026-06-07 Video2Sim2Real: Full-Stack Autonomous Dexterous Skill Acquisition from a Single Human Video Sim2Real
Human manipulation videos are a convenient and intuitive source for robot learning. However, directly transferring human dexterity to robots remains challenging due to perception errors and the embodiment gap. To address this, we introduce Video2Sim2Real, a full-stack framework for autonomous skill acquisition from a single human manipulation video. Our framework first uses off-the-shelf foundation…
Harish Ravichandar Team ArXiv / Web
2026-06-07 IR-SIM: A Lightweight Skill-Native Simulator for Navigation, Learning, and Benchmarking Sim2Real
Simulation plays a key role in automated robotics research supported by large language models (LLMs). However, existing simulators often require custom code or complex interfaces, creating a barrier to rapid prototyping and automated algorithm development. To this end, we propose the Intelligent Robot Simulator (IR-SIM), a lightweight skill-native navigation simulator designed for rapid scenario…
Hengshuang Zhao Team ArXiv / Web
2026-06-07 PhysGraph: A Physics-aware 3D Scene Graph for Perception and Reasoning Sim2Real
To perform a wide range of daily tasks, robots need to construct a 3D representation that is semantically rich, physically grounded, and structured enough to support task planning and affordance prediction. However, existing approaches primarily focus on semantic retrieval, often overlooking physical and kinematic factors. Methods that attempt to model physical properties typically rely on narrow…
Xianyi Cheng Team ArXiv
2026-06-07 HARBOR: A Harness Framework for Agentic Robot Reinforcement Learning Sim2Real
Reinforcement learning (RL) has become a powerful paradigm for robot learning, particularly in sim-to-real settings, but its broader adoption remains limited by the engineering pipeline surrounding the algorithms. Building tasks, shaping rewards, and tuning hyperparameters require substantial expert effort, making RL workflows costly and difficult to scale. We introduce HARBOR, an agentic…
Georgia Chalvatzaki Team ArXiv
2026-06-07 OASIS: From Simulation Data Collection to Real-World Humanoid Loco-Manipulation Sim2Real LearnedControl
Recent progress in robot manipulation has been largely driven by learning from large-scale demonstrations. For humanoid robot loco-manipulation tasks, however, existing data sources force an unsatisfying tradeoff between trajectory quality and scalability. Real-world teleoperation provides the highest-quality trajectories but requires dedicated physical space and time-consuming scene resets….
Xuelong Li Team ArXiv / Web
2026-06-07 Autonomous Aerial Manipulation via Contextual Contrastive Meta Reinforcement Learning Sim2Real
Unmanned aerial vehicles (UAVs) are increasingly being deployed in logistics, service robotics, and other real-world applications, creating a growing demand for autonomous payload acquisition and delivery. Existing approaches typically assume pre-attached payloads or rely on specialized grippers, leaving versatile end-to-end aerial delivery largely unresolved, where different payloads induce…
Yang Yu Team ArXiv
2026-06-06 G2G: Exploiting Intra-Group Geometry for Inter-Group Pose Estimation Sim2Real
Recovering the relative 6-DoF pose between two image groups underlies cross-sequence relocalization and multi-camera rig odometry. Each group carries known intra-group geometry from visual odometry or rig calibration, and pretrained multi-view backbones already fuse such geometry into visual features. Yet current models treat all views as an unstructured set, leaving cross-group reasoning as the…
Yanmei Jiao Team ArXiv
2026-06-06 Closing the Sim-to-Real Gap: An Evaluation Framework for Autonomous Cyber Defense Configuration of Commercial EDR Sim2Real
Leading commercial endpoint detection and response (EDR) products have shifted from operator-configured rule sets to multi-component systems where autonomous AI components operate alongside, and increasingly in place of, operator-deployed policies. Autonomous defense agents using commercial EDR as their hardening tool are no longer tuning a passive tool, but a black-box autonomous system capable…
Lilianne Brush Team ArXiv
2026-06-05 Simulation-Driven Imitation Learning for Biosignals-Free Shared-Autonomy Prosthetic Grasping Dexterous Manipulation Sim2Real
Biosignals-free shared-autonomy control of upper-limb prosthetic hands aims to enable natural and low-effort manipulation without relying on EMG or other physiological signals. Recent imitation-learning-based approaches have shown promising results, but their scalability is limited by the cost and variability of collecting large amounts of real-world human demonstration data. In this work, we…
Xianta Jiang Team ArXiv
2026-06-05 QuadVerse: An Integrated Framework Aligning Visual-Physical Reality for Quadruped Simulation Manipulation Sim2Real
Simulation is central to robot learning, yet the sim-to-real gap remains a major bottleneck. Existing approaches often tackle visual or dynamic gaps separately, overlooking how these individual mismatches accumulate and propagate throughout the robot’s state evolution. In this paper, we introduce QuadVerse, an integrated framework that uses reconstructed scenes as a calibration substrate for…
Jin Xie Team ArXiv
2026-06-05 Supervision versus Demonstration-Based In-Context Learning for Multiword Expression Classification Sim2Real
Turkish idiomatic light verb constructions (LVCs) are challenging for multiword expression processing because they often share the same surface form as fully literal verb-object combinations while functioning as a single, partially idiomatic predicate. We frame Turkish LVC detection as a binary classification task (literal meaning vs. idiomatic meaning) and evaluate on a manually created…
Yusuf Şimşek Team ArXiv
2026-06-05 The Sim-to-Real Gap of Foundation Model Agents: A Unified MDP Perspective Sim2Real
Foundation model agents are increasingly deployed for real-world decision-making, but suffer from the sim-to-real gap. While robotics and classical control have mature frameworks to address this gap, the foundation model community is treating agent robustness as an entirely novel phenomenon. Our paper proposes formalizing the foundation model agent evaluation and training gap as a classical…
Hua Wei Team ArXiv
2026-06-05 Learning All-Terrain Locomotion for a Planetary Rover with Actively Articulated Suspension Sim2Real
This paper presents ERNEST, a four-wheeled planetary rover concept equipped with a two-degree-of-freedom Active Gimbal Suspension that combines yaw and roll actuation to enable wheel reconfiguration, steering, and active load redistribution. A single neural network controller, trained to track a desired path across challenging terrain, fully unlocks the capabilities of this actuated suspension…
Hari Nayar Team ArXiv
2026-06-05 C3VD-DEFCOL: A Deformable Colonoscopy Dataset with Time-Resolved 3D Ground Truth and Realistic Appearance Sim2Real
3D reconstruction could improve colonoscopy by estimating mucosal coverage and alerting clinicians to missed regions during screening. However, algorithm development is limited as no current datasets provide both a realistic in vivo appearance and dense, time-resolved 3D ground truth, especially under non-rigid deformation. We present C3VD-DEFCOL, a framework and dataset for evaluating deformable…
Nicholas J. Durr Team ArXiv
2026-06-05 Length-resolved Operator Growth and Path-Entropy Obstructions to Many-Body Localization Sim2Real
For the disordered Ising chain with transverse and longitudinal fields, where couplings and fields are drawn from strictly positive distributions, Cao~\cite{Cao} has shown that the moments $μ_{2k} = \|[H,σ^z_0]^{(k)}\|_2^2$ grow almost factorially, $μ_{2k}^{1/(2k)}\sim k/\ln k$, and thus asymptotically at the maximal allowed rate. We generalize this result by resolving the operator norm in…
J. Sirker ArXiv
2026-06-04 TAM: Torque Adaptation Module for Robust Motion Transfer in Manipulation Dexterous Sim2Real
A policy tuned for one robot often behaves differently on another, whether due to the sim-to-real gap, unknown payloads, or the differing dynamics of two instances of the same robot. In contact-rich, dynamic manipulation, even small motion discrepancies can result in failure to track reference motion, since they disrupt the timing and modes of contact. Common remedies, such as domain…
Dieter Fox Team ArXiv
2026-06-04 Towards Realistic 3D Sonar Simulation Sim2Real
As underwater robotics research increasingly addresses complex 3D perception and autonomous navigation, the fidelity of sonar simulation has become a key factor in algorithm development. Current simulation frameworks typically rely on geometry-driven rendering, approximating 3D sonar as an underwater equivalent to LiDAR, which fails to account for fundamental acoustic phenomena such as…
Enrico Simetti Team ArXiv
2026-06-04 LadderMan: Learning Humanoid Perceptive Ladder Climbing Sim2Real
Humanoid robots hold great promise for operating in human-centered environments, yet ladder climbing remains one of the most challenging tasks due to sparse footholds and handholds, complex whole-body coordination, and sensitivity to perception and control errors. We present LadderMan, a unified system that enables humanoid robots to robustly climb diverse ladders and perform…
Guanya Shi Team ArXiv
2026-06-03 GRAIL: Generating Humanoid Loco-Manipulation from 3D Assets and Video Priors Dexterous Sim2Real LearnedControl
Scaling humanoid loco-manipulation requires robot-compatible demonstrations across diverse objects, whole-body motions, and scene geometries, but teleoperation and motion capture are difficult to scale because each collection depends on physical setups, instrumented actors, and robot operation. We present GRAIL, a digital generation pipeline that remains fully virtual until deployment: it…
Ye Yuan Team ArXiv / Web
2026-06-03 M3imic: Learning a Versatile Whole-Body Controller for Multimodal Motion Mimicking Dexterous Sim2Real LearnedControl
Building a general-purpose whole-body controller is essential for enabling diverse motion capabilities in humanoid robots across a wide range of downstream tasks, including locomotion and loco-manipulation. Different tasks rely on distinct motion reference modalities: locomotion primarily depends on coordinated robot joint trajectories, whereas manipulation requires precise end-effector…
Shengbo Eben Li Team ArXiv
2026-06-03 Generalization of World Models under Environmental Variability for Vision-based Quadrotor Navigation Sim2Real
World models, learned generative models that predict how an environment evolves, have become a promising tool for sample-efficient robot learning. Yet how robust they are to environmental variability remains poorly understood. To address this, we conduct a systematic study using vision-based quadrotor navigation as a testbed problem, training DreamerV3-based world models under varying levels of…
Kostas Alexis Team ArXiv
2026-06-03 WAM-Nav: Asymmetric Latent World-Action Modeling for Unified Visual Navigation Sim2Real
Visual navigation requires generating smooth and collision-free trajectories under complex geometric and physical constraints. Existing reactive policies that directly map observations to actions lack anticipatory reasoning, limiting their ability to proactively avoid obstacles. While visual imagination offers predictive foresight, conventional modular approaches separate scene prediction from…
Nianfeng Liu Team ArXiv
2026-06-03 MoDex: A Diffusion Policy for Sequential Multi-Object Dexterous Grasping Sim2Real
This work addresses sequentially grasping multiple objects with a single dexterous hand without releasing those already held. Most dexterous grasping methods commit all of the hand’s degrees of freedom to a single object, underutilizing its dexterity and leaving no redundancy for subsequent grasps. The proposed solution, MoDex, is a diffusion policy that predicts the next gripper pose directly…
Danica Kragic Team ArXiv
2026-06-03 Learning Manifold and Itô Dynamics with Branched Neural Rough Differential Equations Sim2Real
Neural rough differential equations (NRDEs) stay accurate under irregular sampling while taking far fewer integration steps than standard neural differential equations, summarising a finely sampled driver by its log-signature and advancing the hidden state over coarse intervals using the log-ODE method. This efficiency rests on the shuffle algebra, the algebraic counterpart of Stratonovich…
Andi Han Team ArXiv
2026-06-03 OLIVE: Online Low-Rank Incremental Learning for Efficient Adaptive Exoskeletons Sim2Real
Wearable exoskeleton systems hold promise for restoring mobility in individuals with physical impairments, yet most existing controllers rely on static gait policies that lack the ability to adapt to dynamic real-world environments or individual user characteristics. We present OLIVE (Online Low-rank Incremental Learning for Efficient Adaptive…
Ying Nian Wu Team ArXiv
2026-06-02 GPU-Parallel Multi-Task Reinforcement Learning with Demonstration Guided Policy Optimization Manipulation Sim2Real
Large-scale GPU-parallel reinforcement learning has changed what can be trained in robot simulation, yet most systems still optimize one specialist policy per task. We propose a construction methodology for turning structured manipulation task families into GPU-parallel multi-task RL benchmarks, and instantiate it as MT-Libero using LIBERO assets and task predicates in Isaac Lab. The resulting…
Weihua Zhang Team ArXiv
2026-06-02 Self-Refining Agentic Reinforcement Learning for Vision-Conditioned UAV Navigation Sim2Real
Deep reinforcement learning has shown strong potential for enabling autonomous robots to learn complex navigational tasks. However, its practical use still depends heavily on human-designed reward functions and repeated manual fine-tuning, which is time-consuming and does not guarantee high success in the desired task. This paper presents AgenticRL, an agent-guided reinforcement learning framework…
Dzmitry Tsetserukou Team ArXiv
2026-06-02 Speech Emotion Recognition using Attention-based LSTM-Network with Residual Connection Sim2Real
Speech emotion recognition is an important component of modern human-computer interaction systems. However, many state-of-the-art approaches rely on large pretrained models with high computational and memory requirements, limiting their applicability. This paper proposes ResLSTM-SA, a lightweight architecture that integrates residual connections with soft attention within an LSTM-based framework….
Maxim Vashkevich Team ArXiv
2026-06-02 SplitAdapter: Load-Aware Humanoid Loco-Manipulation via Factorized Adaptation Sim2Real LearnedControl
Humanoid loco-manipulation requires stable whole-body control under varying object masses and pickup/placement heights. This becomes particularly challenging in sim-to-real transfer, where object-induced load variation and robot-side dynamics mismatch interact during physical contact. Existing history-based adapters often compress these factors into a single latent representation, which can…
Donghan Koo Team ArXiv
2026-06-02 AirDreamer: Generalist Drone Navigation with World Models Sim2Real
Navigating a drone in unseen and cluttered environments requires reliable generalization to unseen scene layouts and understanding of environmental structure relative to the robot’s capabilities. Previous methods, which assume the same environment configuration, often rely heavily on human-designed perception pipelines and predefined rules to guide the robot toward the target. This process is…
Guyue Zhou Team ArXiv
2026-06-02 Reinforcement Learning from Cross-domain Videos with Video Prediction Model Sim2Real
Reinforcement learning from expert videos across visually distinct domains is challenging due to the absence of reward signals and the presence of domain gaps. We introduce XIPER (Cross-domain Video Prediction Reward), a reward model for learning from expert videos collected in a visually different domain, where the agent’s appearance differs due to factors such as color, morphology, or the…
Vincent François-Lavet Team ArXiv
2026-06-02 Exact equivariance, kept through training, buys zero-shot generalisation across the symmetry group Sim2Real
A latent world model built from an equivariant encoder $E$ and an equivariant predictor $f$ inherits a provable symmetry of its training loss: when the world’s dynamics genuinely carries a group $G$ acting on latents by an orthogonal representation $ρ(g)$, the one-step prediction relMSE is exactly invariant across the whole group, so fitting the dynamics on a restricted slice of orientations…
Hongbo Wang ArXiv
2026-06-02 DLO-Lab: Benchmarking Deformable Linear Object Manipulations with Differentiable Physics Sim2Real
We address the challenge of enabling robots to manipulate deformable linear objects (DLOs), such as ropes, cables, and rubber bands. Prior work has primarily focused on narrow, task-specific problems, often relying on real-world demonstrations or handcrafted heuristics. Such approaches, however, struggle to scale to the wide variety of materials and tasks encountered in practice, and collecting…
Chuang Gan Team ArXiv / Web
2026-06-01 Bridging the Sim-to-Real Gap in Semiconductor Visual Program Synthesis via Input Binarization Dexterous Sim2Real
Precise parametric control over circuit geometry is essential for semiconductor inspection, yet obtaining sufficient real training data remains costly. Although generative models such as diffusion models and Generative Adversarial Networks (GANs) can augment training data, they cannot guarantee the nanometer-scale geometric accuracy required for metrology tasks. We propose a visual program…
Tatsuya Sasaki Team ArXiv
2026-06-01 Symmetry-Aware 9D Pose Estimation with Sim(3)-Consistent Feature and Spherical Inception Convolution Dexterous Sim2Real
Object pose estimation is a fundamental problem for an agent system to perceive or manipulate objects in images or videos. However, current instance-level methods struggle with generalization to unseen objects. Category-level methods seek to address this, but remain constrained by the complexities of learning in the non-linear Sim(3) space and intra-class variations. To address these challenges,…
Naveed Akhtar Team ArXiv
2026-06-01 A Simulation Platform for Flapping-Wing Vehicles Sim2Real
Flapping-wing aerial vehicles (FWAVs) demonstrate remarkable agility but face substantial autonomy challenges due to their high sensitivity to aerodynamic disturbances and limited sensor payload capacity. Current simulation platforms typically rely on oversimplified laminar flow assumptions and idealized sensor models, failing to capture the complex turbulence patterns and perceptual limitations…
Tomi Westerlund Team ArXiv
2026-06-01 A combination of noise and bilateral filters achieve supralinear and scalable adversarial robustness in CNNs Sim2Real
The vulnerability of deep neural networks to adversarial examples poses a significant challenge for real-world deployment. Existing techniques to enhance deep network robustness rely on adversarial training, an approach that is powerful but computationally intensive and typically tailored to specific attack types. To address these limitations, existing works have explored techniques such as…
Pau Vilimelis Aceituno Team ArXiv
2026-06-01 PINNOCHIO: Physics-Informed Neural Network for Coupled Hyperelastic Interface-Volume Simulation in Orthognathic Surgery Sim2Real
Predicting patient-specific facial soft-tissue deformation is critical for iterative orthognathic surgery planning. However, current computational methods face a strict accuracy-efficiency trade-off: high-fidelity Finite Element Methods (FEM) are computationally prohibitive, whereas pure deep learning models often produce biomechanically inconsistent results. While Physics-Informed Neural…
Pingkun Yan Team ArXiv
2026-06-01 SCOPE: Real-Time Natural Language Camera Agent at the Edge Sim2Real
Deploying language-driven agents in robotics requires evaluations that reflect real-world task demands: natural-language instructions with reproducible outcomes. Such agents must connect language models to callable perception and control tools, and be assessed using deployment-critical metrics including latency, accuracy, and error modes. We present SCOPE (Simulation and Camera Operations for…
Pragyana Mishra Team ArXiv / Web
2026-06-01 RESCAST-100K: A Comprehensive Dataset for Cross-Domain Residential Load and Indoor Temperature Forecasting Sim2Real
Accurate short-term forecasting of residential energy load and indoor temperature is essential for home energy management systems, grid-level demand response, and community energy efficiency efforts. Domain adaptation and transfer learning have shown promise for improving forecasting accuracy under data heterogeneity and scarcity commonly seen in residential settings. However, progress is limited…
Simone Silvestri Team ArXiv
2026-05-31 Crazyflow: An Accurate, GPU-Accelerated, Differentiable Drone Simulator in JAX Sim2Real
High-quality, large-scale synthetic data from simulations is becoming a cornerstone for pushing the capabilities of robot algorithms. While aerial robotics simulators have evolved to support specialized needs such as fidelity, differentiability, and swarms independently, a unified platform that can synthesize data across all these domains is missing. In this work, we propose Crazyflow, a…
Angela P. Schoellig Team ArXiv
2026-05-31 Chirality-free photon routing via giant atoms in waveguide QED ladders Sim2Real
In this paper, we present an in-depth examination of single-photon routing in a multi-emitter waveguide quantum electrodynamics ladder with up to five giant atoms simultaneously coupled across two linear waveguides. Using a real-space approach, we analyze a non-chiral routing architecture and evaluate the impact of scaling the number of giant atoms in three topologically distinct configurations:…
Imran M. Mirza Team ArXiv
2026-05-31 PSF-like Alpha-Particle Events in LSST Images Sim2Real
Rare $α$-particle-induced charge clusters appear in LSST images as compact, PSF-like sources with a median FWHM of $0.\!\!^{\prime\prime}95$ and median ellipticity consistent with zero, closely resembling unresolved astrophysical point sources. These events are detected in both dark and science exposures at a rate of approximately $10^{-12}\ \mathrm{pixel}^{-1}\ \mathrm{s}^{-1}$. Their collected…
Eli S. Rykoff Team ArXiv
2026-05-30 Global-Local Attention Decomposition for Terrain Encoding in Humanoid Perceptive Locomotion Sim2Real
Although reinforcement learning has significantly advanced humanoid locomotion, perceptive policies still struggle on sparse-foothold terrain and constrained environments. Success in these scenarios requires both broad terrain awareness and precise foothold selection, two perceptual roles that conventional encoders often entangle. To address this challenge, we propose Global-Local Attention…
Yue Gao Team ArXiv
2026-05-30 A Four-Tier Communication Architecture and Sim-to-Real Validation of a Graphical Open-Source Platform for Robotic Engineering Education Sim2Real
The persistent challenge in scaling authentic manipulator education within university laboratories is a structural dichotomy: commercial digital twins are often cost-prohibitive and rigidly scripted, whereas open-source robotics middleware (ROS) imposes steep technical and syntax barriers for novices. To resolve this logistical and educational friction, this Work-in-Progress (WiP) paper proposes…
Jiong Jin Team ArXiv
2026-05-30 Too Much of a Good Thing: When sim2real Efforts Impede Policy Learning (And What to Do About It) Sim2Real
While sim2real efforts are necessary for effective policy transfer to hardware, there is such a thing as too much of a good thing. We argue that sim2real efforts have led to misaligned incentives with policy learning, resulting in simulator lock in and poor policy exploration due to the unreasonable constraints imposed by the real world. We offer a diagnosis and explanation of the current status…
Luis Sentis Team ArXiv
2026-05-29 RDGen: Demonstration Generation for High-Quality Robot Learning via Reinforcement Learning VLA Sim2Real
Vision-Language-Action (VLA) models have emerged as a promising paradigm for general-purpose robot control. However, their performance remains fundamentally constrained by the availability of high-quality robot trajectory data. In current robot learning practice, such data are primarily collected through human teleoperation, which is labor-intensive, costly, and difficult to scale. In this paper,…
Xinhai Sun Team ArXiv
2026-05-29 Feat2Go: Visual Feature-Grounded Value Estimation for Embodied Reinforcement Learning VLA Sim2Real
Reinforcement learning is a promising approach for improving the capabilities of vision-language-action (VLA) models while avoiding the heavy data requirements of imitation learning. However, its effectiveness for VLA models is often constrained by sparse supervision and the difficulty of designing informative reward signals for long-horizon manipulation. In this work, we present Feat2Go, a…
Yongtao Wang Team ArXiv
2026-05-29 Learning Controlled Separation of Small Objects Between Two Fingers with a Tactile Skin Sim2Real Tactile
We introduce and solve the novel task of controlled separation of small objects with two fingers of a multi-purpose robotic hand: after grasping into a box of small objects, the task is to drop as many of them until a desired number remains between the fingers. The objects are small compared to the width of the fingers but also in absolute terms. In our case little pellets with a diameter of only…
Berthold Bäuml Team ArXiv
2026-05-29 Batched Differentiable Rigid Body Dynamics in PyTorch for GPU-Accelerated Robot Learning Sim2Real
As robot control shifts toward large-scale reinforcement learning with in-loop dynamics computation, the community’s reliance on CPU-bound libraries such as Pinocchio creates a throughput bottleneck in GPU-based training pipelines. We present BARD (Batched Articulated Rigid-body Dynamics), a self-contained PyTorch implementation of Featherstone’s rigid-body dynamics algorithms, optimized for…
Zhaoxing Li Team ArXiv
2026-05-29 Dirac-Phase CP-Violation in the Low-Scale Type-I Seesaw with Three Right-Handed Neutrinos Sim2Real
We study the low-scale type-I seesaw with three right-handed neutrinos (i.e. heavy Majorana neutrinos) when the CP-violation arises solely from the low-energy Dirac phase $δ$ of the Pontecorvo-Maki-Nakagawa-Sakata (PMNS) neutrino mixing matrix and the heavy neutrinos have testable mixings. We derive a CP-conserving and non-real structure of the $3\times 3$ orthogonal matrix entering the…
S. T. Petcov Team ArXiv
2026-05-29 Scaling Multi-Hop Training Data via Graph-Constrained Path Selection Sim2Real
Endowing large language models with compositional reasoning over specialized documents requires multi-hop training data at scale, where such data rarely exists outside of curated benchmarks built on structured sources. To construct it directly from plain, unannotated text, existing methods ask a single teacher model to jointly discover an evidence path through a document and verbalize it as a…
Yike Guo Team ArXiv
2026-05-29 Robust class-gated single-pixel diffractive optical neural network with random-aberration-aware training Sim2Real
Optical computing offers the theoretical potential for high-speed, energy-efficient inference, yet its practical deployment remains constrained by fundamental input-output bottlenecks, particularly the reliance on electronic sensors with limited frame rates and stringent alignment requirements between optical components. Here, we demonstrate an image-class-gated single-pixel DONN that overcomes…
Jun-Jun Xiao Team ArXiv
2026-05-29 TALON: Token-Aligned Lightweight Adapters for 6-DoF Spacecraft Pose Estimation Sim2Real
Monocular 6-DoF spacecraft pose estimation methods predominantly process individual frames, discarding the temporal information present in an image sequence acquired during spacecraft manoeuvres. Few temporal approaches require full backbone fine-tuning or auxiliary optical flow networks, risking catastrophic forgetting or increasing computational cost, respectively. We propose TALON…
Djamila Aouada Team ArXiv
2026-05-29 QVGGT: Post-Training Quantized Visual Geometry Grounded Transformer Sim2Real
Estimating 3D attributes directly from images has advanced rapidly with the Visual Geometry Grounded Transformer (VGGT), which predicts camera parameters, depth maps, and point clouds in a single forward pass. However, its 1.2B-parameter scale severely limits deployment on resource-constrained platforms such as UAVs and mobile AR devices. To address this limitation, we introduce QVGGT, a tailored…
Huan Wang Team ArXiv / Web
2026-05-29 Imaging the Magnetically Driven Reconstruction of the Electronic States in the Antiferromagnetic Topological Insulator EuSn$_2$As$_2$ Sim2Real
The realization of the axion insulator phase in magnetic topological insulators is often hindered by crystalline symmetries that protect gapless surface states, even when time-reversal symmetry is broken. Here, we use variable-temperature scanning tunneling microscopy (STM) and spectroscopy (STS), complemented with density functional theory (DFT), to investigate the local electronic structure of…
Pegor Aynajian Team ArXiv
2026-05-29 DRL-Based Pose Control for Double-Ackermann Robots Under Actuation Uncertainties Sim2Real
Robust deployment of deep reinforcement learning (DRL) policies on real robots remains challenging due to discrepancies between simulation and real-world dynamics. We address this issue in the context of maneuvering with double-Ackermann-steering mobile robots, which introduce additional constraints due to their non-holonomic nature. Building upon the DRL framework ManeuverNet, we extend its…
Olivier Ly Team ArXiv
2026-05-28 YoCausal: How Far is Video Generation from World Model? A Causality Perspective HF-Hot Sim2Real
As video diffusion models (VDMs) advance toward world models, a key question arises: do they truly understand causality, or merely overfit to statistical temporal patterns? Existing benchmarks mostly rely on synthetic data, limiting real-world generalization due to the sim-to-real gap. We present YoCausal, a two-level benchmark inspired by the Violation of Expectation (VoE) paradigm from…
Zhixiang Wang Team ArXiv
2026-05-28 VE2VF: Vision-Enabled to Vision-Free Distillation via Real-world Reinforcement Learning for Robust Contact-Rich Manipulation Manipulation Sim2Real
When using reinforcement learning (RL) for contact-rich robotic manipulation, vision can provide task-relevant information that accelerates learning beyond what proprioception alone can achieve. However, vision-enabled policies tend to overfit to the visual conditions seen during training, limiting their robustness and transferability. We present a human-in-the-loop RL framework that employs…
Dongheui Lee Team ArXiv
2026-05-28 PhAIL: A Real-Robot VLA Benchmark and Distributional Methodology VLA Sim2Real
Real-world evaluation of vision-language-action (VLA) policies still rests on binary success rate at a fixed timeout with $N \le 25$ rollouts per condition, almost always without confidence intervals or paired statistical comparison; these cohort sizes struggle to resolve close comparisons reliably. We introduce PhAIL (Physical AI Leaderboard, https://phail.ai), an open real-robot benchmark on a…
Sergey Arkhangelskiy ArXiv / Web
2026-05-28 Prior Availability in Industrial Visual Sim-to-Real: A Review of CAD-Guided and CAD-Unavailable Regimes Sim2Real HF-Hot
Industrial visual sim-to-real is often described as transferring from synthetic images to real images, but industrial deployment usually involves a broader mismatch between available evidence and required decisions. A system may be built from CAD renderings, simulated RGB-D observations, normal reference images, synthetic defects, pretrained feature spaces, or language prompts, yet deployed under…
Seung-Kyum Choi Team ArXiv
2026-05-28 CoMo3R-SLAM: Collaborative Monocular Dense SLAM with Learned 3D Reconstruction Priors for Outdoor Multi-Agent Systems Sim2Real
Collaborative dense SLAM is essential for multi-robot teams to achieve scalable and consistent 3D perception across large-scale outdoor environments. Existing systems typically depend on depth sensors, incurring significant payload, power, and calibration costs. Monocular RGB cameras are a lightweight alternative, but collaborative monocular dense SLAM remains difficult due to scale ambiguity,…
Baoru Huang Team ArXiv
2026-05-28 Battery-Sim-Agent: Leveraging LLM-Agent for Inverse Battery Parameter Estimation Sim2Real
Parameterizing high-fidelity “digital twins” of batteries is a critical yet challenging inverse problem that hinders the pace of battery innovation. Prevailing methods formulate this as a black-box optimization (BBO) task, employing algorithms that are sample-inefficient and blind to the underlying physics. In this work, we introduce a new paradigm that reframes the inverse problem as a reasoning…
Jiang Bian Team ArXiv
2026-05-27 Beyond Binary: Sim-to-Real Dexterous Manipulation with Physics-Grounded Contact Representation Dexterous Manipulation Sim2Real
A primary bottleneck in contact-rich manipulation is the difficulty of collecting real-world data. Sim-to-real reinforcement learning offers a scalable alternative, but the simulation-reality gap prevents information-dense modalities like touch from being effectively used. Existing sim-to-real methods often mitigate this gap by simplifying tactile data into coarse low-dimensional features –…
Toru Lin Team ArXiv / Web
2026-05-27 The Best-Laid SCHEMEs: Coordinated Sabotage and Monitoring in Multi-Agent Systems Sim2Real
As agentic coding systems decompose work across multiple model instances, a critical safety question is whether those instances can coordinate to achieve a hidden malicious objective while remaining aligned with user intent. We introduce SCHEME, a benchmark of 17 task instances across 7 settings and 8 real open-source libraries, each pairing a legitimate software-engineering task with a covert…
Pablo Bernabeu-Pérez Team ArXiv
2026-05-27 Bridging the Sim-to-Real Gap in Reinforcement Learning-Based Industrial Dispatching through Execution Semantics Sim2Real
Event-driven scheduling policies are increasingly deployed in industrial environments, where decisions are made under asynchronous and partially observed system states. As a result, decision states are not temporally consistent, action admissibility is not explicitly defined, and the origin of execution errors remains ambiguous. These issues limit both reliability and interpretability. To…
Noah Klarmann Team ArXiv
2026-05-27 SPRINT: Efficient Spectral Priors for Humanoid Athletic Sprints Sim2Real
The pursuit of humanoid athletic sprints is hindered by a scarcity of humanoid-viable kinematic reference data and the inability of existing frameworks to maintain stability during sprints. To overcome these limitations, we introduce SPRINT, a novel framework driven by efficient, frequency-adaptive spectral priors. By characterizing the fundamental periodicity of human locomotion in the frequency…
Huimin Lu Team ArXiv
2026-05-27 POINav: Benchmarking and Enhancing Final-Meters Arrival in Real-World Vision-Language Navigation Sim2Real
Real-world navigation is fundamentally driven by Points of Interest (POIs), yet reaching a precise POI remains a critical “final-meters” challenge. Existing Vision-Language Navigation (VLN) benchmarks of POI-goal navigation often suffer from coarse granularity or significant sim-to-real gaps due to generated scene. To bridge this gap, we present POINav-Bench, the first benchmark designed for…
Mu Xu Team ArXiv
2026-05-26 Transferable Reinforcement Learning via Probabilistic Latent Embeddings and Dynamic Policy Adaptation for Sim-to-Real Deployment Sim2Real
Due to limited resources and public safety concerns, deep reinforcement learning (RL) agents for many cyber-physical systems (e.g., autonomous vehicles) are first trained in simulators. However, when deployed in real world environments, they often suffer from performance degradation or safety violations because of the inevitable Sim2Real gap. Existing zero-shot approaches, such as robust safe RL…
Yiheng Feng Team ArXiv
2026-05-25 VesselSim: learning 3D blood vessel segmentation without expert annotations Sim2Real
Blood vessel segmentation is a core task in medical image analysis for the care of vascular diseases and surgical planning, yet the challenges of providing expert vascular annotations pose a major obstacle for the progress of related deep learning techniques. To address this, we propose VesselSim, a two-stage framework for universal 3D blood vessel segmentation that eliminates the need for real…
Yiming Xiao Team ArXiv
2026-05-24 MuJoCoUni: Persistent Batched Runtime Primitives for MuJoCo Sim2Real
We present MuJoCoUni, a downstream MuJoCo distribution for online robot learning and batched physics evaluation. Alongside the open-loop batched trajectory generation already provided by upstream mujoco.rollout, MuJoCoUni supplies runtime primitives for stateful environment execution. The target workloads need high-throughput parallel execution while retaining upstream CPU MuJoCo semantics for…
Junzhe Wu Team ArXiv
2026-05-23 Vision-Guided Outdoor Flight and Obstacle Evasion via Reinforcement Learning Sim2Real
Although quadcopters boast impressive traversal capabilities enabled by their omnidirectional maneuverability, the need for continuous pilot control in complex environments impedes their application in GNSS and telemetry-denied scenarios. To this end, we propose a novel sensorimotor policy that uses stereo-vision depth and visual-inertial odometry (VIO) to autonomously navigate through obstacles…
Avideh Zakhor Team ArXiv
2026-05-22 Robotic Strawberry Harvesting with Robust Vision and Deep Reinforcement Learning based Sim-to-Real Control Sim2Real
This study presents a closed-loop robotic strawberry harvesting system that combines a robust vision module, simulation-trained deep reinforcement learning (DRL) control, and ROS-based real-robot execution. For perception, we propose HRAttnEdge-YOLO26-seg, a modified YOLO26-seg architecture that incorporates a high-resolution P2 branch, segmentation-path attention, and edge-supervised prototype…
Azlan Zahid Team ArXiv
2026-05-21 CoRMA: Contrastive RMA for Contact-Rich Meta-Adaptation Sim2Real
We present CoRMA (Contrastive Robotic Motor Adaptation), a context-based meta-adaptation framework that modifies RMA for force-dominant assembly. CoRMA replaces raw simulator-parameter adaptation with a compact 6D simulator-only semantic contact context describing contact onset, lateral engagement, guided transition, contact direction, and jamming. A deployable causal Transformer adapter infers…
Jianqiao Zhu Team ArXiv
2026-05-15 DexJoCo: A Benchmark and Toolkit for Task-Oriented Dexterous Manipulation on MuJoCo HF-Hot Sim2Real
Achieving human-level manipulation requires dexterous robotic hands capable of complex object interactions. Advancing such capabilities further demands standardized benchmarks for systematic evaluation. However, existing dexterous benchmarks lack tasks that reflect the unique manipulation capabilities of dexterous hands over parallel grippers, as well as comprehensive evaluation pipelines. In…
Tieniu Tan Team ArXiv
2026-05-15 Adaptive Outer-Loop Control of Quadrotors via Reinforcement Learning Sim2Real
Deep Reinforcement Learning (DRL) for quadrotor flight control typically relies on Domain Randomization (DR) for sim-to-real transfer, resulting in overly conservative policies that struggle with dynamic disturbances. To overcome this, we propose a novel adaptive control architecture that actively perceives and reacts to instantaneous perturbations. First, we train an optimal outer-loop policy,…
Moble Benedict Team ArXiv
2026-05-15 parallelcbf: A composable safety-filter and auditability framework for tensor-parallel reinforcement learning Sim2Real
While Isaac Lab provides massive parallel UAV simulation, OmniSafe and safe-control-gym provide constrained-RL benchmarks, and CBFKit provides control-barrier-function synthesis tooling, no existing framework unifies these capabilities for end-to-end safety-constrained training. ParallelCBF is the first framework to unify (i) tensor-parallel UAV environments, (ii) hard-gate CBF safety filters,…
Yuyin Ma Team ArXiv
2026-05-12 NavOL: Navigation Policy with Online Imitation Learning Manipulation Sim2Real
Learning robust navigation policies remains a core challenge in robotics. Offline imitation learning suffers from distribution shift and compounding errors at rollout, while reinforcement learning requires reward engineering and learns inefficiently. In this paper, we propose NavOL, an online imitation learning paradigm that interacts with a simulator and updates itself using expert…
Li Zhang Team ArXiv / Web
2026-05-12 When Simulation Lies: A Sim-to-Real Benchmark and Domain-Randomized RL Recipe for Tool-Use Agents Sim2Real
Tool-use language agents are evaluated on benchmarks that assume clean inputs, unambiguous tool registries, and reliable APIs. Real deployments violate all these assumptions: user typos propagate into hallucinated tool names, a misconfigured request timeout can stall an agent indefinitely, and duplicate tool names across servers can freeze an SDK. We study these failures as a sim-to-real gap in…
Xiyang Hu Team ArXiv / Web
2026-05-10 Zero-Shot Sim-to-Real Robot Learning: A Dexterous Manipulation Study on Reactive Catching Dexterous Manipulation Sim2Real HF-Hot
Dexterous manipulation is physics-intensive and highly sensitive to modeling errors and perception noise, making sim-to-real transfer prohibitively challenging. Domain randomization (DR) is commonly used to improve the robustness of learned policies for such tasks, but conventional DR randomizes one instance per episode, offering very limited exposure to the variability of real-world dynamics. To…
Kaiyu Hang Team ArXiv
2026-05-08 TimeLesSeg: Unified Contrast-Agnostic Cross-Sectional and Longitudinal MS Lesion Segmentation via a Stochastic Generative Model Sim2Real
Multiple sclerosis (MS) expresses substantial clinical and radiological heterogeneity, which poses significant challenges for automatic lesion segmentation. The current deep learning-based SOTA is highly susceptible to changes in both distribution, e.g., changes in scanner; as well as the structure of inputs, evident in the current divide between cross-sectional and longitudinal approaches. We…
Ferran Prados Team ArXiv
2026-05-08 SceneFactory: GPU-Accelerated Multi-Agent Driving Simulation with Physics-Based Vehicle Dynamics Sim2Real
Autonomous-driving simulators typically trade physical fidelity for scalable parallelism. Physics-based platforms such as CARLA and MetaDrive provide articulated vehicle dynamics and contact, but their non-vectorized interfaces make batched training difficult. GPU-batched systems such as Waymax and GPUDrive scale to hundreds of scenarios by replacing rigid-body physics with simplified kinematic…
Zilin Bian Team ArXiv
2026-05-05 On Surprising Effects of Risk-Aware Domain Randomization for Contact-Rich Sampling-based Predictive Control Sim2Real
Domain randomization (DR) is widely used in policy learning to improve robustness to modeling error, but remains underexplored in contact-rich sampling-based predictive control (SPC), where rollout quality is highly sensitive to uncertainty. In this work, we take the first step by studying risk-aware DR in predictive sampling on a simple yet representative Push-T task, comparing average,…
Aaron D. Ames Team ArXiv
2026-05-04 SIAM: Head and Brain MRI Segmentation from Few High-Quality Templates via Synthetic Training Sim2Real
Synthetic training has recently advanced brain MRI segmentation by enabling contrast-agnostic models trained entirely on generated data. However, most existing approaches rely on hundreds of automatically labeled templates, introducing systematic biases and limiting their flexibility to incorporate new anatomical structures. We present the Segment It All Model (SIAM), a 3D whole-head segmentation…
Reuben Dorent Team ArXiv
2026-05-04 Sim-to-Real Transfer and Robustness Evaluation of Reinforcement Learning Control with Integrated Perception on an ASV for Floating Waste Capture Sim2Real
Autonomous surface vessels for floating-waste removal operate under varying hydrodynamics, external disturbances, and challenging water-surface perception. We present a field-validated system that combines camera-based polarimetric perception with a lightweight DRL-based controller for floating-waste detection and capture. Camera detections are converted into water-surface target points and…
Cédric Pradalier Team ArXiv
2026-05-03 DexSim2Real: Foundation Model-Guided Sim-to-Real Transfer for Generalizable Dexterous Manipulation Dexterous Sim2Real
Sim-to-real transfer remains a critical bottleneck for deploying dexterous manipulation policies learned in simulation to real-world robots. Existing approaches rely on manually designed domain randomization or task-specific adaptation, limiting their generalizability across diverse manipulation scenarios. We present DexSim2Real, an integrated framework that leverages vision-language foundation…
Yuhao Liao Team ArXiv
2026-05-01 Learning to Race in Minutes: Infoprop Dyna on the Mini Wheelbot Sim2Real
Reinforcement Learning (RL) has the potential to enable robots with fast, nonlinear, and unstable dynamics to reach the limits of their performance. However, most recent advances rely on carefully designed physics-based simulators and domain randomization to achieve successful sim-to-real transfer within reasonable wall-clock time. In this work, we bypass the need for such simulators and…
Sebastian Trimpe Team ArXiv
2026-04-27 AsyncShield: A Plug-and-Play Edge Adapter for Asynchronous Cloud-based VLA Navigation Sim2Real
While Vision-Language-Action (VLA) models have been demonstrated possessing strong zero-shot generalization for robot control, their massive parameter sizes typically necessitate cloud-based deployment. However, cloud deployment introduces network jitter and inference latency, which can induce severe spatiotemporal misalignment in mobile navigation under continuous displacement, so that the stale…
Mu Xu Team ArXiv
2026-04-26 Unleashing the Agility of Wheeled-Legged Robots for High-Dynamic Reflexive Obstacle Evasion Sim2Real
Wheeled-legged robots combine the energy efficiency of wheeled locomotion with the terrain adaptability of legged systems, making them promising platforms for agile mobility in complex and dynamic environments. However, enabling high-dynamic reflexive evasion against fast-moving obstacles remains challenging due to the hybrid morphology, mode coupling, and non-holonomic constraints of such…
Ce Hao Team ArXiv
2026-04-25 GreenDyGNN: Runtime-Adaptive Energy-Efficient Communication for Distributed GNN Training Sim2Real
Distributed GNN training is dominated by remote feature fetching, which can be very costly. Multi-hop neighborhood sampling crosses partition boundaries and triggers fine-grained RPCs whose fixed initiation cost and GPU-stall latency waste energy. Prior systems try to reduce this overhead with presampling and static caching, but cache policies cannot react to runtime network variation. We show…
M. S. Q. Zulkar Nine Team ArXiv
2026-04-23 Cross-Domain Data Selection and Augmentation for Automatic Compliance Detection Sim2Real
Automating the detection of regulatory compliance remains a challenging task due to the complexity and variability of legal texts. Models trained on one regulation often fail to generalise to others. This limitation underscores the need for principled methods to improve cross-domain transfer. We study data selection as a strategy to mitigate negative transfer in compliance detection framed as a…
Dusica Marijan Team ArXiv