Updated on 2026.06.11
Sim2Real
| Publish Date | Title & Abstract | Authors | Links |
|---|---|---|---|
| 2026-06-09 | A Practical Recipe Towards Improving Sim-and-Real Correlation for VLA Evaluation<br>VLA Sim2Real<br>Simulation has become an essential tool for evaluating and improving vision-language-action (VLA) policies, offering scalable, reproducible, and controllable alternatives to costly real-world robot evaluation. Recent simulation benchmarks have made substantial progress on realism and diversity, yet these platforms have not been widely adopted as reliable proxies for real-world policy evaluation…. | Yang Gao Team | ArXiv |
| 2026-06-09 | Nonequilibrium Green Functions Simulations for Large Correlated Systems<br>Sim2Real<br>Correlated real-time dynamics in large, spatially inhomogeneous quantum systems remain difficult to access with nonequilibrium many-body methods. Two-time nonequilibrium Green functions (NEGF) retain dynamical correlations but their computational runtime grows cubically with the number of time steps $N_\mathrm{t}$. This scaling bottleneck could recently be overcome by introducing the G1–G2… | Jan-Philip Joost Team | ArXiv |
| 2026-06-09 | Updating the PATH framework with FRB host galaxy models<br>Sim2Real<br>Over a hundred fast radio burst (FRB) host galaxies have now been identified, enabling both comparisons of host redshift with FRB dispersion measure to study the cosmological distribution of ionised gas, and analyses of host properties in order to identify FRB progenitors. The standard method for determining the most likely FRB host galaxy in an optical image is the Bayesian framework… | S. D. Ryder Team | ArXiv |
| 2026-06-09 | A Comprehensive Inference-Time Augmentation Framework in Physiological Signals: Application to PPG-Based AF Detection<br>Sim2Real<br>Objective: Accurate classification of physiological signals in real-world deployments is challenged by sensor noise, motion artifacts, and distribution shifts between training and deployment data. Inference-time augmentation (ITA), which applies augmentations during inference rather than retraining, offers a simple, model-agnostic mechanism to improve robustness. However, ITA application to… | Xiao Hu Team | ArXiv |
| 2026-06-08 | ReCoVLA: VLM-Guided Reward Compilation for Failure Recovery in Vision-Language-Action Policies<br>Dexterous Manipulation VLA Sim2Real<br>Vision-language-action (VLA) policies provide strong priors for language-conditioned manipulation, but remain brittle in off-nominal states requiring targeted recovery. We propose ReCoVLA – a failure-conditioned residual recovery framework that keeps a pretrained VLA policy frozen, uses an external vision-language model (VLM) to infer the failure mode and recovery stage, and compiles a… | Toshiaki Koike-Akino Team | ArXiv |
| 2026-06-08 | Autonomous Obstacle Removal for Excavators through Policy Learning with Particle Simulation<br>Sim2Real<br>Autonomous obstacle removal from the ground is an important earthwork task, but this is difficult to automate because an excavator must adapt its excavation trajectories over repeated cycles as soil-obstacle conditions change. Learning such state-dependent behavior requires a training environment that reproduces accumulated soil-obstacle interactions, including contact states, terrain… | Takamitsu Matsubara Team | ArXiv |
| 2026-06-08 | Bridged SBI: Correcting Biased Low-Fidelity Posteriors for Cost-Efficient High-Fidelity Inference<br>Sim2Real<br>Accurate calibration of particle-based simulators is crucial for robotic earthwork simulation, but analytical calibration is challenging due to this task’s highly nonlinear particle dynamics and the black-box nature of conventional simulators. Although simulation-based inference (SBI) can estimate posterior distributions over simulation parameters solely from forward simulations, applying SBI… | Takamitsu Matsubara Team | ArXiv |
| 2026-06-08 | RealMath-Eval: Why SOTA Judges Struggle with Real Human Reasoning<br>Sim2Real<br>While Large Language Models (LLMs) have achieved near-perfect performance in *solving* high-school mathematics, their ability to *evaluate* the diverse reasoning processes of real human students remains under-examined. To bridge this gap, we introduce RealMath-Eval, a rigorously annotated benchmark of 224 real-world exam responses from high schools. Our initial evaluation… | Xiangfeng Wang Team | ArXiv / Web |
| 2026-06-08 | ABot-Earth 0.5: Generative 3D Earth Model<br>Sim2Real<br>We present ABot-Earth 0.5, a generative 3D framework designed to synthesize vast, seamless 3D environments from ubiquitous, geospatially referenced satellite imagery. To achieve this, we propose a novel generative model formulated directly with the 3D Gaussian Splatting (3DGS) representation. The model is trained on a diverse corpus of existing real-world urban reconstructions, learning to… | Hang Zhang Team | ArXiv / Web |
| 2026-06-07 | Video2Sim2Real: Full-Stack Autonomous Dexterous Skill Acquisition from a Single Human Video<br>Sim2Real<br>Human manipulation videos are a convenient and intuitive source for robot learning. However, directly transferring human dexterity to robots remains challenging due to perception errors and embodiment gap. To address this, we introduce Video2Sim2Real, a full-stack framework for autonomous skill acquisition from a single human manipulation video. Our framework first uses off-the-shelf foundation… | Harish Ravichandar Team | ArXiv / Web |
| 2026-06-07 | IR-SIM: A Lightweight Skill-Native Simulator for Navigation, Learning, and Benchmarking<br>Sim2Real<br>Simulation plays a key role in automated robotics research supported by large language models (LLMs). However, existing simulators often require custom code or complex interfaces, creating a barrier to rapid prototyping and automated algorithm development. To this end, we propose the Intelligent Robot Simulator (IR-SIM), a lightweight skill-native navigation simulator designed for rapid scenario… | Hengshuang Zhao Team | ArXiv / Web |
| 2026-06-07 | PhysGraph: A Physics-aware 3D Scene Graph for Perception and Reasoning<br>Sim2Real<br>To perform a wide range of daily tasks, robots need to construct a 3D representation that is semantically rich, physically grounded, and structured enough to support task planning and affordance prediction. However, existing approaches primarily focus on semantic retrieval, often overlooking physical and kinematic factors. Methods that attempt to model physical properties typically rely on narrow… | Xianyi Cheng Team | ArXiv |
| 2026-06-07 | HARBOR: A Harness Framework for Agentic Robot Reinforcement Learning<br>Sim2Real<br>Reinforcement learning (RL) has become a powerful paradigm for robot learning, particularly in sim-to-real settings, but its broader adoption remains limited by the engineering pipeline surrounding the algorithms. Building tasks, shaping rewards, and tuning hyperparameters require substantial expert effort, making RL workflows costly and difficult to scale. We introduce HARBOR, an agentic… | Georgia Chalvatzaki Team | ArXiv |
| 2026-06-07 | OASIS: From Simulation Data Collection to Real-World Humanoid Loco-Manipulation<br>Sim2Real LearnedControl<br>Recent progress in robot manipulation has been largely driven by learning from large-scale demonstrations. For humanoid robot loco-manipulation tasks, however, existing data sources force an unsatisfying tradeoff between trajectory quality and scalability. Real-world teleoperation provides the highest-quality trajectories but requires dedicated physical space and time-consuming scene resets…. | Xuelong Li Team | ArXiv / Web |
| 2026-06-07 | Autonomous Aerial Manipulation via Contextual Contrastive Meta Reinforcement Learning<br>Sim2Real<br>Unmanned aerial vehicles (UAVs) are increasingly being deployed in logistics, service robotics, and other real-world applications, creating a growing demand for autonomous payload acquisition and delivery. Existing approaches typically assume pre-attached payloads or rely on specialized grippers, leaving versatile end-to-end aerial delivery largely unresolved, where different payloads induce… | Yang Yu Team | ArXiv |
| 2026-06-06 | G2G: Exploiting Intra-Group Geometry for Inter-Group Pose Estimation<br>Sim2Real<br>Recovering the relative 6-DoF pose between two image groups underlies cross-sequence relocalization and multi-camera rig odometry. Each group carries known intra-group geometry from visual odometry or rig calibration, and pretrained multi-view backbones already fuse such geometry into visual features. Yet current models treat all views as an unstructured set, leaving cross-group reasoning as the… | Yanmei Jiao Team | ArXiv |
| 2026-06-06 | Closing the Sim-to-Real Gap: An Evaluation Framework for Autonomous Cyber Defense Configuration of Commercial EDR<br>Sim2Real<br>Leading commercial endpoint detection and response (EDR) products have shifted from operator-configured rule sets to multi-component systems where autonomous AI components operate alongside, and increasingly in place of, operator-deployed policies. Autonomous defense agents using commercial EDR as their hardening tool are no longer tuning a passive tool, but a black-box autonomous system capable… | Lilianne Brush Team | ArXiv |
| 2026-06-05 | Simulation-Driven Imitation Learning for Biosignals-Free Shared-Autonomy Prosthetic Grasping<br>Dexterous Manipulation Sim2Real<br>Biosignals-free shared-autonomy control of upper-limb prosthetic hands aims to enable natural and low-effort manipulation without relying on EMG or other physiological signals. Recent imitation-learning-based approaches have shown promising results, but their scalability is limited by the cost and variability of collecting large amounts of real-world human demonstration data. In this work, we… | Xianta Jiang Team | ArXiv |
| 2026-06-05 | QuadVerse: An Integrated Framework Aligning Visual-Physical Reality for Quadruped Simulation<br>Manipulation Sim2Real<br>Simulation is central to robot learning, yet the sim-to-real gap remains a major bottleneck. Existing approaches often tackle visual or dynamic gaps separately, overlooking how these individual mismatches accumulate and propagate throughout the robot’s state evolution. In this paper, we introduce QuadVerse, an integrated framework that uses reconstructed scenes as a calibration substrate for… | Jin Xie Team | ArXiv |
| 2026-06-05 | Supervision versus Demonstration-Based In-Context Learning for Multiword Expression Classification<br>Sim2Real<br>Turkish idiomatic light verb constructions (LVCs) are challenging for multiword expression processing because they often share the same surface form as fully literal verb-object combinations while functioning as a single, partially idiomatic predicate. We frame Turkish LVC detection as a binary classification task (literal meaning vs. idiomatic meaning) and evaluate on a manually created… | Yusuf Şimşek Team | ArXiv |
| 2026-06-05 | The Sim-to-Real Gap of Foundation Model Agents: A Unified MDP Perspective<br>Sim2Real<br>Foundation model agents are increasingly deployed for real-world decision-making, but suffer from the sim-to-real gap. While robotics and classical control have mature frameworks to address this gap, the foundation model community is treating agent robustness as an entirely novel phenomenon. Our paper proposes formalizing the foundation model agent evaluation and training gap as a classical… | Hua Wei Team | ArXiv |
| 2026-06-05 | Learning All-Terrain Locomotion for a Planetary Rover with Actively Articulated Suspension<br>Sim2Real<br>This paper presents ERNEST, a four-wheeled planetary rover concept equipped with a two-degree-of-freedom Active Gimbal Suspension that combines yaw and roll actuation to enable wheel reconfiguration, steering, and active load redistribution. A single neural network controller, trained to track a desired path across challenging terrain, fully unlocks the capabilities of this actuated suspension… | Hari Nayar Team | ArXiv |
| 2026-06-05 | C3VD-DEFCOL: A Deformable Colonoscopy Dataset with Time-Resolved 3D Ground Truth and Realistic Appearance<br>Sim2Real<br>3D reconstruction could improve colonoscopy by estimating mucosal coverage and alerting clinicians to missed regions during screening. However, algorithm development is limited as no current datasets provide both a realistic in vivo appearance and dense, time-resolved 3D ground truth, especially under non-rigid deformation. We present C3VD-DEFCOL, a framework and dataset for evaluating deformable… | Nicholas J. Durr Team | ArXiv |
| 2026-06-05 | Length-resolved Operator Growth and Path-Entropy Obstructions to Many-Body Localization<br>Sim2Real<br>For the disordered Ising chain with transverse and longitudinal fields, where couplings and fields are drawn from strictly positive distributions, Cao has shown that the moments $μ_{2k} = \lVert [H,σ^z_0]^{(k)} \rVert_2^2$ grow almost factorially, $μ_{2k}^{1/(2k)}\sim k/\ln k$, and thus asymptotically at the maximal allowed rate. We generalize this result by resolving the operator norm in… | J. Sirker | ArXiv |
| 2026-06-04 | TAM: Torque Adaptation Module for Robust Motion Transfer in Manipulation<br>Dexterous Sim2Real<br>A policy tuned for one robot often behaves differently on another, whether due to the sim-to-real gap, unknown payloads, or the differing dynamics of two instances of the same robot. In contact-rich, dynamic manipulation, even small motion discrepancies can result in failure to track reference motion, since they disrupt the timing and modes of contact. Common remedies, such as domain… | Dieter Fox Team | ArXiv |
| 2026-06-04 | Towards Realistic 3D Sonar Simulation<br>Sim2Real<br>As underwater robotics research increasingly addresses complex 3D perception and autonomous navigation, the fidelity of sonar simulation has become a key factor in algorithm development. Current simulation frameworks typically rely on geometry-driven rendering, approximating 3D sonar as an underwater equivalent to LiDAR, which fails to account for fundamental acoustic phenomena such as… | Enrico Simetti Team | ArXiv |
| 2026-06-04 | LadderMan: Learning Humanoid Perceptive Ladder Climbing<br>Sim2Real<br>Humanoid robots hold great promise for operating in human-centered environments, yet ladder climbing remains one of the most challenging tasks due to sparse footholds and handholds, complex whole-body coordination, and sensitivity to perception and control errors. We present LadderMan, a unified system that enables humanoid robots to robustly climb diverse ladders and perform… | Guanya Shi Team | ArXiv |
| 2026-06-03 | GRAIL: Generating Humanoid Loco-Manipulation from 3D Assets and Video Priors<br>Dexterous Sim2Real LearnedControl<br>Scaling humanoid loco-manipulation requires robot-compatible demonstrations across diverse objects, whole-body motions, and scene geometries, but teleoperation and motion capture are difficult to scale because each collection depends on physical setups, instrumented actors, and robot operation. We present GRAIL, a digital generation pipeline that remains fully virtual until deployment: it… | Ye Yuan Team | ArXiv / Web |
| 2026-06-03 | M3imic: Learning a Versatile Whole-Body Controller for Multimodal Motion Mimicking<br>Dexterous Sim2Real LearnedControl<br>Building a general-purpose whole-body controller is essential for enabling diverse motion capabilities in humanoid robots across a wide range of downstream tasks, including locomotion and loco-manipulation. Different tasks rely on distinct motion reference modalities: locomotion primarily depends on coordinated robot joint trajectories, whereas manipulation requires precise end-effector… | Shengbo Eben Li Team | ArXiv |
| 2026-06-03 | Generalization of World Models under Environmental Variability for Vision-based Quadrotor Navigation<br>Sim2Real<br>World models, learned generative models that predict how an environment evolves, have become a promising tool for sample-efficient robot learning. Yet how robust they are to environmental variability remains poorly understood. To address this, we conduct a systematic study using vision-based quadrotor navigation as a testbed problem, training DreamerV3-based world models under varying levels of… | Kostas Alexis Team | ArXiv |
| 2026-06-03 | WAM-Nav: Asymmetric Latent World-Action Modeling for Unified Visual Navigation<br>Sim2Real<br>Visual navigation requires generating smooth and collision-free trajectories under complex geometric and physical constraints. Existing reactive policies that directly map observations to actions lack anticipatory reasoning, limiting their ability to proactively avoid obstacles. While visual imagination offers predictive foresight, conventional modular approaches separate scene prediction from… | Nianfeng Liu Team | ArXiv |
| 2026-06-03 | MoDex: A Diffusion Policy for Sequential Multi-Object Dexterous Grasping<br>Sim2Real<br>This work addresses sequentially grasping multiple objects with a single dexterous hand without releasing those already held. Most dexterous grasping methods commit all of the hand’s degrees of freedom to a single object, underutilizing its dexterity and leaving no redundancy for subsequent grasps. The proposed solution, MoDex, is a diffusion policy that predicts the next gripper pose directly… | Danica Kragic Team | ArXiv |
| 2026-06-03 | Learning Manifold and Itô Dynamics with Branched Neural Rough Differential Equations<br>Sim2Real<br>Neural rough differential equations (NRDEs) stay accurate under irregular sampling while taking far fewer integration steps than standard neural differential equations, summarising a finely sampled driver by its log-signature and advancing the hidden state over coarse intervals using the log-ODE method. This efficiency rests on the shuffle algebra, the algebraic counterpart of Stratonovich… | Andi Han Team | ArXiv |
| 2026-06-03 | OLIVE: Online Low-Rank Incremental Learning for Efficient Adaptive Exoskeletons<br>Sim2Real<br>Wearable exoskeleton systems hold promise for restoring mobility in individuals with physical impairments, yet most existing controllers rely on static gait policies that lack the ability to adapt to dynamic real-world environments or individual user characteristics. We present OLIVE (Online Low-rank Incremental Learning for Efficient Adaptive… | Ying Nian Wu Team | ArXiv |
| 2026-06-02 | GPU-Parallel Multi-Task Reinforcement Learning with Demonstration Guided Policy Optimization<br>Manipulation Sim2Real<br>Large scale GPU-parallel reinforcement learning has changed what can be trained in robot simulation, yet most systems still optimize one specialist policy per task. We propose a construction methodology for turning structured manipulation task families into GPU-parallel multi-task RL benchmarks, and instantiate it as MT-Libero using LIBERO assets and task predicates in Isaac Lab. The resulting… | Weihua Zhang Team | ArXiv |
| 2026-06-02 | Self-Refining Agentic Reinforcement Learning for Vision-Conditioned UAV Navigation<br>Sim2Real<br>Deep reinforcement learning has shown strong potential for enabling autonomous robots to learn complex navigational tasks. However, its practical use still depends heavily on human designed reward functions and repeated manual fine tuning, which is time consuming and does not guarantee high success in the desired task. This paper presents AgenticRL, agent guided reinforcement learning framework… | Dzmitry Tsetserukou Team | ArXiv |
| 2026-06-02 | Speech Emotion Recognition using Attention-based LSTM-Network with Residual Connection<br>Sim2Real<br>Speech emotion recognition is an important component of modern human-computer interaction systems. However, many state-of-the-art approaches rely on large pretrained models with high computational and memory requirements, limiting their applicability. This paper proposes ResLSTM-SA, a lightweight architecture that integrates residual connections with soft attention within an LSTM-based framework…. | Maxim Vashkevich Team | ArXiv |
| 2026-06-02 | SplitAdapter: Load-Aware Humanoid Loco-Manipulation via Factorized Adaptation<br>Sim2Real LearnedControl<br>Humanoid loco-manipulation requires stable whole-body control under varying object masses and pickup/placement heights. This becomes particularly challenging in sim-to-real transfer, where object-induced load variation and robot-side dynamics mismatch interact during physical contact. Existing history-based adapters often compress these factors into a single latent representation, which can… | Donghan Koo Team | ArXiv |
| 2026-06-02 | AirDreamer: Generalist Drone Navigation with World Models<br>Sim2Real<br>Navigating a drone in unseen and cluttered environments requires reliable generalization to unseen scene layouts and understanding of environmental structure relative to the robot’s capabilities. Previous methods, which assume the same environment configuration, often rely heavily on human-designed perception pipelines and predefined rules to guide the robot toward the target. This process is… | Guyue Zhou Team | ArXiv |
| 2026-06-02 | Reinforcement Learning from Cross-domain Videos with Video Prediction Model<br>Sim2Real<br>Reinforcement learning from expert videos across visually distinct domains is challenging due to the absence of reward signals and the presence of domain gaps. We introduce XIPER (Cross-domain Video Prediction Reward), a reward model for learning from expert videos collected in a visually different domain, where the agent’s appearance differs due to factors such as color, morphology, or the… | Vincent François-Lavet Team | ArXiv |
| 2026-06-02 | Exact equivariance, kept through training, buys zero-shot generalisation across the symmetry group<br>Sim2Real<br>A latent world model built from an equivariant encoder $E$ and an equivariant predictor $f$ inherits a provable symmetry of its training loss: when the world’s dynamics genuinely carries a group $G$ acting on latents by an orthogonal representation $ρ(g)$, the one-step prediction relMSE is exactly invariant across the whole group, so fitting the dynamics on a restricted slice of orientations… | Hongbo Wang | ArXiv |
| 2026-06-02 | DLO-Lab: Benchmarking Deformable Linear Object Manipulations with Differentiable Physics<br>Sim2Real<br>We address the challenge of enabling robots to manipulate deformable linear objects (DLOs), such as ropes, cables, and rubber bands. Prior work has primarily focused on narrow, task-specific problems, often relying on real-world demonstrations or handcrafted heuristics. Such approaches, however, struggle to scale to the wide variety of materials and tasks encountered in practice, and collecting… | Chuang Gan Team | ArXiv / Web |
| 2026-06-01 | Bridging the Sim-to-Real Gap in Semiconductor Visual Program Synthesis via Input Binarization<br>Dexterous Sim2Real<br>Precise parametric control over circuit geometry is essential for semiconductor inspection, yet obtaining sufficient real training data remains costly. Although generative models such as diffusion models and Generative Adversarial Networks (GANs) can augment training data, they cannot guarantee the nanometer-scale geometric accuracy required for metrology tasks. We propose a visual program… | Tatsuya Sasaki Team | ArXiv |
| 2026-06-01 | Symmetry-Aware 9D Pose Estimation with Sim(3)-Consistent Feature and Spherical Inception Convolution<br>Dexterous Sim2Real<br>Object pose estimation is a fundamental problem for an agent system to perceive or manipulate objects in images or videos. However, current instance-level methods struggle with generalization to unseen objects. Category-level methods seek to address this, but remain constrained by the complexities of learning in the non-linear Sim(3) space and intra-class variations. To address these challenges,… | Naveed Akhtar Team | ArXiv |
| 2026-06-01 | A Simulation Platform for Flapping-Wing Vehicles<br>Sim2Real<br>Flapping-wing aerial vehicles (FWAVs) demonstrate remarkable agility but face substantial autonomy challenges due to their high sensitivity to aerodynamic disturbances and limited sensor payload capacity. Current simulation platforms typically rely on oversimplified laminar flow assumptions and idealized sensor models, failing to capture the complex turbulence patterns and perceptual limitations… | Tomi Westerlund Team | ArXiv |
| 2026-06-01 | A combination of noise and bilateral filters achieve supralinear and scalable adversarial robustness in CNNs<br>Sim2Real<br>The vulnerability of deep neural networks to adversarial examples poses a significant challenge for real-world deployment. Existing techniques to enhance deep network robustness rely on adversarial training, an approach that is powerful but computationally intensive and typically tailored to specific attack types. To address these limitations, existing works have explored techniques such as… | Pau Vilimelis Aceituno Team | ArXiv |
| 2026-06-01 | PINNOCHIO: Physics-Informed Neural Network for Coupled Hyperelastic Interface-Volume Simulation in Orthognathic Surgery<br>Sim2Real<br>Predicting patient-specific facial soft-tissue deformation is critical for iterative orthognathic surgery planning. However, current computational methods face a strict accuracy-efficiency trade-off: high-fidelity Finite Element Methods (FEM) are computationally prohibitive, whereas pure deep learning models often produce biomechanically inconsistent results. While Physics-Informed Neural… | Pingkun Yan Team | ArXiv |
| 2026-06-01 | SCOPE: Real-Time Natural Language Camera Agent at the Edge<br>Sim2Real<br>Deploying language-driven agents in robotics requires evaluations that reflect real-world task demands: natural-language instructions with reproducible outcomes. Such agents must connect language models to callable perception and control tools, and be assessed using deployment-critical metrics including latency, accuracy, and error modes. We present SCOPE (Simulation and Camera Operations for… | Pragyana Mishra Team | ArXiv / Web |
| 2026-06-01 | RESCAST-100K: A Comprehensive Dataset for Cross-Domain Residential Load and Indoor Temperature Forecasting<br>Sim2Real<br>Accurate short-term forecasting of residential energy load and indoor temperature is essential for home energy management systems, grid-level demand response, and community energy efficiency efforts. Domain adaptation and transfer learning have shown promise for improving forecasting accuracy under data heterogeneity and scarcity commonly seen in residential settings. However, progress is limited… | Simone Silvestri Team | ArXiv |
| 2026-05-31 | Crazyflow: An Accurate, GPU-Accelerated, Differentiable Drone Simulator in JAX<br>Sim2Real<br>High-quality, large-scale synthetic data from simulations is becoming a cornerstone for pushing the capabilities of robot algorithms. While aerial robotics simulators have evolved to support specialized needs such as fidelity, differentiability, and swarms independently, a unified platform that can synthesize data across all these domains is missing. In this work, we propose Crazyflow, a… | Angela P. Schoellig Team | ArXiv |
| 2026-05-31 | Chirality-free photon routing via giant atoms in waveguide QED ladders<br>Sim2Real<br>In this paper, we present an in-depth examination of single-photon routing in a multi-emitter waveguide quantum electrodynamics ladder with up to five giant atoms simultaneously coupled across two linear waveguides. Using a real-space approach, we analyze a non-chiral routing architecture and evaluate the impact of scaling the number of giant atoms in three topologically distinct configurations:… | Imran M. Mirza Team | ArXiv |
| 2026-05-31 | PSF-like Alpha-Particle Events in LSST Images<br>Sim2Real<br>Rare $α$-particle-induced charge clusters appear in LSST images as compact, PSF-like sources with a median FWHM of $0.\!\!^{\prime\prime}95$ and median ellipticity consistent with zero, closely resembling unresolved astrophysical point sources. These events are detected in both dark and science exposures at a rate of approximately $10^{-12}\ \mathrm{pixel}^{-1}\ \mathrm{s}^{-1}$. Their collected… | Eli S. Rykoff Team | ArXiv |
| 2026-05-30 | Global-Local Attention Decomposition for Terrain Encoding in Humanoid Perceptive Locomotion<br>Sim2Real<br>Although reinforcement learning has significantly advanced humanoid locomotion, perceptive policies still struggle on sparse-foothold terrain and constrained environments. Success in these scenarios requires both broad terrain awareness and precise foothold selection, two perceptual roles that conventional encoders often entangle. To address this challenge, we propose Global-Local Attention… | Yue Gao Team | ArXiv |
| 2026-05-30 | A Four-Tier Communication Architecture and Sim-to-Real Validation of a Graphical Open-Source Platform for Robotic Engineering Education<br>Sim2Real<br>The persistent challenge in scaling authentic manipulator education within university laboratories is a structural dichotomy: commercial digital twins are often cost-prohibitive and rigidly scripted, whereas open-source robotics middleware (ROS) imposes steep technical and syntax barriers for novices. To resolve this logistical and educational friction, this Work-in-Progress (WiP) paper proposes… | Jiong Jin Team | ArXiv |
| 2026-05-30 | Too Much of a Good Thing: When sim2real Efforts Impede Policy Learning (And What to Do About It)<br>Sim2Real<br>While sim2real efforts are necessary for effective policy transfer to hardware, there is such a thing as too much of a good thing. We argue that sim2real efforts have led to misaligned incentives with policy learning, resulting in simulator lock in and poor policy exploration due to the unreasonable constraints imposed by the real world. We offer a diagnosis and explanation of the current status… | Luis Sentis Team | ArXiv |
| 2026-05-29 | RDGen: Demonstration Generation for High-Quality Robot Learning via Reinforcement Learning<br>VLA Sim2Real<br>Vision-Language-Action (VLA) models have emerged as a promising paradigm for general-purpose robot control. However, their performance remains fundamentally constrained by the availability of high-quality robot trajectory data. In current robot learning practice, such data are primarily collected through human teleoperation, which is labor-intensive, costly, and difficult to scale. In this paper,… | Xinhai Sun Team | ArXiv |
| 2026-05-29 | Feat2Go: Visual Feature-Grounded Value Estimation for Embodied Reinforcement Learning<br>VLA Sim2Real<br>Reinforcement learning is a promising approach for improving the capabilities of vision-language-action (VLA) models while avoiding the heavy data requirements of imitation learning. However, its effectiveness for VLA models is often constrained by sparse supervision and the difficulty of designing informative reward signals for long-horizon manipulation. In this work, we present Feat2Go, a… | Yongtao Wang Team | ArXiv |
| 2026-05-29 | Learning Controlled Separation of Small Objects Between Two Fingers with a Tactile Skin<br>Sim2Real Tactile<br>We introduce and solve the novel task of controlled separation of small objects with two fingers of a multi-purpose robotic hand: after grasping into a box of small objects, the task is to drop as many of them until a desired number remains between the fingers. The objects are small compared to the width of the fingers but also in absolute terms. In our case little pellets with a diameter of only… | Berthold Bäuml Team | ArXiv |
| 2026-05-29 | Batched Differentiable Rigid Body Dynamics in PyTorch for GPU-Accelerated Robot Learning<br>Sim2Real<br>As robot control shifts toward large-scale reinforcement learning with in-loop dynamics computation, the community’s reliance on CPU-bound libraries such as Pinocchio creates a throughput bottleneck in GPU-based training pipelines. We present BARD (Batched Articulated Rigid-body Dynamics), a self-contained PyTorch implementation of Featherstone’s rigid-body dynamics algorithms, optimized for… | Zhaoxing Li Team | ArXiv |
| 2026-05-29 | Dirac-Phase CP-Violation in the Low-Scale Type-I Seesaw with Three Right-Handed Neutrinos<br>Sim2Real<br>We study the low-scale type-I seesaw with three right-handed neutrinos (i.e. heavy Majorana neutrinos) when the CP-violation arises solely from the low-energy Dirac phase $δ$ of the Pontecorvo-Maki-Nakagawa-Sakata (PMNS) neutrino mixing matrix and the heavy neutrinos have testable mixings. We derive a CP-conserving and non-real structure of the $3\times 3$ orthogonal matrix entering the… | S. T. Petcov Team | ArXiv |
| 2026-05-29 | Scaling Multi-Hop Training Data via Graph-Constrained Path Selection Sim2RealEndowing large language models with compositional reasoning over specialized documents requires multi-hop training data at scale, where such data rarely exists outside of curated benchmarks built on structured sources. To construct it directly from plain, unannotated text, existing methods ask a single teacher model to jointly discover an evidence path through a document and verbalize it as a… |
Yike Guo Team | ArXiv |
| 2026-05-29 | Robust class-gated single-pixel diffractive optical neural network with random-aberration-aware training Sim2RealOptical computing offers the theoretical potential for high-speed, energy-efficient inference, yet its practical deployment remains constrained by fundamental input-output bottlenecks, particularly the reliance on electronic sensors with limited frame rates and stringent alignment requirements between optical components. Here, we demonstrate an image-class-gated single-pixel DONN that overcomes… |
Jun-Jun Xiao Team | ArXiv |
| 2026-05-29 | TALON: Token-Aligned Lightweight Adapters for 6-DoF Spacecraft Pose Estimation Sim2RealMonocular 6-DoF spacecraft pose estimation methods predominantly process individual frames, discarding the temporal information present in an image sequence acquired during spacecraft manoeuvres. Few temporal approaches require full backbone fine-tuning or auxiliary optical flow networks, risking catastrophic forgetting or increasing computational cost, respectively. We propose TALON… |
Djamila Aouada Team | ArXiv |
| 2026-05-29 | QVGGT: Post-Training Quantized Visual Geometry Grounded Transformer Sim2RealEstimating 3D attributes directly from images has advanced rapidly with the Visual Geometry Grounded Transformer (VGGT), which predicts camera parameters, depth maps, and point clouds in a single forward pass. However, its 1.2B-parameter scale severely limits deployment on resource-constrained platforms such as UAVs and mobile AR devices. To address this limitation, we introduce QVGGT, a tailored… |
Huan Wang Team | ArXiv / Web |
| 2026-05-29 | Imaging the Magnetically Driven Reconstruction of the Electronic States in the Antiferromagnetic Topological Insulator EuSn$_2$As$_2$ Sim2RealThe realization of the axion insulator phase in magnetic topological insulators is often hindered by crystalline symmetries that protect gapless surface states, even when time-reversal symmetry is broken. Here, we use variable-temperature scanning tunneling microscopy (STM) and spectroscopy (STS), complemented with density functional theory (DFT), to investigate the local electronic structure of… |
Pegor Aynajian Team | ArXiv |
| 2026-05-29 | DRL-Based Pose Control for Double-Ackermann Robots Under Actuation Uncertainties Sim2RealRobust deployment of deep reinforcement learning (DRL) policies on real robots remains challenging due to discrepancies between simulation and real-world dynamics. We address this issue in the context of maneuvering with double-Ackermann-steering mobile robots, which introduce additional constraints due to their non-holonomic nature. Building upon the DRL framework ManeuverNet, we extend its… |
Olivier Ly Team | ArXiv |
| 2026-05-28 | YoCausal: How Far is Video Generation from World Model? A Causality Perspective HF-Hot Sim2RealAs video diffusion models (VDMs) advance toward world models, a key question arises: do they truly understand causality, or merely overfit to statistical temporal patterns? Existing benchmarks mostly rely on synthetic data, limiting real-world generalization due to the sim-to-real gap. We present YoCausal, a two-level benchmark inspired by the Violation of Expectation (VoE) paradigm from… |
Zhixiang Wang Team | ArXiv |
| 2026-05-28 | VE2VF: Vision-Enabled to Vision-Free Distillation via Real-world Reinforcement Learning for Robust Contact-Rich Manipulation Manipulation Sim2RealWhen using reinforcement learning (RL) for contact-rich robotic manipulation, vision can provide task-relevant information that accelerates learning beyond what proprioception alone can achieve. However, vision-enabled policies tend to overfit to the visual conditions seen during training, limiting their robustness and transferability. We present a human-in-the-loop RL framework that employs… |
Dongheui Lee Team | ArXiv |
| 2026-05-28 | PhAIL: A Real-Robot VLA Benchmark and Distributional Methodology VLA Sim2RealReal-world evaluation of vision-language-action (VLA) policies still rests on binary success rate at a fixed timeout with $N \le 25$ rollouts per condition, almost always without confidence intervals or paired statistical comparison; these cohort sizes struggle to resolve close comparisons reliably. We introduce PhAIL (Physical AI Leaderboard, https://phail.ai), an open real-robot benchmark on a… |
Sergey Arkhangelskiy | ArXiv / Web |
| 2026-05-28 | Prior Availability in Industrial Visual Sim-to-Real: A Review of CAD-Guided and CAD-Unavailable Regimes Sim2Real HF-HotIndustrial visual sim-to-real is often described as transferring from synthetic images to real images, but industrial deployment usually involves a broader mismatch between available evidence and required decisions. A system may be built from CAD renderings, simulated RGB-D observations, normal reference images, synthetic defects, pretrained feature spaces, or language prompts, yet deployed under… |
Seung-Kyum Choi Team | ArXiv |
| 2026-05-28 | CoMo3R-SLAM: Collaborative Monocular Dense SLAM with Learned 3D Reconstruction Priors for Outdoor Multi-Agent Systems Sim2RealCollaborative dense SLAM is essential for multi-robot teams to achieve scalable and consistent 3D perception across large-scale outdoor environments. Existing systems typically depend on depth sensors, incurring significant payload, power, and calibration costs. Monocular RGB cameras are a lightweight alternative, but collaborative monocular dense SLAM remains difficult due to scale ambiguity,… |
Baoru Huang Team | ArXiv |
| 2026-05-28 | Battery-Sim-Agent: Leveraging LLM-Agent for Inverse Battery Parameter Estimation Sim2RealParameterizing high-fidelity “digital twins” of batteries is a critical yet challenging inverse problem that hinders the pace of battery innovation. Prevailing methods formulate this as a black-box optimization (BBO) task, employing algorithms that are sample-inefficient and blind to the underlying physics. In this work, we introduce a new paradigm that reframes the inverse problem as a reasoning… |
Jiang Bian Team | ArXiv |
| 2026-05-27 | Beyond Binary: Sim-to-Real Dexterous Manipulation with Physics-Grounded Contact Representation Dexterous Manipulation Sim2RealA primary bottleneck in contact-rich manipulation is the difficulty of collecting real-world data. Sim-to-real reinforcement learning offers a scalable alternative, but the simulation-reality gap prevents information-dense modalities like touch from being effectively used. Existing sim-to-real methods often mitigate this gap by simplifying tactile data into coarse low-dimensional features –… |
Toru Lin Team | ArXiv / Web |
| 2026-05-27 | The Best-Laid SCHEMEs: Coordinated Sabotage and Monitoring in Multi-Agent Systems Sim2RealAs agentic coding systems decompose work across multiple model instances, a critical safety question is whether those instances can coordinate to achieve a hidden malicious objective while remaining aligned with user intent. We introduce SCHEME, a benchmark of 17 task instances across 7 settings and 8 real open-source libraries, each pairing a legitimate software-engineering task with a covert… |
Pablo Bernabeu-Pérez Team | ArXiv |
| 2026-05-27 | Bridging the Sim-to-Real Gap in Reinforcement Learning-Based Industrial Dispatching through Execution Semantics Sim2RealEvent-driven scheduling policies are increasingly deployed in industrial environments, where decisions are made under asynchronous and partially observed system states. As a result, decision states are not temporally consistent, action admissibility is not explicitly defined, and the origin of execution errors remains ambiguous. These issues limit both reliability and interpretability. To… |
Noah Klarmann Team | ArXiv |
| 2026-05-27 | SPRINT: Efficient Spectral Priors for Humanoid Athletic Sprints Sim2RealThe pursuit of humanoid athletic sprints is hindered by a scarcity of humanoid-viable kinematic reference data and the inability of existing frameworks to maintain stability during sprints. To overcome these limitations, we introduce SPRINT, a novel framework driven by efficient, frequency-adaptive spectral priors. By characterizing the fundamental periodicity of human locomotion in the frequency… |
Huimin Lu Team | ArXiv |
| 2026-05-27 | POINav: Benchmarking and Enhancing Final-Meters Arrival in Real-World Vision-Language Navigation Sim2RealReal-world navigation is fundamentally driven by Points of Interest (POIs), yet reaching a precise POI remains a critical “final-meters” challenge. Existing Vision-Language Navigation (VLN) benchmarks of POI-goal navigation often suffer from coarse granularity or significant sim-to-real gaps due to generated scenes. To bridge this gap, we present POINav-Bench, the first benchmark designed for… |
Mu Xu Team | ArXiv |
| 2026-05-26 | Transferable Reinforcement Learning via Probabilistic Latent Embeddings and Dynamic Policy Adaptation for Sim-to-Real Deployment Sim2RealDue to limited resources and public safety concerns, deep reinforcement learning (RL) agents for many cyber-physical systems (e.g., autonomous vehicles) are first trained in simulators. However, when deployed in real world environments, they often suffer from performance degradation or safety violations because of the inevitable Sim2Real gap. Existing zero-shot approaches, such as robust safe RL… |
Yiheng Feng Team | ArXiv |
| 2026-05-25 | VesselSim: learning 3D blood vessel segmentation without expert annotations Sim2RealBlood vessel segmentation is a core task in medical image analysis for the care of vascular diseases and surgical planning, yet the challenges of providing expert vascular annotations pose a major obstacle for the progress of related deep learning techniques. To address this, we propose VesselSim, a two-stage framework for universal 3D blood vessel segmentation that eliminates the need for real… |
Yiming Xiao Team | ArXiv |
| 2026-05-24 | MuJoCoUni: Persistent Batched Runtime Primitives for MuJoCo Sim2RealWe present MuJoCoUni, a downstream MuJoCo distribution for online robot learning and batched physics evaluation. Alongside the open-loop batched trajectory generation already provided by upstream mujoco.rollout, MuJoCoUni supplies runtime primitives for stateful environment execution. The target workloads need high-throughput parallel execution while retaining upstream CPU MuJoCo semantics for… |
Junzhe Wu Team | ArXiv |
| 2026-05-23 | Vision-Guided Outdoor Flight and Obstacle Evasion via Reinforcement Learning Sim2RealAlthough quadcopters boast impressive traversal capabilities enabled by their omnidirectional maneuverability, the need for continuous pilot control in complex environments impedes their application in GNSS and telemetry-denied scenarios. To this end, we propose a novel sensorimotor policy that uses stereo-vision depth and visual-inertial odometry (VIO) to autonomously navigate through obstacles… |
Avideh Zakhor Team | ArXiv |
| 2026-05-22 | Robotic Strawberry Harvesting with Robust Vision and Deep Reinforcement Learning based Sim-to-Real Control Sim2RealThis study presents a closed-loop robotic strawberry harvesting system that combines a robust vision module, simulation-trained deep reinforcement learning (DRL) control, and ROS-based real-robot execution. For perception, we propose HRAttnEdge-YOLO26-seg, a modified YOLO26-seg architecture that incorporates a high-resolution P2 branch, segmentation-path attention, and edge-supervised prototype… |
Azlan Zahid Team | ArXiv |
| 2026-05-21 | CoRMA: Contrastive RMA for Contact-Rich Meta-Adaptation Sim2RealWe present CoRMA (Contrastive Robotic Motor Adaptation), a context-based meta-adaptation framework that modifies RMA for force-dominant assembly. CoRMA replaces raw simulator-parameter adaptation with a compact 6D simulator-only semantic contact context describing contact onset, lateral engagement, guided transition, contact direction, and jamming. A deployable causal Transformer adapter infers… |
Jianqiao Zhu Team | ArXiv |
| 2026-05-15 | DexJoCo: A Benchmark and Toolkit for Task-Oriented Dexterous Manipulation on MuJoCo HF-Hot Sim2RealAchieving human-level manipulation requires dexterous robotic hands capable of complex object interactions. Advancing such capabilities further demands standardized benchmarks for systematic evaluation. However, existing dexterous benchmarks lack tasks that reflect the unique manipulation capabilities of dexterous hands over parallel grippers, as well as comprehensive evaluation pipelines. In… |
Tieniu Tan Team | ArXiv |
| 2026-05-15 | Adaptive Outer-Loop Control of Quadrotors via Reinforcement Learning Sim2RealDeep Reinforcement Learning (DRL) for quadrotor flight control typically relies on Domain Randomization (DR) for sim-to-real transfer, resulting in overly conservative policies that struggle with dynamic disturbances. To overcome this, we propose a novel adaptive control architecture that actively perceives and reacts to instantaneous perturbations. First, we train an optimal outer-loop policy,… |
Moble Benedict Team | ArXiv |
| 2026-05-15 | parallelcbf: A composable safety-filter and auditability framework for tensor-parallel reinforcement learning Sim2RealWhile Isaac Lab provides massive parallel UAV simulation, OmniSafe and safe-control-gym provide constrained-RL benchmarks, and CBFKit provides control-barrier-function synthesis tooling, no existing framework unifies these capabilities for end-to-end safety-constrained training. ParallelCBF is the first framework to unify (i) tensor-parallel UAV environments, (ii) hard-gate CBF safety filters,… |
Yuyin Ma Team | ArXiv |
| 2026-05-12 | NavOL: Navigation Policy with Online Imitation Learning Manipulation Sim2RealLearning robust navigation policies remains a core challenge in robotics. Offline imitation learning suffers from distribution shift and compounding errors at rollout, while reinforcement learning requires reward engineering and learns inefficiently. In this paper, we propose NavOL, an online imitation learning paradigm that interacts with a simulator and updates itself using expert… |
Li Zhang Team | ArXiv / Web |
| 2026-05-12 | When Simulation Lies: A Sim-to-Real Benchmark and Domain-Randomized RL Recipe for Tool-Use Agents Sim2RealTool-use language agents are evaluated on benchmarks that assume clean inputs, unambiguous tool registries, and reliable APIs. Real deployments violate all these assumptions: user typos propagate into hallucinated tool names, a misconfigured request timeout can stall an agent indefinitely, and duplicate tool names across servers can freeze an SDK. We study these failures as a sim-to-real gap in… |
Xiyang Hu Team | ArXiv / Web |
| 2026-05-10 | Zero-Shot Sim-to-Real Robot Learning: A Dexterous Manipulation Study on Reactive Catching Dexterous Manipulation Sim2Real HF-HotDexterous manipulation is physics-intensive and highly sensitive to modeling errors and perception noise, making sim-to-real transfer prohibitively challenging. Domain randomization (DR) is commonly used to improve the robustness of learned policies for such tasks, but conventional DR randomizes one instance per episode, offering very limited exposure to the variability of real-world dynamics. To… |
Kaiyu Hang Team | ArXiv |
| 2026-05-08 | TimeLesSeg: Unified Contrast-Agnostic Cross-Sectional and Longitudinal MS Lesion Segmentation via a Stochastic Generative Model Sim2RealMultiple sclerosis (MS) expresses substantial clinical and radiological heterogeneity, which poses significant challenges for automatic lesion segmentation. The current deep learning-based SOTA is highly susceptible to changes in both distribution, e.g., changes in scanner; as well as the structure of inputs, evident in the current divide between cross-sectional and longitudinal approaches. We… |
Ferran Prados Team | ArXiv |
| 2026-05-08 | SceneFactory: GPU-Accelerated Multi-Agent Driving Simulation with Physics-Based Vehicle Dynamics Sim2RealAutonomous-driving simulators typically trade physical fidelity for scalable parallelism. Physics-based platforms such as CARLA and MetaDrive provide articulated vehicle dynamics and contact, but their non-vectorized interfaces make batched training difficult. GPU-batched systems such as Waymax and GPUDrive scale to hundreds of scenarios by replacing rigid-body physics with simplified kinematic… |
Zilin Bian Team | ArXiv |
| 2026-05-05 | On Surprising Effects of Risk-Aware Domain Randomization for Contact-Rich Sampling-based Predictive Control Sim2RealDomain randomization (DR) is widely used in policy learning to improve robustness to modeling error, but remains underexplored in contact-rich sampling-based predictive control (SPC), where rollout quality is highly sensitive to uncertainty. In this work, we take the first step by studying risk-aware DR in predictive sampling on a simple yet representative Push-T task, comparing average,… |
Aaron D. Ames Team | ArXiv |
| 2026-05-04 | SIAM: Head and Brain MRI Segmentation from Few High-Quality Templates via Synthetic Training Sim2RealSynthetic training has recently advanced brain MRI segmentation by enabling contrast-agnostic models trained entirely on generated data. However, most existing approaches rely on hundreds of automatically labeled templates, introducing systematic biases and limiting their flexibility to incorporate new anatomical structures. We present the Segment It All Model (SIAM), a 3D whole-head segmentation… |
Reuben Dorent Team | ArXiv |
| 2026-05-04 | Sim-to-Real Transfer and Robustness Evaluation of Reinforcement Learning Control with Integrated Perception on an ASV for Floating Waste Capture Sim2RealAutonomous surface vessels for floating-waste removal operate under varying hydrodynamics, external disturbances, and challenging water-surface perception. We present a field-validated system that combines camera-based polarimetric perception with a lightweight DRL-based controller for floating-waste detection and capture. Camera detections are converted into water-surface target points and… |
Cédric Pradalier Team | ArXiv |
| 2026-05-03 | DexSim2Real: Foundation Model-Guided Sim-to-Real Transfer for Generalizable Dexterous Manipulation Dexterous Sim2RealSim-to-real transfer remains a critical bottleneck for deploying dexterous manipulation policies learned in simulation to real-world robots. Existing approaches rely on manually designed domain randomization or task-specific adaptation, limiting their generalizability across diverse manipulation scenarios. We present DexSim2Real, an integrated framework that leverages vision-language foundation… |
Yuhao Liao Team | ArXiv |
| 2026-05-01 | Learning to Race in Minutes: Infoprop Dyna on the Mini Wheelbot Sim2RealReinforcement Learning (RL) has the potential to enable robots with fast, nonlinear, and unstable dynamics to reach the limits of their performance. However, most recent advances rely on carefully designed physics-based simulators and domain randomization to achieve successful sim-to-real transfer within reasonable wall-clock time. In this work, we bypass the need for such simulators and… |
Sebastian Trimpe Team | ArXiv |
| 2026-04-27 | AsyncShield: A Plug-and-Play Edge Adapter for Asynchronous Cloud-based VLA Navigation Sim2RealWhile Vision-Language-Action (VLA) models have been shown to possess strong zero-shot generalization for robot control, their massive parameter sizes typically necessitate cloud-based deployment. However, cloud deployment introduces network jitter and inference latency, which can induce severe spatiotemporal misalignment in mobile navigation under continuous displacement, so that the stale… |
Mu Xu Team | ArXiv |
| 2026-04-26 | Unleashing the Agility of Wheeled-Legged Robots for High-Dynamic Reflexive Obstacle Evasion Sim2RealWheeled-legged robots combine the energy efficiency of wheeled locomotion with the terrain adaptability of legged systems, making them promising platforms for agile mobility in complex and dynamic environments. However, enabling high-dynamic reflexive evasion against fast-moving obstacles remains challenging due to the hybrid morphology, mode coupling, and non-holonomic constraints of such… |
Ce Hao Team | ArXiv |
| 2026-04-25 | GreenDyGNN: Runtime-Adaptive Energy-Efficient Communication for Distributed GNN Training Sim2RealDistributed GNN training is dominated by remote feature fetching, which can be very costly. Multi-hop neighborhood sampling crosses partition boundaries and triggers fine-grained RPCs whose fixed initiation cost and GPU-stall latency waste energy. Prior systems try to reduce this overhead with presampling and static caching, but cache policies cannot react to runtime network variation. We show… |
M. S. Q. Zulkar Nine Team | ArXiv |
| 2026-04-23 | Cross-Domain Data Selection and Augmentation for Automatic Compliance Detection Sim2RealAutomating the detection of regulatory compliance remains a challenging task due to the complexity and variability of legal texts. Models trained on one regulation often fail to generalise to others. This limitation underscores the need for principled methods to improve cross-domain transfer. We study data selection as a strategy to mitigate negative transfer in compliance detection framed as a… |
Dusica Marijan Team | ArXiv |