LearnedControl

Updated on 2026.06.11

Publish Date	Title & Abstract	Authors	Links
2026-06-09	OMG: Omni-Modal Motion Generation for Generalist Humanoid Control `LearnedControl` Humanoid whole-body control has made significant progress in recent years, yet existing approaches remain limited to few-skill policies with heavy reward engineering, or motion trackers that are difficult to extend to new input modalities. We argue that the key to general-purpose humanoid control is to build a scalable brain, a module capable of reasoning with diverse conditioning modalities,…	Hang Zhao Team	ArXiv / Web
2026-06-08	MotionWAM: Towards Foundation World Action Models for Real-Time Humanoid Loco-Manipulation `VLA` `LearnedControl` World Action Models (WAMs) couple a video dynamics prior to the policy and have shown encouraging results on tabletop manipulation, but iterative denoising over high-dimensional video-action latents leaves them too slow for real-time humanoid loco-manipulation. The problem is compounded by the dominant hierarchical paradigm, in which a high-level manipulation policy controls only the upper body…	Junwei Liang Team	ArXiv
2026-06-07	OASIS: From Simulation Data Collection to Real-World Humanoid Loco-Manipulation `Sim2Real` `LearnedControl` Recent progress in robot manipulation has been largely driven by learning from large-scale demonstrations. For humanoid robot loco-manipulation tasks, however, existing data sources force an unsatisfying tradeoff between trajectory quality and scalability. Real-world teleoperation provides the highest-quality trajectories but requires dedicated physical space and time-consuming scene resets….	Xuelong Li Team	ArXiv / Web
2026-06-06	SIMPLE: Simulation-Based Policy Learning and Evaluation for Humanoid Loco-manipulation `LearnedControl` Humanoid foundation models are advancing faster than we can evaluate them. While real-world testing is expensive and difficult to reproduce, existing simulation benchmarks focus primarily on table-top or wheeled robots. A scalable and reproducible benchmark for whole-body humanoid loco-manipulation remains an open problem. To this end, we present SIMPLE, a unified simulation testbed for humanoid…	Yue Wang Team	ArXiv
2026-06-06	Mind Your Steps: A General Learning Framework for Accurate Humanoid Foothold Tracking `LearnedControl` Enabling humanoid robots to operate in complex, dynamic environments remains a critical challenge, fundamentally limited by the ability to navigate robustly, safely, and accurately. While reinforcement learning with velocity-commanded policies has achieved remarkable robustness in humanoid locomotion, this approach lacks explicit control of the foothold placement, leading to unsafe behavior, such…	Jan Peters Team	ArXiv
2026-06-06	Perceptive Behavior Foundation Model: Adapting Human Motion Priors to Robot-Centric Terrain `LearnedControl` Humanoid behavior foundation models aim to acquire reusable whole-body control policies from broad human motion priors, enabling a single controller to produce diverse and expressive behaviors. However, existing motion-centric foundation policies largely assume that the reference motion is already physically compatible with the robot’s surroundings. This assumption breaks when the demonstrator,…	Junwei Liang Team	ArXiv
2026-06-06	X-OP: Cross-Morphology Whole-Body Teleoperation via MPC Retargeting `LearnedControl` Whole-body teleoperation is essential for scalable robot data collection in loco-manipulation tasks, yet existing approaches relying on exoskeleton suits or multi-camera setups impose prohibitive cost, complexity, and environmental constraints. Recent methods using a single extended reality (XR) device with end-to-end reinforcement learning policies partially address these limitations but require…	Nicholas Morozovsky Team	ArXiv
2026-06-04	HANDOFF: Humanoid Agentic Task-Space Whole-Body Control via Distilled Complementary Teachers `Dexterous` `LearnedControl` For a humanoid robot to be deployed in the real world, the choice of command space (i.e., the interface between task planning and whole-body control) is crucial. Existing whole-body controllers typically demand dense kinematic or spatial references that planners struggle to synthesize from task semantics. We instead propose a compact, explicit interface that is intuitive, general, modular, and…	Aaron Ames Team	ArXiv
2026-06-04	MotionDisco: Motion Discovery for Extreme Humanoid Loco-Manipulation `Dexterous` `Manipulation` `LearnedControl` We present MotionDisco, a framework that discovers contact-rich, long-horizon humanoid loco-manipulation motions from scratch, without relying on teleoperation or motion retargeting from human demonstrations. This is challenging because the space of possible contact interactions grows combinatorially with the task horizon and the number of objects in the scene. MotionDisco enables rapid discovery…	Majid Khadiv Team	ArXiv
2026-06-03	GRAIL: Generating Humanoid Loco-Manipulation from 3D Assets and Video Priors `Dexterous` `Sim2Real` `LearnedControl` Scaling humanoid loco-manipulation requires robot-compatible demonstrations across diverse objects, whole-body motions, and scene geometries, but teleoperation and motion capture are difficult to scale because each collection depends on physical setups, instrumented actors, and robot operation. We present GRAIL, a digital generation pipeline that remains fully virtual until deployment: it…	Ye Yuan Team	ArXiv / Web
2026-06-03	M3imic: Learning a Versatile Whole-Body Controller for Multimodal Motion Mimicking `Dexterous` `Sim2Real` `LearnedControl` Building a general-purpose whole-body controller is essential for enabling diverse motion capabilities in humanoid robots across a wide range of downstream tasks, including locomotion and loco-manipulation. Different tasks rely on distinct motion reference modalities: locomotion primarily depends on coordinated robot joint trajectories, whereas manipulation requires precise end-effector…	Shengbo Eben Li Team	ArXiv
2026-06-02	SplitAdapter: Load-Aware Humanoid Loco-Manipulation via Factorized Adaptation `Sim2Real` `LearnedControl` Humanoid loco-manipulation requires stable whole-body control under varying object masses and pickup/placement heights. This becomes particularly challenging in sim-to-real transfer, where object-induced load variation and robot-side dynamics mismatch interact during physical contact. Existing history-based adapters often compress these factors into a single latent representation, which can…	Donghan Koo Team	ArXiv
2026-06-02	Humanoid-GPT: Scaling Data and Structure for Zero-Shot Motion Tracking `LearnedControl` We introduce Humanoid-GPT, a GPT-style Transformer with causal attention trained on a billion-scale motion corpus for whole-body control. Unlike prior shallow MLP trackers constrained by scarce data and an agility-generalization trade-off, Humanoid-GPT is pre-trained on a 2B-frame retargeted corpus that unifies all major mocap datasets with large-scale in-house recordings. Scaling both data and…	Li Yi Team	ArXiv
2026-06-02	Bionic Human-Motion Style Transfer for Physically Executable Whole-Body Control of Humanoid Robots `LearnedControl` Expressive whole-body motion is important for humanoid robots operating in human environments, where robots are expected to move stably while presenting readable and adjustable body behaviors. However, most expressive motions are still obtained from fixed demonstrations or manually designed scripts, making it difficult to reuse a demonstrated style across different motion contents. Inspired by…	Shiwu Zhang Team	ArXiv / Web
2026-06-02	Dual Advantage Fields `LearnedControl` Offline goal-conditioned reinforcement learning requires both long-horizon reachability estimates and local action comparisons. Dual goal representations provide value fields that capture global goal reachability, but they do not directly specify which action should be preferred at a given state. We propose Dual Advantage Fields, a policy-extraction method that turns a bilinear dual value model…	Arip Asadulaev Team	ArXiv
2026-05-31	LEGS: Fine-Tuning Teleop-Free VLAs for Humanoid Loco-manipulation in an Embodied Gaussian Splatting World `VLA` `LearnedControl` Training vision-language-action (VLA) policies for humanoid loco-manipulation is constrained by the high cost and complexity of collecting human teleoperation demonstrations. VLA policies fine-tuned in simulators have, until now, failed to transfer effectively in humanoid loco-manipulation tasks. We present LEGS (Loco-manipulation via Embodied Gaussian Splatting), a hybrid simulator that…	Mac Schwager Team	ArXiv / Web
2026-05-31	S2M-Trek: From Single to Multi-Sphere Transport via Per-Frame Deep Sets on a Wheel-Legged Robot `LearnedControl` We study the problem of scaling dynamic loco-manipulation from a single free-rolling sphere to multiple spheres transported simultaneously on the back of a wheel-legged quadruped, without fences, grippers, or mechanical stops. Multiple identical free-rolling spheres form an unordered set with no persistent identity: their ordering may change independently at each history frame, creating a…	Yiqun Li Team	ArXiv
2026-05-29	Learning Terrain-Aware Whole-Body Control for Perceptive Legged Loco-Manipulation `LearnedControl` Legged manipulators integrate exceptional terrain adaptability along with mobile manipulation capabilities, which make them highly promising for deployment in human-centric environments. By coordinating the control of both legs and arms, a whole-body controller can significantly expand the operational workspace of legged manipulators. However, many existing whole-body controllers primarily depend…	Jun Ma Team	ArXiv
2026-05-29	HOIST: Humanoid Optimization with Imitation and Sample-efficient Tuning for Manipulating Suspended Loads `VLA` `LearnedControl` Manipulating suspended payloads with humanoid robots is challenging because the robot can only influence an underactuated, oscillatory load through whole-body motion and intermittent contact. Imitation learning provides safe initial behavior but does not directly optimize final placement, while reinforcement learning from scratch is unsafe and sample-inefficient on real humanoids. We present…	Shuai Li Team	ArXiv
2026-05-28	World Models: A Comprehensive Survey of Architectures, Methodologies, Reasoning Paradigms, and Applications `LearnedControl` World models, internal simulators that learn the structure and dynamics of an environment, have emerged as a central paradigm in the pursuit of artificial general intelligence, enabling agents to predict, plan, and reason within learned representations. Despite rapid progress across reinforcement learning, robotics, autonomous driving, and video generation, the field lacks a unified framework…	Wei Zhang Team	ArXiv
2026-05-26	HumanoidMimicGen: Data Generation for Loco-Manipulation via Whole-Body Planning `LearnedControl` Imitation learning is a promising approach for training humanoid robots to both walk and manipulate, but it requires a large number of demonstrations, which are time-intensive and difficult to collect via teleoperation. Existing data-generation algorithms can automatically synthesize demonstrations for manipulators, but they are ineffective on humanoids because their high-dimensional composite…	Yuke Zhu Team	ArXiv / Web
2026-05-25	Safety-Critical Whole-Body Control for Humanoid Robots via Input-to-State Safe Control Barrier Functions `LearnedControl` Safety-critical control is essential for humanoid robots operating in complex human-centered environments, where physical safety constraints such as joint limits, self-collision avoidance, obstacle avoidance, and workspace boundaries must be satisfied during real-robot operation. However, existing approaches remain limited because kinematic safety guarantees can be degraded in the presence of…	Jaeheung Park Team	ArXiv
2026-05-25	FOUND-IT: Foundation-model-first Task-driven 3D Scene Graphs with Granularity on Demand `LearnedControl` We present the first approach to build hierarchical task-driven 3D scene graphs of arbitrary indoor or outdoor environments using an uncalibrated monocular camera in real-time. We leverage geometric foundation models to estimate geometric attributes of the scene graph (e.g., object bounding boxes), but we also observe that traversability information (the “places” layer of a scene graph) can be…	Luca Carlone Team	ArXiv
2026-05-22	Any2Any: Efficient Cross-Embodiment Transfer for Humanoid Whole-Body Tracking `LearnedControl` Whole-body tracking (WBT) models have become a key foundation for humanoid robots, enabling them to imitate diverse motions with high fidelity. Training such models from scratch requires large-scale data and computation, making rapid deployment on new humanoid platforms costly. This raises a natural question: Can pretrained WBT models transfer across embodiments with minimal adaptation? To answer…	Hua Chen Team	ArXiv
2026-05-20	Humanoid Whole-Body Manipulation via Active Spatial Brain and Generalizable Action Cerebellum `LearnedControl` In this paper, we explore spatial-aware humanoid whole-body manipulation task. Compared with tabletop settings, this task poses two key challenges: 1) Spatial understanding is challenging in complex 3D environments with diverse spatial relations. 2) Action generation is difficult to generalize, as limited and costly real-robot data restricts data-driven models generalization. To address these…	Wei-Shi Zheng Team	ArXiv / Web
2026-05-19	CEER: Compliant End-Effector and Root Control as a Unified Interface for Hierarchical Humanoid Loco-Manipulation `LearnedControl` Humanoid robots have achieved impressive locomotion performance, yet contact-rich and long-horizon manipulation remains a major bottleneck. Manipulation is inherently contact-rich and demands compliant whole-body control for stable interaction, while its diversity and long-horizon nature favor modular, planner-compatible interfaces over joint-space tracking. We propose CEER, a compliant…	Xianyi Cheng Team	ArXiv / Web
2026-05-19	SUGAR: A Scalable Human-Video-Driven Generalizable Humanoid Loco-Manipulation Learning Framework `LearnedControl` Building humanoid robots capable of generalizable whole-body loco-manipulation in the real world remains a fundamental challenge. Existing methods either rely on laborious task-specific reward engineering, rigidly replay reference motions that fail to generalize, or depend on costly teleoperation that limits scalability. While human videos capture diverse human behaviors, motion priors inferred…	Hao Dong Team	ArXiv / Web
2026-05-17	HCLM: A Hierarchical Framework for Cooperative Loco-Manipulation with Dual Quadrupeds `LearnedControl` We introduce HCLM, a hierarchical framework for general-purpose cooperative loco-manipulation with dual quadrupedal systems. Coordinating multi-robot collaborative manipulation across floating bases is highly challenging due to the conflicting demands of spatial coordination, robust locomotion, and closed-chain physical interactions. To resolve this, our architecture systematically decouples…	Xinlei Chen Team	ArXiv
2026-05-15	Learning Dynamic Pick-and-Place for a Legged Manipulator `LearnedControl` Legged manipulators extend robotic capabilities beyond static manipulation by integrating agile locomotion with versatile arm control. However, achieving precise manipulation while maintaining coordinated locomotion remains a major challenge. This work presents a hierarchical reinforcement learning framework for dynamic pick-and-place tasks using a quadruped equipped with a 6-DOF robotic arm. The…	Jemin Hwangbo Team	ArXiv
2026-05-14	Before the Body Moves: Learning Anticipatory Joint Intent for Language-Conditioned Humanoid Control `LearnedControl` Natural language is an intuitive interface for humanoid robots, yet streaming whole-body control requires control representations that are executable now and anticipatory of future physical transitions. Existing language-conditioned humanoid systems typically generate kinematic references that a low-level tracker must repair reactively, or use latent/action policies whose outputs do not…	Yutao Yue Team	ArXiv
2026-05-05	BifrostUMI: Bridging Robot-Free Demonstrations and Humanoid Whole-Body Manipulation `LearnedControl` High-quality data collection is a fundamental cornerstone for training humanoid whole-body visuomotor policies. Current data acquisition paradigms predominantly rely on robot teleoperation, which is often hindered by limited hardware accessibility and low operational efficiency. Inspired by the Universal Manipulation Interface (UMI), we propose BifrostUMI, a portable, efficient, and robot-free…	Shaqi Luo Team	ArXiv
2026-05-05	SigLoMa: Learning Open-World Quadrupedal Loco-Manipulation from Ego-Centric Vision `LearnedControl` Designing an open-world quadrupedal loco-manipulation system is highly challenging. Traditional reinforcement learning frameworks utilizing exteroception often suffer from extreme sample inefficiency and massive sim-to-real gaps. Furthermore, the inherent latency of visual tracking fundamentally conflicts with the high-frequency demands of precise floating-base control. Consequently, existing…	Debing Zhang Team	ArXiv / Web
2026-05-02	VOFA: Visual Object Goal Pushing with Force-Adaptive Control for Humanoids `LearnedControl` The ability to push large objects in a goal-directed manner using onboard egocentric perception is an essential skill for humanoid robots to perform complex tasks such as material handling in warehouses. To robustly manipulate heavy objects to arbitrary goal configurations, the robot must cope with unknown object mass and ground friction, noisy onboard perception, and actuation errors; all in a…	Joydeep Biswas Team	ArXiv
2026-04-29	Learning Tactile-Aware Quadrupedal Loco-Manipulation Policies `Dexterous` `Tactile` `LearnedControl` Quadrupedal loco-manipulation is commonly built on visual perception and proprioception. Yet reliable contact-rich manipulation remains difficult: vision and proprioception alone cannot resolve uncertain, evolving interactions with the environment. Tactile sensing offers direct contact observability, but scalable tactile-aware learning framework for quadrupedal loco-manipulation is still…	Yu She Team	ArXiv
2026-04-23	X2-N: A Transformable Wheel-legged Humanoid Robot with Dual-mode Locomotion and Manipulation `Dexterous` `LearnedControl` Wheel-legged robots combine the efficiency of wheeled locomotion with the versatility of legged systems, enabling rapid traversal over both continuous and discrete terrains. However, conventional designs typically employ fixed wheels as feet and limited degrees of freedom (DoFs) at the hips, resulting in reduced stability and mobility during legged locomotion compared to humanoids with flat feet….	Ling Shi Team	ArXiv
2026-04-23	RPG: Robust Policy Gating for Smooth Multi-Skill Transitions in Humanoid Fighting `LearnedControl` Humanoid robots have demonstrated impressive motor skills in a wide range of tasks, yet whole-body control for humanlike long-time, dynamic fighting remains particularly challenging due to the stringent requirements on agility and stability. While imitation learning enables robots to execute human-like fighting skills, existing approaches often rely on switching among multiple single-skill…	Dong Wang Team	ArXiv
2026-04-23	Learn Weightlessness: Imitate Non-Self-Stabilizing Motions on Humanoid Robot `LearnedControl` The integration of imitation and reinforcement learning has enabled remarkable advances in humanoid whole-body control, facilitating diverse human-like behaviors. However, research on environment-dependent motions remains limited. Existing methods typically enforce rigid trajectory tracking while neglecting physical interactions with the environment. We observe that humans naturally exploit a…	Xuelong Li Team	ArXiv
2026-04-20	SynAgent: Generalizable Cooperative Humanoid Manipulation via Solo-to-Cooperative Agent Synergy `LearnedControl` Controllable cooperative humanoid manipulation is a fundamental yet challenging problem for embodied intelligence, due to severe data scarcity, complexities in multi-agent coordination, and limited generalization across objects. In this paper, we present SynAgent, a unified framework that enables scalable and physically plausible cooperative manipulation by leveraging Solo-to-Cooperative Agent…	Jinhui Tang Team	ArXiv
2026-04-19	A Rapid Deployment Pipeline for Autonomous Humanoid Grasping Based on Foundation Models `LearnedControl` Deploying a humanoid robot to manipulate a new object has traditionally required one to two days of effort: data collection, manual annotation, 3D model acquisition, and model training. This paper presents an end-to-end rapid deployment pipeline that integrates three foundation-model components to shorten the onboarding cycle for a new object to approximately 30 minutes: (i) Roboflow-based…	Linqi Ye Team	ArXiv
2026-04-16	Switch: Learning Agile Skills Switching for Humanoid Robots `LearnedControl` Recent advancements in whole-body control through deep reinforcement learning have enabled humanoid robots to achieve remarkable progress in real-world challenging locomotion skills. However, existing approaches often struggle with flexible transitions between distinct skills, creating safety concerns and practical limitations. To address this challenge, we introduce a hierarchical multi-skill…	Ping Tan Team	ArXiv
2026-04-14	Learning Versatile Humanoid Manipulation with Touch Dreaming `Dexterous` `LearnedControl` `Tactile` Humanoid robots promise general-purpose assistance, yet real-world humanoid loco-manipulation remains challenging because it requires whole-body stability, dexterous hands, and contact-aware perception under frequent contact changes. In this work, we study dexterous, contact-rich humanoid loco-manipulation. We first develop an RL-based whole-body controller that provides stable lower-body and…	Ding Zhao Team	ArXiv
2026-04-14	FastGrasp: Learning-based Whole-body Control method for Fast Dexterous Grasping with Mobile Manipulators `Dexterous` `Tactile` `LearnedControl` Fast grasping is critical for mobile robots in logistics, manufacturing, and service applications. Existing methods face fundamental challenges in impact stabilization under high-speed motion, real-time whole-body coordination, and generalization across diverse objects and scenarios, limited by fixed bases, simple grippers, or slow tactile response capabilities. We propose FastGrasp, a…	Yuexin Ma Team	ArXiv
2026-04-14	Whole-Body Mobile Manipulation using Offline Reinforcement Learning on Sub-optimal Controllers `LearnedControl` Mobile Manipulation (MoMa) of articulated objects, such as opening doors, drawers, and cupboards, demands simultaneous, whole-body coordination between a robot’s base and arms. Classical whole-body controllers (WBCs) can solve such problems via hierarchical optimization, but require extensive hand-tuned optimization and remain brittle. Learning-based methods, on the other hand, show strong…	Georgia Chalvatzaki Team	ArXiv
2026-04-14	Vectorizing Projection in Manifold-Constrained Motion Planning for Real-Time Whole-Body Control `LearnedControl` Many robot planning tasks require satisfaction of one or more constraints throughout the entire trajectory. For geometric constraints, manifold-constrained motion planning algorithms are capable of planning collision-free path between start and goal configurations on the constraint submanifolds specified by task. Current state-of-the-art methods can take tens of seconds to solve these tasks for…	Zachary Kingston Team	ArXiv
2026-04-13	CLAW: Composable Language-Annotated Whole-body Motion Generation `LearnedControl` Training language-conditioned whole-body controllers for humanoid robots requires large-scale datasets pairing motion trajectories with natural-language descriptions. Existing approaches based on motion capture are costly and limited in diversity, while text-to-motion generative models produce purely kinematic outputs that are not guaranteed to be physically feasible. Therefore, we present CLAW, an…	Masayoshi Tomizuka Team	ArXiv
2026-04-09	HEX: Humanoid-Aligned Experts for Cross-Embodiment Whole-Body Manipulation `LearnedControl` Humans achieve complex manipulation through coordinated whole-body control, whereas most Vision-Language-Action (VLA) models treat robot body parts largely independently, making high-DoF humanoid control challenging and often unstable. We present HEX, a state-centric framework for coordinated manipulation on full-sized bipedal humanoid robots. HEX introduces a humanoid-aligned universal state…	Badong Chen Team	ArXiv / Web
2026-04-09	Sumo: Dynamic and Generalizable Whole-Body Loco-Manipulation `LearnedControl` This paper presents a sim-to-real approach that enables legged robots to dynamically manipulate large and heavy objects with whole-body dexterity. Our key insight is that by performing test-time steering of a pre-trained whole-body control policy with a sample-based planner, we can enable these robots to solve a variety of dynamic loco-manipulation tasks. Interestingly, we find our method…	Simon Le Cléac’h Team	ArXiv
2026-04-08	CMP: Robust Whole-Body Tracking for Loco-Manipulation via Competence Manifold Projection `LearnedControl` While decoupled control schemes for legged mobile manipulators have shown robustness, learning holistic whole-body control policies for tracking global end-effector poses remains fragile against Out-of-Distribution (OOD) inputs induced by sensor noise or infeasible user commands. To improve robustness against these perturbations without sacrificing task performance and continuity, we propose…	Jiwen Lu Team	ArXiv / Web
2026-04-02	MorphoGuard: A Morphology-Based Whole-Body Interactive Motion Controller `LearnedControl` Whole-body control (WBC) has demonstrated significant advantages in complex interactive movements of high-dimensional robotic systems. However, when a robot is required to handle dynamic multi-contact combinations along a single kinematic chain-such as pushing open a door with its elbow while grasping an object-it faces major obstacles in terms of complex contact representation and joint…	Bin He Team	ArXiv
2026-04-01	SMASH: Mastering Scalable Whole-Body Skills for Humanoid Ping-Pong with Egocentric Vision `LearnedControl` Existing humanoid table tennis systems remain limited by their reliance on external sensing and their inability to achieve agile whole-body coordination for precise task execution. These limitations stem from two core challenges: achieving low-latency and robust onboard egocentric perception under fast robot motion, and obtaining sufficiently diverse task-aligned strike motions for learning…	Ping Luo Team	ArXiv
2026-04-01	BAT: Balancing Agility and Stability via Online Policy Switching for Long-Horizon Whole-Body Humanoid Control `LearnedControl` Despite recent advances in control, reinforcement learning, and imitation learning, developing a unified framework that can achieve agile, precise, and robust whole-body behaviors, particularly in long-horizon tasks, remains challenging. Existing approaches typically follow two paradigms: coupled whole-body policies for global coordination and decoupled policies for modular precision. However,…	Sehoon Ha Team	ArXiv
2026-03-31	DreamControl-v2: Simpler and Scalable Autonomous Humanoid Skills via Trainable Guided Diffusion Priors `LearnedControl` Developing robust autonomous loco-manipulation skills for humanoids remains an open problem in robotics. While RL has been applied successfully to legged locomotion, applying it to complex, interaction-rich manipulation tasks is harder given long-horizon planning challenges for manipulation. A recent approach along these lines is DreamControl, which addresses these issues by leveraging…	Jonathan Chung-Kuan Huang Team	ArXiv
2026-03-30	Active Stereo-Camera Outperforms Multi-Sensor Setup in ACT Imitation Learning for Humanoid Manipulation `LearnedControl` The complexity of teaching humanoid robots new tasks is one of the major reasons hindering their widespread adoption in the industry. While Imitation Learning (IL), particularly Action Chunking with Transformers (ACT), enables rapid task acquisition, there is no consensus yet on the optimal sensory hardware required for manipulation tasks. This paper benchmarks 14 sensor combinations on the…	Dennis Bank Team	ArXiv
2026-03-25	SafeFlow: Real-Time Text-Driven Humanoid Whole-Body Control via Physics-Guided Rectified Flow and Selective Safety Gating `LearnedControl` Recent advances in real-time interactive text-driven motion generation have enabled humanoids to perform diverse behaviors. However, kinematics-only generators often exhibit physical hallucinations, producing motion trajectories that are physically infeasible to track with a downstream motion tracking controller or unsafe for real-world deployment. These failures often arise from the lack of…	Donghan Koo Team	ArXiv
2026-03-23	Make Tracking Easy: Neural Motion Retargeting for Humanoid Whole-body Control `LearnedControl` Humanoid robots require diverse motor skills to integrate into complex environments, but bridging the kinematic and dynamic embodiment gap from human data remains a major bottleneck. We demonstrate through Hessian analysis that traditional optimization-based retargeting is inherently non-convex and prone to local optima, leading to physical artifacts like joint jumps and self-penetration. To…	Xun Cao Team	ArXiv
2026-03-20	Morphology-Consistent Humanoid Interaction through Robot-Centric Video Synthesis `LearnedControl` Equipping humanoid robots with versatile interaction skills typically requires either extensive policy training or explicit human-to-robot motion retargeting. However, learning-based policies face prohibitive data collection costs. Meanwhile, retargeting relies on human-centric pose estimation (e.g., SMPL), introducing a morphology gap. Skeletal scale mismatches result in severe spatial…	Renjing Xu Team	ArXiv
2026-03-19	ADMM-Based Distributed MPC with Control Barrier Functions for Safe Multi-Robot Quadrupedal Locomotion `LearnedControl` This paper proposes a fully decentralized model predictive control (MPC) framework with control barrier function (CBF) constraints for safety-critical trajectory planning in multi-robot legged systems. The incorporation of CBF constraints introduces explicit inter-agent coupling, which prevents direct decomposition of the resulting optimal control problems. To address this challenge, we…	Kaveh Akbari Hamed Team	ArXiv
2026-03-17	Learning Whole-Body Control for a Salamander Robot `Sim2Real` `LearnedControl` Amphibious legged robots inspired by salamanders are promising in applications in complex amphibious environments. However, despite the significant success of training controllers that achieve diverse locomotion behaviors in conventional quadrupedal robots, most salamander robots relied on central-pattern-generator (CPG)-based and model-based coordination strategies for locomotion control….	Auke Ijspeert Team	ArXiv
2026-03-17	ECHO: Edge-Cloud Humanoid Orchestration for Language-to-Motion Control `Sim2Real` `LearnedControl` We present ECHO, an edge–cloud framework for language-driven whole-body control of humanoid robots. A cloud-hosted diffusion-based text-to-motion generator synthesizes motion references from natural language instructions, while an edge-deployed reinforcement-learning tracker executes them in closed loop on the robot. The two modules are bridged by a compact, robot-native 38-dimensional motion…	Yutao Yue Team	ArXiv
2026-03-15	CyboRacket: A Perception-to-Action Framework for Humanoid Racket Sports `LearnedControl` Dynamic ball-interaction tasks remain challenging for robots because they require tight perception-action coupling under limited reaction time. This challenge is especially pronounced in humanoid racket sports, where successful interception depends on accurate visual tracking, trajectory prediction, coordinated stepping, and stable whole-body striking. Existing robotic racket-sport systems often…	Kai Chen Team	ArXiv
2026-03-15	Load-Aware Locomotion Control for Humanoid Robots in Industrial Transportation Tasks `LearnedControl` Humanoid robots deployed in industrial environments are required to perform load-carrying transportation tasks that tightly couple locomotion and manipulation. However, achieving stable and robust locomotion under varying payloads and upper-body motions is challenging due to dynamic coupling and partial observability. This paper presents a load-aware locomotion framework for industrial humanoids…	Shiqi Li Team	ArXiv
2026-03-14	REFINE-DP: Diffusion Policy Fine-tuning for Humanoid Loco-manipulation via Reinforcement Learning `LearnedControl` Humanoid loco-manipulation requires coordinated high-level motion plans with stable, low-level whole-body execution under complex robot-environment dynamics and long-horizon tasks. While diffusion policies (DPs) show promise for learning from demonstrations, deploying them on humanoids poses critical challenges: the motion planner trained offline is decoupled from the low-level controller,…	Ye Zhao Team	ArXiv
2026-03-13	PhysMoDPO: Physically-Plausible Humanoid Motion with Preference Optimization `LearnedControl` Recent progress in text-conditioned human motion generation has been largely driven by diffusion models trained on large-scale human motion data. Building on this progress, recent methods attempt to transfer such models for character animation and real robot control by applying a Whole-Body Controller (WBC) that converts diffusion-generated motions into executable trajectories. While WBC…	Ivan Laptev Team	ArXiv
2026-03-12	$Ψ_0$: An Open Foundation Model Towards Universal Humanoid Loco-Manipulation `LearnedControl` We introduce $Ψ_0$ (Psi-Zero), an open foundation model to address challenging humanoid loco-manipulation tasks. While existing approaches often attempt to address this fundamental problem by co-training on large and diverse human and humanoid data, we argue that this strategy is suboptimal due to the fundamental kinematic and motion disparities between humans and humanoid robots. Therefore, data…	Yue Wang Team	ArXiv
2026-03-11	Cybo-Waiter: A Physical Agentic Framework for Humanoid Whole-Body Locomotion-Manipulation `LearnedControl` Robots are increasingly expected to execute open-ended natural language requests in human environments, which demands reliable long-horizon execution under partial observability. This is especially challenging for humanoids because locomotion and manipulation are tightly coupled through stance, reachability, and balance. We present a humanoid agent framework that turns VLM plans into verifiable…	Kai Chen Team	ArXiv
2026-03-10	ZeroWBC: Learning Natural Visuomotor Humanoid Control Directly from Human Egocentric Video `LearnedControl` Achieving versatile and naturalistic whole-body control for humanoid robot scene-interaction remains a significant challenge. While some recent works have demonstrated autonomous humanoid interactive control, they are constrained to rigid locomotion patterns and expensive teleoperation data collection, lacking the versatility to execute more human-like natural behaviors such as sitting or…	Xuelong Li Team	ArXiv
2026-03-09	Predictive Control with Indirect Adaptive Laws for Payload Transportation by Quadrupedal Robots `LearnedControl` This paper formally develops a novel hierarchical planning and control framework for robust payload transportation by quadrupedal robots, integrating a model predictive control (MPC) algorithm with a gradient-descent-based adaptive updating law. At the framework’s high level, an indirect adaptive law estimates the unknown parameters of the reduced-order (template) locomotion model under varying…	Kaveh Akbari Hamed Team	ArXiv
2026-03-09	MetaWorld-X: Hierarchical World Modeling via VLM-Orchestrated Experts for Humanoid Loco-Manipulation `LearnedControl` Learning natural, stable, and compositionally generalizable whole-body control policies for humanoid robots performing simultaneous locomotion and manipulation (loco-manipulation) remains a fundamental challenge in robotics. Existing reinforcement learning approaches typically rely on a single monolithic policy to acquire multiple skills, which often leads to cross-skill gradient interference and…	Lei Zhang Team	ArXiv / Web
2026-03-08	Low-Cost Teleoperation Extension for Mobile Manipulators `LearnedControl` Teleoperation of mobile bimanual manipulators requires simultaneous control of high-dimensional systems, often necessitating expensive specialized equipment. We present an open-source teleoperation framework that enables intuitive whole body control using readily available commodity hardware. Our system combines smartphone-based head tracking for camera control, leader arms for bilateral…	Pavel Osinenko Team	ArXiv
2026-03-08	InterReal: A Unified Physics-Based Imitation Framework for Learning Human-Object Interaction Skills `LearnedControl` Interaction is one of the core abilities of humanoid robots. However, most existing frameworks focus on non-interactive whole-body control, which limits their practical applicability. In this work, we develop InterReal, a unified physics-based imitation learning framework for Real-world human-object Interaction (HOI) control. InterReal enables humanoid robots to track HOI reference motions,…	Chenjia Bai Team	ArXiv
2026-03-07	ACLM: ADMM-Based Distributed Model Predictive Control for Collaborative Loco-Manipulation `LearnedControl` Collaborative transportation of heavy payloads via loco-manipulation is a challenging yet essential capability for legged robots operating in complex, unstructured environments. Centralized planning methods, e.g., holistic trajectory optimization, capture dynamic coupling among robots and payloads but scale poorly with system size, limiting real-time applicability. In contrast, hierarchical and…	Ye Zhao Team	ArXiv
2026-03-05	PhysiFlow: Physics-Aware Humanoid Whole-Body VLA via Multi-Brain Latent Flow Matching and Robust Tracking `LearnedControl` In the domain of humanoid robot control, the fusion of Vision-Language-Action (VLA) with whole-body control is essential for semantically guided execution of real-world tasks. However, existing methods encounter challenges in terms of low VLA inference efficiency or an absence of effective semantic guidance for whole-body control, resulting in instability in dynamic limb-coordinated tasks. To…	Hesheng Wang Team	ArXiv
2026-03-05	OmniDP: Beyond-FOV Large-Workspace Humanoid Manipulation with Omnidirectional 3D Perception `LearnedControl` The deployment of humanoid robots for dexterous manipulation in unstructured environments remains challenging due to perceptual limitations that constrain the effective workspace. In scenarios where physical constraints prevent the robot from repositioning itself, maintaining omnidirectional awareness becomes far more critical than color or semantic information. While recent advances in visuomotor…	Jun Ma Team	ArXiv