Sim-to-Sim Transfer from Newton to PhysX

When people talk about sim-to-sim transfer, the conversation usually jumps straight to physics fidelity. Contact models, solver stability, actuator realism, friction tuning, integration error, domain randomization. All of that matters. But when I was moving a custom robot workflow from Newton to PhysX in Isaac Lab, the first issue I had to solve was much more fundamental: joint order.

I was working on a custom robot defined in:

source/isaaclab_assets/isaaclab_assets/robots/tensaur.py

At first glance, joint ordering sounds like a bookkeeping detail. In reinforcement learning and motion tracking, it is not. A policy does not understand robot semantics through names like left_knee or right_hip_roll. It understands them through tensor positions. If index 7 meant right_knee during training but means left_hip_pitch during deployment, the policy is no longer controlling the same robot it learned. It is controlling a scrambled version of that robot.

That is the real reason joint ordering matters in sim-to-sim transfer. Before you can ask whether Newton and PhysX produce comparable dynamics, you have to guarantee that the policy is sending the right command to the right degree of freedom.
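To make the failure concrete, here is a minimal sketch in plain Python. The two orderings are hypothetical and chosen only for illustration; they are not the real Newton or PhysX orderings for Tensaur. The point is that the same action vector, applied index by index, lands on different joints.

```python
# Hypothetical orderings for illustration only; not the real Tensaur backends.
train_order = ["right_hip_pitch", "right_knee", "left_hip_pitch", "left_knee"]
deploy_order = ["left_hip_pitch", "right_hip_pitch", "left_knee", "right_knee"]

# Action vector as the policy emits it: one value per index in train_order.
action = [0.1, 0.8, -0.1, -0.8]

# What the policy meant, keyed by joint name.
intended = dict(zip(train_order, action))
# What the deployment simulator does if it applies the vector index by index.
applied = dict(zip(deploy_order, action))

# Joints that receive a command meant for a different joint.
misrouted = {name for name in train_order if intended[name] != applied[name]}

for name in train_order:
    print(f"{name:16s} intended={intended[name]:+.1f} applied={applied[name]:+.1f}")
```

In this toy case every joint is misrouted: the right hip receives the command meant for the right knee, and so on. Nothing crashes, which is exactly why the bug is easy to miss.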

Why This Problem Shows Up

In Isaac Lab, the articulation exposes joints in the order defined by how the simulator parses the kinematic tree. The repo documentation already hints at this broader issue: articulation joint order is meaningful, and indexing follows the articulation ordering. In practice, that means you should never assume that a motion clip, a learned policy, a USD articulation, and two different physics backends will all agree on the same ordering unless you explicitly make them agree.

For my Tensaur setup, there were really two related but distinct ordering problems:

Motion-data order versus articulation order.
Newton policy order versus PhysX policy order.

Those are easy to conflate, but they happen at different boundaries in the stack. The first is an asset integration problem. A motion clip such as a jump trajectory can store joints in one order, while the Isaac Lab articulation built from the USD can expose them in another.

The second is a backend transfer problem. A policy trained in Newton may expect observations and actions in Newton's joint ordering, while the PhysX environment exposes a different ordering.

Both issues are solved the same way: define the mapping explicitly and apply it consistently.

The Tensaur Fix in the Robot Asset

The core change for my custom robot lives in:

source/isaaclab_assets/isaaclab_assets/robots/tensaur.py

There, I added an explicit canonical joint list:

TENSAUR_JOINT_NAMES = (
    "neck_pitch",
    "head_pitch",
    "head_yaw",
    "tail",
    "right_hip_yaw",
    "right_hip_roll",
    "right_hip_pitch",
    "right_knee",
    "right_ankle",
    "left_hip_yaw",
    "left_hip_roll",
    "left_hip_pitch",
    "left_knee",
    "left_ankle",
)

That list is important because it does not merely document the robot. It defines the intended semantic order of the robot for downstream consumers. In this case, the comment in the file makes the purpose clear: these names match the order used in jump.npz, which in turn matches the motion source used by the controller stack.

This is a subtle but important design choice. Instead of asking the motion system to adapt to whatever order the simulator happens to emit, I anchored the task to the order that the motion data was authored in. That gives me a stable reference frame.

This is especially useful in robotics workflows that mix:

authored trajectories,
tracking objectives,
reinforcement learning policies,
multiple simulators,
and custom robot assets.

Without a canonical ordering, each interface silently makes its own assumptions. The result is usually not a clean failure. The result is a robot that behaves badly for reasons that are hard to diagnose.

Where the Mapping Actually Gets Used

In the Tensaur tracking environment, the canonical list is consumed here:

source/isaaclab_tasks/isaaclab_tasks/manager_based/locomotion/tracking/config/tensaur/flat_env_cfg.py

The relevant line is:

self.commands.motion.joint_names = TENSAUR_JOINT_NAMES

That line is doing more work than it appears to. It tells the motion command system, "treat the motion clip as being ordered according to this exact joint list."

From there, the actual remapping happens inside:

source/isaaclab_tasks/isaaclab_tasks/manager_based/locomotion/tracking/mdp/commands.py

The key mechanism is this:

the command reads the configured joint_names,
it calls find_joints(..., preserve_order=True),
and it builds a permutation from motion-clip order to articulation order.

Conceptually, the code is saying: "For each joint in the authored motion clip, find where that same named joint lives in the robot articulation, and use that index when reading or comparing joint states."

That is the right abstraction. The policy or tracking term does not need to care whether the articulation is breadth-first, alphabetical, backend-specific, or USD-parser-dependent. It only needs a correct semantic mapping.

Once that mapping exists, the robot-side tensors can be reordered to match the motion clip. That preserves the meaning of the state vector and keeps tracking losses aligned with the intended joints.
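The lookup described above can be sketched without Isaac Lab at all. The function below, find_joint_indices, is a hypothetical stand-in that I wrote for illustration. It handles exact names only and reflects my understanding of what the preserve_order flag does; it is not the library's implementation. The articulation order shown is also made up.

```python
def find_joint_indices(articulation_names, wanted_names, preserve_order=False):
    """Toy stand-in for a name-based joint lookup (exact names only)."""
    missing = [n for n in wanted_names if n not in articulation_names]
    if missing:
        raise ValueError(f"joints not found in articulation: {missing}")
    if preserve_order:
        # Indices follow the order of wanted_names (the motion-clip order).
        return [articulation_names.index(n) for n in wanted_names]
    # Indices follow the articulation's own order.
    return [i for i, n in enumerate(articulation_names) if n in wanted_names]


# Motion-clip order (a subset of the canonical list) and a hypothetical articulation order.
motion_order = ("tail", "right_knee", "left_knee")
articulation_order = ["left_knee", "right_knee", "tail"]

motion_to_articulation = find_joint_indices(
    articulation_order, motion_order, preserve_order=True
)

# Joint positions as the simulator reports them, in articulation order.
robot_joint_pos = [0.5, -0.5, 0.1]
# The same values reordered so that index i matches motion_order[i].
robot_in_motion_order = [robot_joint_pos[i] for i in motion_to_articulation]

print(motion_to_articulation)   # [2, 1, 0]
print(robot_in_motion_order)    # [0.1, -0.5, 0.5]
```

Without preserve_order=True, the stand-in returns [0, 1, 2], which is the articulation's own order and silently discards the motion-clip convention.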

Why This Matters More in Sim-to-Sim Transfer

Motion tracking already makes joint-order bugs visible, because a tracking objective has a clear reference. Sim-to-sim transfer is less forgiving: a deployed policy has no reference to compare against, so nothing in the pipeline flags the mismatch.

If a Newton-trained policy is deployed in PhysX with mismatched joint order, several things can happen:

actions are applied to the wrong joints,
joint positions and velocities are observed in the wrong order,
observation normalization becomes invalid,
gait symmetry breaks in strange ways,
and the resulting behavior looks like poor transfer, even when the policy itself is fine.

This is why I think joint-order auditing should be treated as a first-class step in any sim-to-sim robotics pipeline.

There is a tendency to blame solver differences too early. But if the policy is effectively commanding the wrong robot because the indexing changed, you are not yet evaluating transfer across physics engines. You are evaluating transfer across a permutation bug.
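A joint-order audit can be as small as a function that compares the two orderings index by index. The sketch below uses a hypothetical helper, audit_joint_order, which is not part of Isaac Lab. In practice the two lists would be dumped from each backend's articulation; here they are hard-coded and invented for illustration.

```python
def audit_joint_order(source_names, target_names):
    """Return (index, source_name, target_name) for every index that disagrees."""
    if len(source_names) != len(target_names):
        raise ValueError(
            f"joint count differs: {len(source_names)} vs {len(target_names)}"
        )
    return [
        (i, s, t)
        for i, (s, t) in enumerate(zip(source_names, target_names))
        if s != t
    ]


# Hypothetical orderings from the training and deployment backends.
source_names = ["tail", "right_knee", "left_knee"]
target_names = ["tail", "left_knee", "right_knee"]

mismatches = audit_joint_order(source_names, target_names)
for index, source_name, target_name in mismatches:
    print(f"index {index}: policy expects {source_name}, env exposes {target_name}")
```

An empty result means the policy can be deployed without remapping. Anything else means a mapping is required before the physics comparison says anything useful.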

How Isaac Lab Handles Newton-to-PhysX Policy Transfer

Isaac Lab already contains a clean pattern for cross-backend remapping in:

scripts/sim2sim_transfer/rsl_rl_transfer.py

This script takes a YAML file with two lists:

source_joint_names
target_joint_names

The source list is the order the policy was trained on. The target list is the order exposed by the environment where the policy will be executed.

The script then computes two mappings:

one for observations,
one for actions.

The runtime logic is the important part:

Remap observations from the target environment into the source policy's expected ordering.
Run the policy in that source ordering.
Remap the policy's output actions back into the target environment's ordering.

This is exactly what sim-to-sim transfer needs. The policy remains mentally inside its training simulator, while the wrapper translates between the two articulation conventions.

In other words, the transfer script does not fix physics. It fixes meaning. Physics transfer only becomes meaningful after semantic alignment is correct.
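The three runtime steps can be sketched as a small wrapper. This is my own simplified illustration, not the code in the transfer script. It assumes the observation consists only of per-joint values, whereas a real observation vector also contains non-joint terms and only the joint-indexed slices would be permuted. The names make_transfer_policy and toy_policy, and both orderings, are hypothetical.

```python
def make_transfer_policy(policy, source_names, target_names):
    """Wrap a policy trained on source_names so it can run in a target_names env."""
    # obs_map[i]: target index of the joint the policy expects at source index i.
    obs_map = [target_names.index(name) for name in source_names]
    # act_map[j]: source index of the joint that sits at target index j.
    act_map = [source_names.index(name) for name in target_names]

    def wrapped(target_obs):
        source_obs = [target_obs[i] for i in obs_map]    # 1. remap observations
        source_action = policy(source_obs)               # 2. run in source ordering
        return [source_action[j] for j in act_map]       # 3. remap actions back

    return wrapped


# Toy policy with a different gain per source index, so ordering errors are visible.
gains = [1.0, 2.0, 3.0]  # tail, right_knee, left_knee in source order

def toy_policy(source_obs):
    return [g * o for g, o in zip(gains, source_obs)]


source_names = ["tail", "right_knee", "left_knee"]
target_names = ["left_knee", "tail", "right_knee"]

transfer_policy = make_transfer_policy(toy_policy, source_names, target_names)
target_action = transfer_policy([10.0, 20.0, 30.0])
print(target_action)  # [30.0, 20.0, 60.0]
```

Each joint ends up scaled by its own gain regardless of where it sits in the target ordering: left_knee gets 10 x 3, tail gets 20 x 1, right_knee gets 30 x 2. Note that the two maps are inverse permutations of each other.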

What a Mapping File Represents

A mapping YAML in this workflow is not just metadata. It is a contract.

For example, the Newton-to-PhysX configs under:

scripts/sim2sim_transfer/config/

list the same joints twice, but in two different orders:

the order expected by the source backend,
and the order presented by the target backend.

That means the YAML is effectively a declaration that the two simulators expose the same joint space under a permutation. The joints are the same physical degrees of freedom. Only the indexing differs.

For a custom robot like Tensaur, that is the file I would create next for a full Newton-to-PhysX transfer pipeline. The structure is straightforward:

source_joint_names:
  - ...
target_joint_names:
  - ...

The harder part is not the syntax. The harder part is being disciplined enough to verify that both lists refer to the same semantic joints exactly once, with no missing entries, no duplicates, and no naming drift.
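That discipline is easy to automate. The sketch below is a hypothetical validator, not something shipped with Isaac Lab. It assumes the mapping has already been loaded into a dictionary with the two keys shown above (for example via a YAML loader), and the joint names in the example are for illustration only.

```python
from collections import Counter


def validate_joint_mapping(mapping):
    """Return a list of problems; an empty list means the mapping is consistent."""
    source = mapping["source_joint_names"]
    target = mapping["target_joint_names"]
    problems = []

    # Every joint must appear exactly once in each list.
    for key, names in (("source_joint_names", source), ("target_joint_names", target)):
        duplicates = sorted(n for n, count in Counter(names).items() if count > 1)
        if duplicates:
            problems.append(f"{key} has duplicates: {duplicates}")

    # Both lists must name exactly the same joints (catches naming drift too).
    missing_in_target = sorted(set(source) - set(target))
    missing_in_source = sorted(set(target) - set(source))
    if missing_in_target:
        problems.append(f"in source but not target: {missing_in_target}")
    if missing_in_source:
        problems.append(f"in target but not source: {missing_in_source}")
    return problems


good = {
    "source_joint_names": ["tail", "right_knee", "left_knee"],
    "target_joint_names": ["left_knee", "tail", "right_knee"],
}
bad = {
    "source_joint_names": ["tail", "right_knee", "left_knee"],
    "target_joint_names": ["tail", "Right_Knee", "left_knee", "left_knee"],
}

print(validate_joint_mapping(good))  # []
for problem in validate_joint_mapping(bad):
    print(problem)
```

The bad example trips all three checks at once: a duplicated left_knee, a right_knee missing from the target, and a Right_Knee that exists only because of inconsistent capitalization.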

What I Learned from Doing This on a Custom Robot

Working on a custom robot exposes problems that standard benchmark robots hide.

With common platforms such as G1, Go2, or ANYmal, a lot of the naming, articulation assumptions, and transfer scripts already exist. With a custom robot, you have to define that layer yourself. That forces you to answer questions that are easy to postpone in a mature stack:

What is the canonical joint order for this robot?
Is that order coming from the motion source, the USD, the controller, or the trained policy?
Are body names also ordered, or only joint names?
Does every consumer of robot state agree on the same convention?

In my Tensaur case, the motion clip ordering mattered enough that I encoded it directly in the asset configuration and passed it into the motion command. That is a good pattern because it localizes the robot-specific convention near the robot definition instead of scattering ad hoc permutations throughout the training code.

It also makes the system easier to reason about later. If I revisit the robot after a few months, I can see immediately that the task depends on a specific authored order and that the remapping is intentional.

The Robotics Lesson Here

There is a broader robotics lesson in all of this: interfaces fail at the level of conventions before they fail at the level of algorithms.

A locomotion policy can be excellent. A simulator can be physically stable. A robot asset can be visually correct.

And the whole system can still fail because one module thinks index 3 is the tail while another thinks it is the left ankle.

This is why production robotics stacks tend to accumulate a lot of explicit interface contracts. Frame conventions. Quaternion conventions. Actuator directions. Sensor axes. Joint limits. Unit systems. Joint ordering. These details are not peripheral. They are the glue that makes model-based control, reinforcement learning, simulation, and deployment actually line up.

Sim-to-sim transfer is often presented as a question of domain gap. That is true, but only after the representation gap has been solved.

How I Would Describe the Final Pipeline

Engineering principle: first establish a canonical semantic ordering. Then remap every interface to that ordering. Only then evaluate transfer.

For my setup, that means:

Define the intended joint order explicitly in the robot asset.
Use that order when loading and replaying motion data.
If the training and deployment backends disagree, create a source-to-target mapping YAML.
Remap observations into the source ordering before policy inference.
Remap actions back into the target ordering before stepping the target simulator.

At that point, if the policy still degrades, the remaining gap is much more likely to be a real physics gap rather than a bookkeeping bug.

Conclusion

The biggest insight from this Newton-to-PhysX transfer work was that sim-to-sim transfer starts earlier than physics. It starts at representation.

For my custom Tensaur robot, making the joint order explicit was not a cosmetic cleanup. It was the step that made the rest of the pipeline interpretable. By defining TENSAUR_JOINT_NAMES in the asset config, using that order in the tracking environment, and following Isaac Lab's explicit observation and action remapping pattern for cross-backend transfer, I turned a fragile implicit assumption into a reproducible interface.

That is the kind of fix that looks small in code but large in effect. A few lines of ordering metadata can decide whether a policy transfers cleanly or appears to fail for mysterious reasons.

In robotics, that is often how progress looks: not a dramatic new algorithm, but a clearer agreement about what each index actually means.