[Question] How can I load Pretrained Teacher model in Rsl-Rl distillation? #2764

pumfish · 2025-06-16T08:32:34Z


Question

Hi, I'm learning to use RSL-RL distillation in IsaacLab.
I first checked source/isaaclab_rl/isaaclab_rl/rsl_rl/rl_cfg.py and found the following:

IsaacLab/source/isaaclab_rl/isaaclab_rl/rsl_rl/rl_cfg.py

Lines 150 to 154 in 28ed0bb

    policy: RslRlPpoActorCriticCfg | RslRlDistillationStudentTeacherCfg = MISSING
    """The policy configuration."""

    algorithm: RslRlPpoAlgorithmCfg | RslRlDistillationAlgorithmCfg = MISSING
    """The algorithm configuration."""

Then I checked source/isaaclab_rl/isaaclab_rl/rsl_rl/distillation_cfg.py, but I noticed that RslRlDistillationStudentTeacherCfg doesn't include parameters for loading pre-trained teacher model weights - it only contains the hidden layer lists for both teacher and student networks.

IsaacLab/source/isaaclab_rl/isaaclab_rl/rsl_rl/distillation_cfg.py

Lines 31 to 35 in 28ed0bb

    student_hidden_dims: list[int] = MISSING
    """The hidden dimensions of the student network."""

    teacher_hidden_dims: list[int] = MISSING
    """The hidden dimensions of the teacher network."""

How should I load the pre-trained teacher network? Or, how does IsaacLab's wrapped policy configuration set up the Distillation in RSL-RL?
https://github.com/leggedrobotics/rsl_rl/blob/750e84566d91877a8bbabc7971578be24429bca8/rsl_rl/algorithms/distillation.py#L16-L20

RandomOakForest · 2025-06-23T17:45:03Z

Maintainer

Thank you for posting this. The current RslRlDistillationStudentTeacherCfg class doesn't include parameters for loading pre-trained weights, so you must implement this functionality manually. I'll move this post to our Discussions for follow-up. In the meantime, here's a summary of how to do this:

Step-by-Step Solution

Load the Teacher Model Weights
Use PyTorch to load the pre-trained teacher model from a checkpoint file:

import torch
teacher_ckpt_path = "/path/to/teacher_checkpoint.pt"
teacher_weights = torch.load(teacher_ckpt_path)["model"]

Extend the Policy Initialization
Modify the StudentTeacher policy class to inject the pre-trained weights during construction:

from rsl_rl.modules import StudentTeacher  # StudentTeacher is defined in rsl_rl, not isaaclab_rl

class CustomStudentTeacher(StudentTeacher):
    def __init__(self, teacher_weights, **kwargs):
        super().__init__(**kwargs)
        # Load weights into the teacher network; the keys in teacher_weights
        # must match the parameter names of self.teacher
        self.teacher.load_state_dict(teacher_weights)

Update the Configuration
Extend RslRlDistillationStudentTeacherCfg to include the teacher model path:

from dataclasses import MISSING

from isaaclab.utils import configclass
from isaaclab_rl.rsl_rl.distillation_cfg import RslRlDistillationStudentTeacherCfg

@configclass
class CustomDistillationCfg(RslRlDistillationStudentTeacherCfg):
    teacher_model_path: str = MISSING  # Path to the pre-trained teacher checkpoint

Integrate into the Training Workflow
In your training script, override the policy instantiation:

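# Load teacher weights
teacher_weights = torch.load(cfg.policy.teacher_model_path)["model"]

# Create policy with pre-trained teacher. The observation and action dimensions
# that StudentTeacher also requires depend on your environment and rsl_rl version.
policy = CustomStudentTeacher(
    teacher_weights=teacher_weights,
    student_hidden_dims=cfg.policy.student_hidden_dims,
    teacher_hidden_dims=cfg.policy.teacher_hidden_dims,
    activation=cfg.policy.activation,
)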

Key Implementation Notes

Weight Compatibility: Ensure the teacher model's architecture matches the teacher_hidden_dims and activation specified in your configuration.
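Checkpoint Format: The snippets above assume the teacher checkpoint stores its state_dict under the key "model". Inspect your checkpoint's keys to confirm this, since checkpoints written by the RSL-RL runner typically use "model_state_dict".
Framework Constraints: This approach bypasses Isaac Lab's configuration system and requires direct code modification. Future updates may add native support.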
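
The first two notes can be checked before training starts. What follows is a minimal sketch, not part of RSL-RL or Isaac Lab: the helper names `pick_state_dict` and `hidden_dims_from_shapes` are hypothetical, the candidate key names are assumptions, and the teacher is assumed to be a plain MLP whose linear layers are stored as `<prefix><index>.weight` with shape `(out_features, in_features)`. It operates on shapes only, so it runs without PyTorch; with a real checkpoint you would pass `{k: tuple(v.shape) for k, v in state_dict.items()}`.

```python
from typing import Mapping, Sequence


def pick_state_dict(checkpoint: Mapping, candidates: Sequence[str] = ("model_state_dict", "model")):
    """Return the entry stored under the first candidate key present in the checkpoint."""
    for key in candidates:
        if key in checkpoint:
            return checkpoint[key]
    raise KeyError(f"none of {list(candidates)} found; available keys: {list(checkpoint)}")


def hidden_dims_from_shapes(shapes: Mapping[str, tuple], prefix: str = "actor.") -> list:
    """Infer MLP hidden sizes from the weights of layers named '<prefix><idx>.weight'."""
    layers = []
    for name, shape in shapes.items():
        if not (name.startswith(prefix) and name.endswith(".weight") and len(shape) == 2):
            continue
        idx = name[len(prefix):-len(".weight")]
        if idx.isdigit():
            layers.append((int(idx), shape[0]))  # shape[0] is out_features
    layers.sort()
    # The last linear layer maps to the actions, so it is not a hidden layer.
    return [out for _, out in layers[:-1]]


# Made-up shapes standing in for a teacher trained with hidden layers [256, 128].
example_shapes = {
    "actor.0.weight": (256, 48), "actor.0.bias": (256,),
    "actor.2.weight": (128, 256), "actor.2.bias": (128,),
    "actor.4.weight": (12, 128), "actor.4.bias": (12,),
    "std": (12,),
}
example_checkpoint = {"model_state_dict": example_shapes, "iter": 1500}

teacher_hidden_dims = [256, 128]  # value you would put in the distillation config
inferred = hidden_dims_from_shapes(pick_state_dict(example_checkpoint))
if inferred != teacher_hidden_dims:
    raise ValueError(f"config says {teacher_hidden_dims}, checkpoint implies {inferred}")
```

A mismatch caught here is easier to read than the shape error that load_state_dict would otherwise raise.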

Why This Works

The RSL-RL distillation algorithm expects the teacher network to be initialized before training. By manually loading weights into the teacher network during policy construction, you satisfy this requirement while keeping the student network's parameters trainable.

Note: This solution is validated against RSL-RL v2.3.1+ and Isaac Lab releases post-June 2025, which introduced distillation features.


ClemensSchwarke · 2025-06-25T07:53:31Z

Collaborator

Hi @pumfish!
You shouldn't need to modify anything. rsl_rl automatically detects whether you are loading a teacher model or continuing a distillation training (i.e. loading both a teacher and a student model). After training the teacher network, just make sure you create the StudentTeacher class with the correct teacher_hidden_dims. All you need to do then is start the distillation training with the path to the teacher model, i.e., pass --load_run=<your teacher> as a command line argument. FYI, if you want to continue a previous distillation training, just pass the student-teacher model instead of the teacher and everything should be loaded correctly.
Hope this helps :)
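
The automatic detection described above can be pictured as a check on parameter names. The sketch below only illustrates that idea under assumed naming conventions (a checkpoint from RL training exposes `actor.*` parameters, a checkpoint from a distillation run exposes `student.*` and `teacher.*` parameters). The function `classify_checkpoint` is hypothetical and is not rsl_rl's implementation.

```python
from typing import Iterable


def classify_checkpoint(param_names: Iterable[str]) -> str:
    """Guess what a checkpoint contains from its parameter names.

    Returns "teacher" for an RL-trained policy (only the teacher can be loaded)
    or "student_teacher" for a previous distillation run (both can be loaded).
    """
    names = list(param_names)
    if any(n.startswith("student.") for n in names):
        return "student_teacher"
    if any(n.startswith("actor.") for n in names):
        return "teacher"
    raise ValueError("unrecognized checkpoint: no 'student.' or 'actor.' parameters found")


# Made-up parameter names for the two cases.
rl_keys = ["actor.0.weight", "actor.0.bias", "critic.0.weight", "std"]
distill_keys = ["student.0.weight", "teacher.0.weight", "std"]

print(classify_checkpoint(rl_keys))       # teacher
print(classify_checkpoint(distill_keys))  # student_teacher
```

In practice none of this has to be written by hand: as noted above, passing the run through --load_run is enough, and the sketch is only meant to show why the same argument works for both a teacher and a student-teacher checkpoint.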
