Make unroll_length schedulable by QuantuMope · Pull Request #1833 · HorizonRobotics/alf

QuantuMope · 2026-03-30T20:50:10Z

This PR allows unroll_length to be given as a scheduler when we are running synchronous off-policy RL training, i.e. with:

async_unroll=False
whole_replay_buffer_training=False

It also allows a scheduled value of 0, which skips unrolling for that iteration so that training uses only data already in the replay buffer.

This allows us to "simulate" very diverse training strategies. E.g.,

unroll_length = StepScheduler("iterations", [
    # unroll to collect an "offline" dataset
    (1, int(initial_collect_steps / num_para_envs)),
    # perform offline training iterations. no unroll
    (offline_training_iters, 0),
    # continue with online RL
    (offline_training_iters + 1, desired_unroll_length)])
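To make the three phases concrete, here is a minimal standalone sketch of how such a step schedule could resolve to a value at a given iteration. It does not use ALF: `step_schedule_value` is a hypothetical helper, the numbers are illustrative, and the boundary semantics (each value applies while the iteration count is below its boundary, and the last value persists afterwards) are an assumption inferred from the comments in the example above, not checked against the StepScheduler implementation.

```python
from bisect import bisect_right


def step_schedule_value(schedule, progress):
    """Return the value of a (boundary, value) step schedule at ``progress``.

    Assumed semantics (hypothetical, inferred from the PR example): entry i
    applies while ``progress < schedule[i][0]``; once progress passes the
    last boundary, the last value keeps being used.
    """
    boundaries = [boundary for boundary, _ in schedule]
    # Index of the first boundary strictly greater than progress.
    index = bisect_right(boundaries, progress)
    # Clamp so the final value persists beyond the last boundary.
    index = min(index, len(schedule) - 1)
    return schedule[index][1]


# Illustrative numbers only; none of these come from the PR.
initial_collect_steps = 10_000
num_para_envs = 20
offline_training_iters = 100
desired_unroll_length = 8

schedule = [
    (1, int(initial_collect_steps / num_para_envs)),
    (offline_training_iters, 0),
    (offline_training_iters + 1, desired_unroll_length),
]

for iteration in (0, 1, 99, 100, 5000):
    print(iteration, step_schedule_value(schedule, iteration))
```

Under these assumptions, iteration 0 performs one long unroll to fill the buffer, iterations 1 through 99 train without unrolling, and iteration 100 onward use the regular online unroll length.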

Codex cleverly makes a minimal change with full backward compatibility by adding the following code:

    @property
    def unroll_length(self):
        return self._unroll_length()

    @unroll_length.setter
    def unroll_length(self, value):
        self._unroll_length = as_scheduler(value)
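The backward compatibility comes from the property pair: existing code that reads or assigns a plain integer keeps working, while a scheduler is evaluated on every read. The sketch below reproduces the pattern in isolation. `ConstantScheduler`, `as_scheduler`, `ToyConfig`, and `DoublingScheduler` here are simplified stand-ins written for this example, and they are only assumed to behave like the ALF objects of the same names.

```python
class ConstantScheduler:
    """Stand-in: a scheduler that always returns the same value."""

    def __init__(self, value):
        self._value = value

    def __call__(self):
        return self._value


def as_scheduler(value):
    """Stand-in: wrap plain values, pass callables through unchanged."""
    return value if callable(value) else ConstantScheduler(value)


class ToyConfig:
    """Hypothetical config object using the property pattern from the PR."""

    def __init__(self, unroll_length):
        # Goes through the setter, so scalars are wrapped here.
        self.unroll_length = unroll_length

    @property
    def unroll_length(self):
        # Evaluate the scheduler on every read.
        return self._unroll_length()

    @unroll_length.setter
    def unroll_length(self, value):
        self._unroll_length = as_scheduler(value)


class DoublingScheduler:
    """Hypothetical non-constant scheduler: 1, 2, 4, ... on successive reads."""

    def __init__(self):
        self._next = 1

    def __call__(self):
        value, self._next = self._next, self._next * 2
        return value


config = ToyConfig(unroll_length=8)
print(config.unroll_length)  # a plain int still reads back as an int

config.unroll_length = DoublingScheduler()
print([config.unroll_length for _ in range(3)])  # value changes per read
```

One consequence of this design is that every read advances or re-evaluates the scheduler, so callers that need a stable value within one iteration would have to read it once and reuse it.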

Haichao-Zhang · 2026-03-30T21:15:14Z

        self.unroll_with_grad = unroll_with_grad
        self.use_root_inputs_for_after_train_iter = use_root_inputs_for_after_train_iter
        self.async_unroll = async_unroll
+        if not isinstance(self._unroll_length, ConstantScheduler):


ConstantScheduler --> should this check against a base class instead, e.g. Scheduler?


QuantuMope

Hey Haichao, responded to your comment. Let me know if I misunderstood it.

QuantuMope · 2026-04-01T18:52:32Z

We need to check against ConstantScheduler here because a scalar input will be converted to one before this check due to the setter function on line 479.

Any non-constant scheduler should then raise an error if we're doing on-policy or async unroll.
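The reasoning in this reply can be sketched in isolation. Because the setter wraps scalars, every stored value is already some kind of scheduler, so a check against a base class would be true for all inputs and could not tell a fixed unroll length from a scheduled one. The classes and `validate_unroll_length` below are hypothetical stand-ins (including the assumption that ConstantScheduler derives from a common Scheduler base, and the choice of ValueError), not the code in the PR.

```python
class Scheduler:
    """Stand-in base class for all schedulers."""

    def __call__(self):
        raise NotImplementedError


class ConstantScheduler(Scheduler):
    """Stand-in: what a scalar unroll_length is wrapped into."""

    def __init__(self, value):
        self._value = value

    def __call__(self):
        return self._value


class ToyStepScheduler(Scheduler):
    """Hypothetical non-constant scheduler used only for this example."""

    def __init__(self, values):
        self._values = list(values)
        self._reads = 0

    def __call__(self):
        index = min(self._reads, len(self._values) - 1)
        self._reads += 1
        return self._values[index]


def validate_unroll_length(scheduler, on_policy, async_unroll):
    """Reject scheduled unroll lengths in modes that cannot support them."""
    if not isinstance(scheduler, ConstantScheduler):
        if on_policy or async_unroll:
            raise ValueError(
                "A scheduled unroll_length requires off-policy training "
                "with async_unroll=False")


constant = ConstantScheduler(8)
scheduled = ToyStepScheduler([500, 0, 8])

# Both are Schedulers, so a base-class check cannot distinguish them.
print(isinstance(constant, Scheduler), isinstance(scheduled, Scheduler))

validate_unroll_length(constant, on_policy=True, async_unroll=False)    # accepted
validate_unroll_length(scheduled, on_policy=False, async_unroll=False)  # accepted
try:
    validate_unroll_length(scheduled, on_policy=False, async_unroll=True)
except ValueError as error:
    print("rejected:", error)
```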

QuantuMope · 2026-05-27T18:30:43Z

Gentle reminder for review. Thanks

Make unroll_length schedulable

de5d3c1

QuantuMope requested review from Haichao-Zhang and emailweixu March 30, 2026 20:51

Haichao-Zhang reviewed Mar 30, 2026

QuantuMope commented Apr 1, 2026

Haichao-Zhang approved these changes May 27, 2026

QuantuMope merged commit 8954c95 into pytorch May 27, 2026
2 checks passed

QuantuMope deleted the PR/andrew/schedulable-unroll-length branch May 27, 2026 20:48
