Actions: verl-project/verl
4,322 workflow runs
loss_mask.shape[-1] as in seq-mean-token-sum-norm
e2e_one_step_off_policy #5944: Pull request #5417 opened by tongyx361
actor_rollout_ref.model.enable_gradient_checkpointing? Is this a deliberate design choice?
e2e_one_step_off_policy #5940: Pull request #4263 synchronize by khazic
actor_rollout_ref.model.enable_gradient_checkpointing? Is this a deliberate design choice?
e2e_one_step_off_policy #5935: Pull request #4263 synchronize by khazic
actor_rollout_ref.model.enable_gradient_checkpointing? Is this a deliberate design choice?
e2e_one_step_off_policy #5925: Pull request #4263 synchronize by khazic
actor_rollout_ref.model.enable_gradient_checkpointing? Is this a deliberate design choice?
e2e_one_step_off_policy #5924: Pull request #4263 synchronize by khazic