[fsdp,vllm,trainer,algo] feat: On-Policy Distillation #5911
e2e_one_step_off_policy.yml
on: pull_request
setup: 8s
e2e_one_step_off_policy_fsdp2: 3m 9s
e2e_one_step_off_policy_megatron: 2m 58s
cleanup: 3s