[fsdp,vllm,trainer,algo] feat: On-Policy Distillation #4569
Annotations
3 errors
Running GSM8K E2E training tests on 8 L20 GPUs with rmpad using function rm (GRPO)
The operation was canceled.