[Megatron] feat: Support routing replay on NPU with performance and compatibility enhancements #604
Annotations
3 errors
Running GSM8K E2E training tests on 8 L20 GPUs with rmpad using function rm with validation and saving (FSDP2)
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
Running GSM8K E2E training tests on 8 L20 GPUs with rmpad using function rm with validation and saving (FSDP2)
Process completed with exit code 1.
Running GSM8K E2E training tests on 8 L20 GPUs with rmpad using function rm with validation and saving (FSDP2)
Error: failed to run script step: Error: command terminated with non-zero exit code: command terminated with exit code 1