e2e_one_step_off_policy

4,322 workflow runs

[megatron] feat: enhance model offloading and loading for frozen parameters e2e_one_step_off_policy #5946: Pull request #5412 synchronize by RobotGF

6m 7s RobotGF:fix_lora_offload

[ckpt] feat: implement large tensor slicing in vllm rollout and CheckpointEngine for weight updating e2e_one_step_off_policy #5945: Pull request #5378 synchronize by jianjunzhong

5m 25s jianjunzhong:feat/chunked_weight_update

[algo] fix: seq mean and default scale factor loss_mask.shape[-1] as in seq-mean-token-sum-norm e2e_one_step_off_policy #5944: Pull request #5417 opened by tongyx361

26m 19s tongyx361:tyx/fix/seq-mean-in-seq-mean-token-sum-norm

[tool] feature: scheduling analysis based on profiling data for torch profiler e2e_one_step_off_policy #5943: Pull request #5367 synchronize by Rhetee

Action required Rhetee:main

[perf, trtllm] feat: Add Nsight support for rollout server mode (trtllm) e2e_one_step_off_policy #5942: Pull request #5391 synchronize by davidmlw

13m 34s joyang-nv:liweim/nsys

[Megatron] feat: Support routing replay on NPU with performance and compatibility enhancements e2e_one_step_off_policy #5941: Pull request #5298 synchronize by 755651978

14m 37s 755651978:main-0212

why are there multiple settings for actor_rollout_ref.model.enable_gradient_checkpointing? Is this a deliberate design choice? e2e_one_step_off_policy #5940: Pull request #4263 synchronize by khazic

Action required khazic:main

[algo] feat: add DPPO with binary TV or binary KL implementation (#5397) e2e_one_step_off_policy #5939: Commit 182383b pushed by tongyx361

1h 15m 45s main

[Megatron] feat: Support routing replay on NPU with performance and compatibility enhancements e2e_one_step_off_policy #5938: Pull request #5298 synchronize by 755651978

39m 14s 755651978:main-0212

[Megatron] feat: Support routing replay on NPU with performance and compatibility enhancements e2e_one_step_off_policy #5936: Pull request #5298 synchronize by 755651978

14m 46s 755651978:main-0212

why are there multiple settings for actor_rollout_ref.model.enable_gradient_checkpointing? Is this a deliberate design choice? e2e_one_step_off_policy #5935: Pull request #4263 synchronize by khazic

Action required khazic:main

[fsdp,algo] feat: Support QAT (NVFP4) in FSDPEngine for the unified engine_workers architecture e2e_one_step_off_policy #5934: Pull request #5411 synchronize by zhangyimi

Action required zhangyimi:qat-core-v2

[fsdp,algo] feat: Support QAT (NVFP4) in FSDPEngine for the unified engine_workers architecture e2e_one_step_off_policy #5933: Pull request #5411 synchronize by zhangyimi

Action required zhangyimi:qat-core-v2

[fsdp,algo] feat: Support QAT (NVFP4) in FSDPEngine for the unified engine_workers architecture e2e_one_step_off_policy #5932: Pull request #5411 opened by zhangyimi

Action required zhangyimi:qat-core-v2

[misc,trainer,rollout] feat: add Prometheus metrics logging to experiment tracking e2e_one_step_off_policy #5930: Pull request #5291 synchronize by guillemgt

Action required guillemgt:guillem.tarrach/upstream-prometheus-metrics

[misc,trainer,rollout] feat: add Prometheus metrics logging to experiment tracking e2e_one_step_off_policy #5929: Pull request #5291 synchronize by guillemgt

Action required guillemgt:guillem.tarrach/upstream-prometheus-metrics

[trainer] feat: add support for the GDPO algorithm e2e_one_step_off_policy #5928: Pull request #5409 opened by yue-zeng-yue

Action required yue-zeng-yue:feat-gdpo

[trainer,misc] fix: fix multiple bugs in fully async trainings e2e_one_step_off_policy #5927: Pull request #5373 synchronize by guillemgt

Action required guillemgt:upstream/fix-fully-async-bugs

[trainer] feat: Add Nemo-Automodel as alternative training engine e2e_one_step_off_policy #5926: Pull request #5407 opened by HuiyingLi

59m 44s HuiyingLi:add_automodel_sft_backend

why are there multiple settings for actor_rollout_ref.model.enable_gradient_checkpointing? Is this a deliberate design choice? e2e_one_step_off_policy #5925: Pull request #4263 synchronize by khazic

Action required khazic:main

why are there multiple settings for actor_rollout_ref.model.enable_gradient_checkpointing? Is this a deliberate design choice? e2e_one_step_off_policy #5924: Pull request #4263 synchronize by khazic

Action required khazic:main

[perf, trtllm] feat: Add Nsight support for rollout server mode (trtllm) e2e_one_step_off_policy #5923: Pull request #5391 synchronize by davidmlw

14m 52s joyang-nv:liweim/nsys

[ckpt] feat: implement large tensor slicing in vllm rollout and CheckpointEngine for weight updating e2e_one_step_off_policy #5922: Pull request #5378 synchronize by jianjunzhong

14m 45s jianjunzhong:feat/chunked_weight_update

[megatron] fix: missing model offload to CPU for forward_only mode e2e_one_step_off_policy #5921: Pull request #5406 opened by xhx1022

14m 58s xhx1022:xhx/offload

[ckpt] feat: implement large tensor slicing in vllm rollout and CheckpointEngine for weight updating e2e_one_step_off_policy #5920: Pull request #5378 synchronize by jianjunzhong

Action required jianjunzhong:feat/chunked_weight_update
