[algo] fix: seq mean and default scale factor loss_mask.shape[-1] as in seq-mean-token-sum-norm
#5944
| Job | Run time |
|---|---|
| | 8s |
| | 3m 6s |
| | 3m 6s |
| | 5s |
| | 6m 25s |