WIP: SALM with NeMo Automodel integration for Nemotron Nano V3 LLM backbone #15447
Draft
pzelasko wants to merge 45 commits into main from speechlm2-with-nemo-automodel-merge
c2395e0 WIP: bringing Yifan's changes to main (pzelasko)
8d4c570 Add workaround for exp_manager issue (pzelasko)
ff54b12 Support reading indexed JSONL datasets with ShareGPT format (pzelasko)
b9e7b23 Merge remote-tracking branch 'origin/speechlm-yifan-mod-port' into sp… (pzelasko)
4a6324d Support reading indexed tarred datasets with ShareGPT format (pzelasko)
9a2f78a Refactor for compactness (pzelasko)
e222048 Fixes for real-life data (pzelasko)
c538a45 Fixes for real-life data (pzelasko)
9fc4b72 Fixes for real-life data (pzelasko)
4b4c529 Fixes for missing wids-meta.json (pzelasko)
fc3dffb Fixes for tarfile edge cases (pzelasko)
c45ea47 Fixes for real-world tar files (pzelasko)
c80ed96 move salm llm init to configure_model (pzelasko)
794d300 fix: delayed perception init (pzelasko)
0516726 Add AutomodelParallelStrategy for Automodel LLM support (pzelasko)
c6c818c Merge branch 'speechlm2-with-nemo-automodel' of https://github.com/NV… (pzelasko)
024c8d0 Replace HF Automodel with NeMo Automodel for SALM's LLM backbone (pzelasko)
c4a2a3b Update salm default config with new options (pzelasko)
162117f Init fixes (pzelasko)
43e1bb1 Fix dtype initialization (pzelasko)
20a2824 Fix mesh selection for speech encoder (pzelasko)
cd6ddf3 Fix for mismatched device_mesh axis names in gradient clipping - use … (pzelasko)
ff4beab Fix for using embed_tokens in FSDP context before running forward on … (pzelasko)
b3658b1 Definitive fix for using embed_tokens outside of llm with fsdp (pzelasko)
71b6744 this version actually works with Automodel (pzelasko)
a5d33d2 fix from_pretrained with transformers v5 (pzelasko)
1d9ed29 fix from_pretrained with transformers v5 (pzelasko)
aaf828a fix generate/eval (pzelasko)
4732230 fix to_hf (pzelasko)
f4bb443 Fixes for AutoTokenizer decoding in v5 (pzelasko)
4c21c4d Flag to run configure_model() at the end of __init__ for safetensors … (pzelasko)
e09418c preliminary: support distributed models in to_hf.py (pzelasko)
a54828c fix passing automodel kwargs (pzelasko)
cf40405 fix (pzelasko)
2b7f9d0 Enable inference with model parallelism (pzelasko)
05c69b8 Fix for lightning save_hyperparameters() call (pzelasko)
cf0b97f Fix for loading into DTensor (pzelasko)
5c84827 Accelerate loading DTensor (pzelasko)
d595b7b Accelerate loading DTensor (pzelasko)
b4ec5d2 Accelerate loading DTensor (pzelasko)
b6c8725 Fix for pe buffers not in ckpt (essentially strict=False) (pzelasko)
823c4ab Add Nemotron Nano v3 prompt formatter with <think> reasoning support (pzelasko)
b241c67 fix (pzelasko)
80ff976 Automodel LoRA support (pzelasko)
42678cf Merge branch 'main' into speechlm2-with-nemo-automodel-merge (pzelasko)
Check notice
Code scanning / CodeQL
Unused import Note
Copilot Autofix
AI 1 day ago
In general, the correct way to fix an unused import in Python is to remove the import statement if the module is never referenced in the file. This reduces visual clutter, avoids implying unnecessary dependencies, and can slightly speed up module import time.
Here, the best fix is to delete the `import json` line in `nemo/collections/common/data/lhotse/text_adapters.py` (line 14 in the provided snippet), leaving the rest of the imports unchanged. No additional methods, definitions, or replacement imports are needed, since no code in the shown region uses `json`. This change preserves all existing functionality because it only removes an unused symbol.
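To double-check a finding like this locally before deleting the import, a rough check can be sketched with the standard-library `ast` module. This is only an illustrative approximation, not how CodeQL implements its query; the helper name `find_unused_imports` and the sample source string are hypothetical. It ignores `__all__`, star imports, and names referenced only inside strings, so treat its output as a hint rather than a verdict.

```python
import ast


def find_unused_imports(source: str) -> list[str]:
    """Return names bound by import statements that are never referenced.

    Hypothetical helper for illustration; a simplified heuristic only.
    """
    tree = ast.parse(source)
    imported = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b" binds "a"; "import a.b as c" binds "c".
                imported.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                if alias.name != "*":  # star imports cannot be tracked here
                    imported.add(alias.asname or alias.name)
    # Every bare identifier in the module, including the base of "json.dumps".
    used = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    return sorted(imported - used)


if __name__ == "__main__":
    # Made-up sample: "json" is imported but never referenced.
    sample = "import json\nimport os\n\nprint(os.getcwd())\n"
    print(find_unused_imports(sample))
```

Running the helper over the contents of `text_adapters.py` and seeing `json` in the result would be consistent with the CodeQL note; an empty result would suggest `json` is still referenced somewhere outside the region shown.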