fix: build setup and JIT loading for Blackwell architecture #177
Pull request overview
This PR addresses JIT compilation and module loading failures on Blackwell (RTX 5090) architecture with CUDA 12.8.1. The fix involves two key changes: disabling pip build isolation during installation to ensure proper linking against system PyTorch/CUDA headers, and refactoring JIT module setup functions to return module objects directly rather than relying on post-compilation imports by name.
- Modified installation script to use the --no-build-isolation flag for requirements installation
- Refactored all JIT setup functions (setup_3dgrt, setup_3dgut, setup_playground, setup_mcmc, setup_gui) to return the compiled module object
- Updated corresponding loader functions in tracer and strategy files to use the returned module object directly
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| install_env.sh | Added --no-build-isolation flag to requirements installation to force usage of pre-configured host environment |
| threedgrt_tracer/setup_3dgrt.py | Modified to return the compiled tdgrt module object |
| threedgrt_tracer/tracer.py | Updated loader to assign module directly from setup_3dgrt return value |
| threedgut_tracer/setup_3dgut.py | Modified to return the compiled tdgut module object |
| threedgut_tracer/tracer.py | Updated loader to assign module directly from setup_3dgut return value |
| threedgrut_playground/setup_playground.py | Modified to return the compiled playground_lib module object |
| threedgrut_playground/tracer.py | Updated loader to assign module directly from setup_playground return value |
| threedgrut/strategy/src/setup_mcmc.py | Modified to return the compiled gaussian_mcmc module object |
| threedgrut/strategy/mcmc.py | Updated loader to assign module directly from setup_mcmc return value |
| threedgrut/gui/setup_gui.py | Modified to return the compiled gui_module module object |
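The refactor summarized in the table can be sketched roughly as follows. This is a minimal illustration rather than the PR's actual code: `fake_jit_load` is a hypothetical stand-in for `torch.utils.cpp_extension.load()` so the snippet runs without PyTorch or a compiler, and `load_3dgrt_plugin` and `trace` are illustrative names.

```python
import types


def fake_jit_load(name):
    """Hypothetical stand-in for torch.utils.cpp_extension.load().

    Builds a module object in memory and returns it without registering
    it in sys.modules, so it cannot be imported by name afterwards.
    """
    module = types.ModuleType(name)
    module.trace = lambda ray_count: f"{name} traced {ray_count} rays"
    return module


def setup_3dgrt():
    # After the refactor: hand the compiled module back to the caller.
    return fake_jit_load("lib3dgrt_cc")


_plugin = None  # module-level cache so the setup step runs only once


def load_3dgrt_plugin():
    global _plugin
    if _plugin is None:
        # Before: setup_3dgrt() followed by `import lib3dgrt_cc as tdgrt`.
        # After: use the returned object directly.
        _plugin = setup_3dgrt()
    return _plugin


tdgrt = load_3dgrt_plugin()
print(tdgrt.trace(4))
```

The loader never depends on the extension being importable by name, which is the property the PR relies on.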
        gui_module = jit.load(
            name="lib3dgrut_gui_cc",
            sources=source_paths,
            extra_cflags=cflags,
            extra_cuda_cflags=cuda_cflags,
            extra_include_paths=include_paths,
        )
        return gui_module
Copilot
AI
Dec 26, 2025
The setup_gui() function now returns the gui_module, but the caller in ps_extension.py (line 31) has not been updated to use this return value. After calling setup_gui(), it still attempts to import lib3dgrut_gui_cc by name (line 32), which will fail with the same ModuleNotFoundError this PR is addressing. The ps_extension.py file should be updated to match the pattern used in other files: tdgui = setup_gui() instead of setup_gui() followed by import lib3dgrut_gui_cc as tdgui.
I implemented these changes manually. They work for the base config and the mcmc config, but a similar import error still occurs when using selective_adam as the optimizer (lib_optimizers_cc).
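To check whether the remaining selective_adam failure is the same by-name import problem, a small diagnostic along these lines might help. It uses only the standard library; `describe_module_visibility` is a hypothetical helper name, and the snippet assumes it is run in the same process after the optimizer extension has been built.

```python
import importlib.util
import sys


def describe_module_visibility(name):
    """Report whether `name` is already loaded and whether `import name` could find it."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        # find_spec raises ValueError if a loaded module has no usable __spec__.
        spec = None
    return {
        "in_sys_modules": name in sys.modules,
        "findable_on_path": spec is not None,
    }


print(describe_module_visibility("lib_optimizers_cc"))
```

If both values are false after a successful build, the extension is only reachable through the object returned by the load call, and the optimizer loader would likely need the same return-the-module treatment.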
Title: fix: Blackwell (RTX 5090) support for environment setup and JIT loading
Summary
This PR provides a fix for running on Blackwell architecture (RTX 5090) with CUDA 12.8.1.
It builds upon #166 (which addressed JIT import failures for lib3dgrt and lib3dgut) by extending the fix to lib_mcmc_cc and by resolving a build isolation blocker in the installation script that prevents environment setup on newer systems.
Related Issues
Environment
./install_env.sh
Root Cause & Proposed Fixes
1. Build Isolation Failures
Issue: During ./install_env.sh, pip creates an isolated build environment by default. On Blackwell systems with CUDA 12.8.1, these isolated environments often fail to correctly link against the specific system PyTorch/CUDA headers.
Fix: Added --no-build-isolation to the pip install commands in install_env.sh to force usage of the pre-configured host environment.
2. JIT Module Loading Failure
Issue: As identified in #165, while torch.utils.cpp_extension.load() successfully compiles the JIT extensions, a subsequent import by name (e.g., import lib3dgrt_cc) fails with ModuleNotFoundError.
Fix:
- Updated setup_3dgrt(), setup_3dgut(), and setup_mcmc() to return the module object directly from jit.load().
- Updated tracer.py and relevant files to assign the plugin directly from that returned object.
Steps to Reproduce (Before Fix)
1. Run ./install_env.sh on a Blackwell system (observe build failures).
2. Run python train.py --config-name apps/nerf_synthetic_3dgrt.yaml ...
3. Observe ModuleNotFoundError: No module named 'lib3dgrt_cc' despite successful compilation logs.
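The JIT loading failure described under Root Cause can be imitated in pure Python. The sketch below assumes the failure arises because the extension is loaded from an explicit file path without being registered under its name; the PR does not state this mechanism, so treat it as one plausible explanation. `load_from_path` and `demo_plugin_cc` are hypothetical names, and a plain `.py` file stands in for the compiled extension.

```python
import importlib
import importlib.util
import os
import tempfile


def load_from_path(name, path):
    """Load a module from an explicit file path and return the module object."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the module body; does not register it in sys.modules
    return module


with tempfile.TemporaryDirectory() as build_dir:
    # Hypothetical stand-in for a compiled extension in a JIT build directory.
    path = os.path.join(build_dir, "demo_plugin_cc.py")
    with open(path, "w") as f:
        f.write("def trace():\n    return 'ok'\n")

    # The returned object is usable immediately.
    plugin = load_from_path("demo_plugin_cc", path)
    returned_object_works = plugin.trace() == "ok"

    # Importing by name fails: the build directory is not on sys.path
    # and the module was never added to sys.modules.
    try:
        importlib.import_module("demo_plugin_cc")
        import_by_name_works = True
    except ModuleNotFoundError:
        import_by_name_works = False

print(returned_object_works, import_by_name_works)
```

Under this assumption, returning the module from the setup function sidesteps the problem regardless of how the loader registers the extension.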