
Nvidia DGX Spark (GB10 Blackwell): Docker deployment fails on startup #4350

@Essence9999

Description

🔎 Search before asking

  • I have searched the MinerU Readme and found no similar bug report.
  • I have searched the MinerU Issues and found no similar bug report.
  • I have searched the MinerU Discussions and found no similar bug report.


Description of the bug

2026-01-12 01:08:33.619710870 [W:onnxruntime:Default, device_discovery.cc:164 DiscoverDevicesForPlatform] GPU device discovery failed: device_discovery.cc:89 ReadFileContents Failed to open file: "/sys/class/drm/card0/device/vendor"
INFO: Started server process [1]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
2026-01-12 01:08:40.227 | WARNING | mineru.utils.pdf_page_id:get_end_page_id:8 - end_page_id is out of range, use images length
Start MinerU FastAPI Service: http://0.0.0.0:8000
API documentation: http://0.0.0.0:8000/docs
INFO 01-12 01:08:42 [__init__.py:216] Automatically detected platform cuda.
/usr/local/lib/python3.12/dist-packages/torch/cuda/__init__.py:283: UserWarning:
Found GPU0 NVIDIA GB10 which is of cuda capability 12.1.
Minimum and Maximum cuda capability supported by this version of PyTorch is
(8.0) - (12.0)

warnings.warn(
2026-01-12 01:08:45.383 | INFO | mineru.backend.vlm.utils:enable_custom_logits_processors:46 - compute_capability: 12.1 >= 8.0 and vllm version: 0.11.0 >= 0.10.1, enable custom_logits_processors
INFO 01-12 01:08:50 [model.py:547] Resolved architecture: Qwen2VLForConditionalGeneration
torch_dtype is deprecated! Use dtype instead!
INFO 01-12 01:08:50 [model.py:1510] Using max model len 16384
INFO 01-12 01:08:50 [scheduler.py:205] Chunked prefill is enabled with max_num_batched_tokens=5120.
WARNING 01-12 01:08:50 [__init__.py:3036] We must use the spawn multiprocessing start method. Overriding VLLM_WORKER_MULTIPROC_METHOD to 'spawn'. See https://docs.vllm.ai/en/latest/usage/troubleshooting.html#python-multiprocessing for more information. Reasons: CUDA is initialized
2026-01-12 01:08:51.185492900 [W:onnxruntime:Default, device_discovery.cc:164 DiscoverDevicesForPlatform] GPU device discovery failed: device_discovery.cc:89 ReadFileContents Failed to open file: "/sys/class/drm/card0/device/vendor"
INFO 01-12 01:08:52 [__init__.py:216] Automatically detected platform cuda.
(EngineCore_DP0 pid=60) INFO 01-12 01:08:53 [core.py:644] Waiting for init message from front-end.
(EngineCore_DP0 pid=60) INFO 01-12 01:08:53 [core.py:77] Initializing a V1 LLM engine (v0.11.0) with config: model='/root/.cache/modelscope/hub/models/OpenDataLab/MinerU2___5-2509-1___2B', speculative_config=None, tokenizer='/root/.cache/modelscope/hub/models/OpenDataLab/MinerU2___5-2509-1___2B', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, tokenizer_revision=None, trust_remote_code=False, dtype=torch.bfloat16, max_seq_len=16384, download_dir=None, load_format=auto, tensor_parallel_size=1, pipeline_parallel_size=1, data_parallel_size=1, disable_custom_all_reduce=False, quantization=None, enforce_eager=False, kv_cache_dtype=auto, device_config=cuda, structured_outputs_config=StructuredOutputsConfig(backend='auto', disable_fallback=False, disable_any_whitespace=False, disable_additional_properties=False, reasoning_parser=''), observability_config=ObservabilityConfig(show_hidden_metrics_for_version=None, otlp_traces_endpoint=None, collect_detailed_traces=None), seed=0, served_model_name=/root/.cache/modelscope/hub/models/OpenDataLab/MinerU2___5-2509-1___2B, enable_prefix_caching=True, chunked_prefill_enabled=True, pooler_config=None, compilation_config={"level":3,"debug_dump_path":"","cache_dir":"","backend":"","custom_ops":[],"splitting_ops":["vllm.unified_attention","vllm.unified_attention_with_output","vllm.mamba_mixer2","vllm.mamba_mixer","vllm.short_conv","vllm.linear_attention","vllm.plamo2_mamba_mixer","vllm.gdn_attention","vllm.sparse_attn_indexer"],"use_inductor":true,"compile_sizes":[],"inductor_compile_config":{"enable_auto_functionalized_v2":false},"inductor_passes":{},"cudagraph_mode":[2,1],"use_cudagraph":true,"cudagraph_num_of_warmups":1,"cudagraph_capture_sizes":[256,248,240,232,224,216,208,200,192,184,176,168,160,152,144,136,128,120,112,104,96,88,80,72,64,56,48,40,32,24,16,8,4,2,1],"cudagraph_copy_inputs":false,"full_cuda_graph":false,"use_inductor_graph_partition":false,"pass_config":{},"max_capture_size":256,"local_cache_dir":null}
(EngineCore_DP0 pid=60) /usr/local/lib/python3.12/dist-packages/torch/cuda/__init__.py:283: UserWarning:
(EngineCore_DP0 pid=60) Found GPU0 NVIDIA GB10 which is of cuda capability 12.1.
(EngineCore_DP0 pid=60) Minimum and Maximum cuda capability supported by this version of PyTorch is
(EngineCore_DP0 pid=60) (8.0) - (12.0)
(EngineCore_DP0 pid=60)
(EngineCore_DP0 pid=60) warnings.warn(
(EngineCore_DP0 pid=60) W0112 01:08:54.346000 60 torch/utils/cpp_extension.py:2425] TORCH_CUDA_ARCH_LIST is not set, all archs for visible cards are included for compilation.
(EngineCore_DP0 pid=60) W0112 01:08:54.346000 60 torch/utils/cpp_extension.py:2425] If this is not desired, please set os.environ['TORCH_CUDA_ARCH_LIST'] to specific architectures.
[Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
[Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
[Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
[Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
[Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
[Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
(EngineCore_DP0 pid=60) INFO 01-12 01:08:55 [parallel_state.py:1208] rank 0 in world size 1 is assigned as DP rank 0, PP rank 0, TP rank 0, EP rank 0
(EngineCore_DP0 pid=60) INFO 01-12 01:08:55 [topk_topp_sampler.py:55] Using FlashInfer for top-p & top-k sampling.
(EngineCore_DP0 pid=60) INFO 01-12 01:08:56 [gpu_model_runner.py:2602] Starting to load model /root/.cache/modelscope/hub/models/OpenDataLab/MinerU2___5-2509-1___2B...
(EngineCore_DP0 pid=60) INFO 01-12 01:08:56 [gpu_model_runner.py:2634] Loading model from scratch...
(EngineCore_DP0 pid=60) INFO 01-12 01:08:56 [cuda.py:366] Using Flash Attention backend on V1 engine.
Loading safetensors checkpoint shards: 0% Completed | 0/1 [00:00<?, ?it/s]
Loading safetensors checkpoint shards: 100% Completed | 1/1 [00:14<00:00, 14.42s/it]
Loading safetensors checkpoint shards: 100% Completed | 1/1 [00:14<00:00, 14.42s/it]
(EngineCore_DP0 pid=60)
(EngineCore_DP0 pid=60) INFO 01-12 01:09:11 [default_loader.py:267] Loading weights took 14.54 seconds
(EngineCore_DP0 pid=60) INFO 01-12 01:09:11 [gpu_model_runner.py:2653] Model loading took 2.1637 GiB and 14.959749 seconds
(EngineCore_DP0 pid=60) INFO 01-12 01:09:11 [gpu_model_runner.py:3344] Encoder cache will be initialized with a budget of 14175 tokens, and profiled with 1 video items of the maximum feature size.
(EngineCore_DP0 pid=60) Process EngineCore_DP0:
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] EngineCore failed to start.
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] Traceback (most recent call last):
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/triton/backends/nvidia/compiler.py", line 424, in make_cubin
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] subprocess.run(ptxas_cmd, check=True, close_fds=False, stderr=flog)
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/lib/python3.12/subprocess.py", line 571, in run
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] raise CalledProcessError(retcode, process.args,
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] subprocess.CalledProcessError: Command '['/usr/local/lib/python3.12/dist-packages/triton/backends/nvidia/bin/ptxas', '-lineinfo', '-v', '--gpu-name=sm_121a', '/tmp/tmpnt_7bh0a.ptx', '-o', '/tmp/tmpnt_7bh0a.ptx.o']' returned non-zero exit status 255.
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708]
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] During handling of the above exception, another exception occurred:
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708]
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] Traceback (most recent call last):
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 699, in run_engine_core
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] engine_core = EngineCoreProc(*args, **kwargs)
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 498, in __init__
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] super().__init__(vllm_config, executor_class, log_stats,
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 92, in __init__
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] self._initialize_kv_caches(vllm_config)
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 190, in _initialize_kv_caches
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] self.model_executor.determine_available_memory())
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/abstract.py", line 85, in determine_available_memory
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] return self.collective_rpc("determine_available_memory")
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/vllm/executor/uniproc_executor.py", line 83, in collective_rpc
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] return [run_method(self.driver_worker, method, args, kwargs)]
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/vllm/utils/__init__.py", line 3122, in run_method
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] return func(*args, **kwargs)
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 120, in decorate_context
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] return func(*args, **kwargs)
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_worker.py", line 263, in determine_available_memory
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] self.model_runner.profile_run()
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 3361, in profile_run
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] self.model.get_multimodal_embeddings(
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen2_vl.py", line 1462, in get_multimodal_embeddings
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] video_embeddings = self._process_video_input(video_input)
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen2_vl.py", line 1412, in _process_video_input
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] video_embeds = self.visual(pixel_values_videos,
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] return self._call_impl(*args, **kwargs)
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1784, in _call_impl
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] return forward_call(*args, **kwargs)
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen2_vl.py", line 739, in forward
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] x = blk(
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] return self._call_impl(*args, **kwargs)
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1784, in _call_impl
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] return forward_call(*args, **kwargs)
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen2_vl.py", line 489, in forward
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] x = x + self.attn(
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] return self._call_impl(*args, **kwargs)
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1784, in _call_impl
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] return forward_call(*args, **kwargs)
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen2_vl.py", line 384, in forward
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] qk_rotated = apply_rotary_pos_emb_vision(qk_concat, rotary_pos_emb)
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen2_vl.py", line 283, in apply_rotary_pos_emb_vision
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] output = apply_rotary_emb(t_, cos, sin).type_as(t)
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/vllm/vllm_flash_attn/layers/rotary.py", line 124, in apply_rotary_emb
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] return ApplyRotaryEmb.apply(
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/torch/autograd/function.py", line 576, in apply
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] return super().apply(*args, **kwargs) # type: ignore[misc]
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/vllm/vllm_flash_attn/layers/rotary.py", line 50, in forward
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] out = apply_rotary(
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/vllm/vllm_flash_attn/ops/triton/rotary.py", line 203, in apply_rotary
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] rotary_kernel[grid](
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/triton/runtime/jit.py", line 390, in <lambda>
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] return lambda *args, **kwargs: self.run(grid=grid, warmup=False, *args, **kwargs)
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/triton/runtime/jit.py", line 594, in run
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] kernel = self.compile(src, target=target, options=options.__dict__)
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/triton/compiler/compiler.py", line 359, in compile
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] next_module = compile_ir(module, metadata)
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/triton/backends/nvidia/compiler.py", line 461, in <lambda>
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] stages["cubin"] = lambda src, metadata: self.make_cubin(src, metadata, options, self.target.arch)
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/triton/backends/nvidia/compiler.py", line 442, in make_cubin
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] raise PTXASError(f"{error}\n"
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] triton.runtime.errors.PTXASError: PTXAS error: Internal Triton PTX codegen error
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ptxas stderr:
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ptxas fatal : Value 'sm_121a' is not defined for option 'gpu-name'
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708]
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] Repro command: /usr/local/lib/python3.12/dist-packages/triton/backends/nvidia/bin/ptxas -lineinfo -v --gpu-name=sm_121a /tmp/tmpnt_7bh0a.ptx -o /tmp/tmpnt_7bh0a.ptx.o
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708]
(EngineCore_DP0 pid=60) Traceback (most recent call last):
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/triton/backends/nvidia/compiler.py", line 424, in make_cubin
(EngineCore_DP0 pid=60) subprocess.run(ptxas_cmd, check=True, close_fds=False, stderr=flog)
(EngineCore_DP0 pid=60) File "/usr/lib/python3.12/subprocess.py", line 571, in run
(EngineCore_DP0 pid=60) raise CalledProcessError(retcode, process.args,
(EngineCore_DP0 pid=60) subprocess.CalledProcessError: Command '['/usr/local/lib/python3.12/dist-packages/triton/backends/nvidia/bin/ptxas', '-lineinfo', '-v', '--gpu-name=sm_121a', '/tmp/tmpnt_7bh0a.ptx', '-o', '/tmp/tmpnt_7bh0a.ptx.o']' returned non-zero exit status 255.
(EngineCore_DP0 pid=60)
(EngineCore_DP0 pid=60) During handling of the above exception, another exception occurred:
(EngineCore_DP0 pid=60)
(EngineCore_DP0 pid=60) Traceback (most recent call last):
(EngineCore_DP0 pid=60) File "/usr/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
(EngineCore_DP0 pid=60) self.run()
(EngineCore_DP0 pid=60) File "/usr/lib/python3.12/multiprocessing/process.py", line 108, in run
(EngineCore_DP0 pid=60) self._target(*self._args, **self._kwargs)
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 712, in run_engine_core
(EngineCore_DP0 pid=60) raise e
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 699, in run_engine_core
(EngineCore_DP0 pid=60) engine_core = EngineCoreProc(*args, **kwargs)
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 498, in __init__
(EngineCore_DP0 pid=60) super().__init__(vllm_config, executor_class, log_stats,
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 92, in __init__
(EngineCore_DP0 pid=60) self._initialize_kv_caches(vllm_config)
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 190, in _initialize_kv_caches
(EngineCore_DP0 pid=60) self.model_executor.determine_available_memory())
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/abstract.py", line 85, in determine_available_memory
(EngineCore_DP0 pid=60) return self.collective_rpc("determine_available_memory")
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/vllm/executor/uniproc_executor.py", line 83, in collective_rpc
(EngineCore_DP0 pid=60) return [run_method(self.driver_worker, method, args, kwargs)]
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/vllm/utils/__init__.py", line 3122, in run_method
(EngineCore_DP0 pid=60) return func(*args, **kwargs)
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 120, in decorate_context
(EngineCore_DP0 pid=60) return func(*args, **kwargs)
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_worker.py", line 263, in determine_available_memory
(EngineCore_DP0 pid=60) self.model_runner.profile_run()
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 3361, in profile_run
(EngineCore_DP0 pid=60) self.model.get_multimodal_embeddings(
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen2_vl.py", line 1462, in get_multimodal_embeddings
(EngineCore_DP0 pid=60) video_embeddings = self._process_video_input(video_input)
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen2_vl.py", line 1412, in _process_video_input
(EngineCore_DP0 pid=60) video_embeds = self.visual(pixel_values_videos,
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
(EngineCore_DP0 pid=60) return self._call_impl(*args, **kwargs)
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1784, in _call_impl
(EngineCore_DP0 pid=60) return forward_call(*args, **kwargs)
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen2_vl.py", line 739, in forward
(EngineCore_DP0 pid=60) x = blk(
(EngineCore_DP0 pid=60) ^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
(EngineCore_DP0 pid=60) return self._call_impl(*args, **kwargs)
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1784, in _call_impl
(EngineCore_DP0 pid=60) return forward_call(*args, **kwargs)
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen2_vl.py", line 489, in forward
(EngineCore_DP0 pid=60) x = x + self.attn(
(EngineCore_DP0 pid=60) ^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
(EngineCore_DP0 pid=60) return self._call_impl(*args, **kwargs)
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1784, in _call_impl
(EngineCore_DP0 pid=60) return forward_call(*args, **kwargs)
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen2_vl.py", line 384, in forward
(EngineCore_DP0 pid=60) qk_rotated = apply_rotary_pos_emb_vision(qk_concat, rotary_pos_emb)
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen2_vl.py", line 283, in apply_rotary_pos_emb_vision
(EngineCore_DP0 pid=60) output = apply_rotary_emb(t_, cos, sin).type_as(t)
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/vllm/vllm_flash_attn/layers/rotary.py", line 124, in apply_rotary_emb
(EngineCore_DP0 pid=60) return ApplyRotaryEmb.apply(
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/torch/autograd/function.py", line 576, in apply
(EngineCore_DP0 pid=60) return super().apply(*args, **kwargs) # type: ignore[misc]
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/vllm/vllm_flash_attn/layers/rotary.py", line 50, in forward
(EngineCore_DP0 pid=60) out = apply_rotary(
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/vllm/vllm_flash_attn/ops/triton/rotary.py", line 203, in apply_rotary
(EngineCore_DP0 pid=60) rotary_kernel[grid](
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/triton/runtime/jit.py", line 390, in <lambda>
(EngineCore_DP0 pid=60) return lambda *args, **kwargs: self.run(grid=grid, warmup=False, *args, **kwargs)
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/triton/runtime/jit.py", line 594, in run
(EngineCore_DP0 pid=60) kernel = self.compile(src, target=target, options=options.__dict__)
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/triton/compiler/compiler.py", line 359, in compile
(EngineCore_DP0 pid=60) next_module = compile_ir(module, metadata)
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/triton/backends/nvidia/compiler.py", line 461, in <lambda>
(EngineCore_DP0 pid=60) stages["cubin"] = lambda src, metadata: self.make_cubin(src, metadata, options, self.target.arch)
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/triton/backends/nvidia/compiler.py", line 442, in make_cubin
(EngineCore_DP0 pid=60) raise PTXASError(f"{error}\n"
(EngineCore_DP0 pid=60) triton.runtime.errors.PTXASError: PTXAS error: Internal Triton PTX codegen error
(EngineCore_DP0 pid=60) ptxas stderr:
(EngineCore_DP0 pid=60) ptxas fatal : Value 'sm_121a' is not defined for option 'gpu-name'
(EngineCore_DP0 pid=60)
(EngineCore_DP0 pid=60) Repro command: /usr/local/lib/python3.12/dist-packages/triton/backends/nvidia/bin/ptxas -lineinfo -v --gpu-name=sm_121a /tmp/tmpnt_7bh0a.ptx -o /tmp/tmpnt_7bh0a.ptx.o
(EngineCore_DP0 pid=60)
[rank0]:[W112 01:09:14.407408171 ProcessGroupNCCL.cpp:1538] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
2026-01-12 01:09:15.282 | ERROR | mineru.cli.fast_api:parse_pdf:332 - Engine core initialization failed. See root cause above. Failed core proc(s): {}
Traceback (most recent call last):

File "/usr/local/bin/mineru-api", line 7, in <module>
sys.exit(main())
│ │ └
│ └
└ <module 'sys' (built-in)>
File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 1462, in __call__
return self.main(*args, **kwargs)
│ │ │ └ {}
│ │ └ ()
│ └ <function Command.main at 0xf471780d4180>

File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 1383, in main
rv = self.invoke(ctx)
│ │ └ <click.core.Context object at 0xf47178d15460>
│ └ <function Command.invoke at 0xf471780cfe20>

File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 1246, in invoke
return ctx.invoke(self.callback, **ctx.params)
│ │ │ │ │ └ {'host': '0.0.0.0', 'port': 8000, 'reload': False}
│ │ │ │ └ <click.core.Context object at 0xf47178d15460>
│ │ │ └ <function main at 0xf46fa1f3dda0>
│ │ └
│ └ <function Context.invoke at 0xf471780cf060>
└ <click.core.Context object at 0xf47178d15460>
File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 814, in invoke
return callback(*args, **kwargs)
│ │ └ {'host': '0.0.0.0', 'port': 8000, 'reload': False}
│ └ ()
└ <function main at 0xf46fa1f3dda0>
File "/usr/local/lib/python3.12/dist-packages/click/decorators.py", line 34, in new_func
return f(get_current_context(), *args, **kwargs)
│ │ │ └ {'host': '0.0.0.0', 'port': 8000, 'reload': False}
│ │ └ ()
│ └ <function get_current_context at 0xf471780ad760>
└ <function main at 0xf46fa1f3df80>
File "/usr/local/lib/python3.12/dist-packages/mineru/cli/fast_api.py", line 362, in main
uvicorn.run(
│ └ <function run at 0xf471780e3380>
└ <module 'uvicorn' from '/usr/local/lib/python3.12/dist-packages/uvicorn/__init__.py'>
File "/usr/local/lib/python3.12/dist-packages/uvicorn/main.py", line 593, in run
server.run()
│ └ <function Server.run at 0xf47177f1ef20>
└ <uvicorn.server.Server object at 0xf46fa1ecdb50>
File "/usr/local/lib/python3.12/dist-packages/uvicorn/server.py", line 67, in run
return asyncio_run(self.serve(sockets=sockets), loop_factory=self.config.get_loop_factory())
│ │ │ │ │ │ └ <function Config.get_loop_factory at 0xf471780e2f20>
│ │ │ │ │ └ <uvicorn.config.Config object at 0xf470881d28d0>
│ │ │ │ └ <uvicorn.server.Server object at 0xf46fa1ecdb50>
│ │ │ └ None
│ │ └ <function Server.serve at 0xf47177f1efc0>
│ └ <uvicorn.server.Server object at 0xf46fa1ecdb50>
└ <function run at 0xf47178e9c680>
File "/usr/lib/python3.12/asyncio/runners.py", line 195, in run
return runner.run(main)
│ │ └ <coroutine object Server.serve at 0xf46fa1f35ee0>
│ └ <function Runner.run at 0xf4717830ede0>
└ <asyncio.runners.Runner object at 0xf47088157800>
File "/usr/lib/python3.12/asyncio/runners.py", line 118, in run
return self._loop.run_until_complete(task)
│ │ │ └ <Task pending name='Task-1' coro=<Server.serve() running at /usr/local/lib/python3.12/dist-packages/uvicorn/server.py:71> wai...
│ │ └ <cyfunction Loop.run_until_complete at 0xf46fa1f4fc60>
│ └ <uvloop.Loop running=True closed=False debug=False>
└ <asyncio.runners.Runner object at 0xf47088157800>
File "/usr/local/lib/python3.12/dist-packages/uvicorn/protocols/http/httptools_impl.py", line 409, in run_asgi
result = await app( # type: ignore[func-returns-value]
└ <uvicorn.middleware.proxy_headers.ProxyHeadersMiddleware object at 0xf46fa2093860>
File "/usr/local/lib/python3.12/dist-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
return await self.app(scope, receive, send)
│ │ │ │ └ <bound method RequestResponseCycle.send of <uvicorn.protocols.http.httptools_impl.RequestResponseCycle object at 0xf46fa1bf62...
│ │ │ └ <bound method RequestResponseCycle.receive of <uvicorn.protocols.http.httptools_impl.RequestResponseCycle object at 0xf46fa1b...
│ │ └ {'type': 'http', 'asgi': {'version': '3.0', 'spec_version': '2.3'}, 'http_version': '1.1', 'server': ('172.18.0.2', 8000), 'c...
│ └ <fastapi.applications.FastAPI object at 0xf470880f7590>
└ <uvicorn.middleware.proxy_headers.ProxyHeadersMiddleware object at 0xf46fa2093860>
File "/usr/local/lib/python3.12/dist-packages/fastapi/applications.py", line 1133, in __call__
await super().__call__(scope, receive, send)
│ │ └ <bound method RequestResponseCycle.send of <uvicorn.protocols.http.httptools_impl.RequestResponseCycle object at 0xf46fa1bf62...
│ └ <bound method RequestResponseCycle.receive of <uvicorn.protocols.http.httptools_impl.RequestResponseCycle object at 0xf46fa1b...
└ {'type': 'http', 'asgi': {'version': '3.0', 'spec_version': '2.3'}, 'http_version': '1.1', 'server': ('172.18.0.2', 8000), 'c...
File "/usr/local/lib/python3.12/dist-packages/starlette/applications.py", line 113, in __call__
await self.middleware_stack(scope, receive, send)
│ │ │ │ └ <bound method RequestResponseCycle.send of <uvicorn.protocols.http.httptools_impl.RequestResponseCycle object at 0xf46fa1bf62...
│ │ │ └ <bound method RequestResponseCycle.receive of <uvicorn.protocols.http.httptools_impl.RequestResponseCycle object at 0xf46fa1b...
│ │ └ {'type': 'http', 'asgi': {'version': '3.0', 'spec_version': '2.3'}, 'http_version': '1.1', 'server': ('172.18.0.2', 8000), 'c...
│ └ <starlette.middleware.errors.ServerErrorMiddleware object at 0xf46fa1bf5c40>
└ <fastapi.applications.FastAPI object at 0xf470880f7590>
File "/usr/local/lib/python3.12/dist-packages/starlette/middleware/errors.py", line 164, in __call__
await self.app(scope, receive, _send)
│ │ │ │ └ <function ServerErrorMiddleware.__call__.<locals>._send at 0xf46fa1c6b9c0>
│ │ │ └ <bound method RequestResponseCycle.receive of <uvicorn.protocols.http.httptools_impl.RequestResponseCycle object at 0xf46fa1b...
│ │ └ {'type': 'http', 'asgi': {'version': '3.0', 'spec_version': '2.3'}, 'http_version': '1.1', 'server': ('172.18.0.2', 8000), 'c...
│ └ <starlette.middleware.gzip.GZipMiddleware object at 0xf46fa1bf4b00>
└ <starlette.middleware.errors.ServerErrorMiddleware object at 0xf46fa1bf5c40>
File "/usr/local/lib/python3.12/dist-packages/starlette/middleware/gzip.py", line 29, in __call__
await responder(scope, receive, send)
│ │ │ └ <function ServerErrorMiddleware.__call__.<locals>._send at 0xf46fa1c6b9c0>
│ │ └ <bound method RequestResponseCycle.receive of <uvicorn.protocols.http.httptools_impl.RequestResponseCycle object at 0xf46fa1b...
│ └ {'type': 'http', 'asgi': {'version': '3.0', 'spec_version': '2.3'}, 'http_version': '1.1', 'server': ('172.18.0.2', 8000), 'c...
└ <starlette.middleware.gzip.GZipResponder object at 0xf46fa1bf6270>
File "/usr/local/lib/python3.12/dist-packages/starlette/middleware/gzip.py", line 130, in __call__
await super().__call__(scope, receive, send)
│ │ └ <function ServerErrorMiddleware.__call__.<locals>._send at 0xf46fa1c6b9c0>
│ └ <bound method RequestResponseCycle.receive of <uvicorn.protocols.http.httptools_impl.RequestResponseCycle object at 0xf46fa1b...
└ {'type': 'http', 'asgi': {'version': '3.0', 'spec_version': '2.3'}, 'http_version': '1.1', 'server': ('172.18.0.2', 8000), 'c...
File "/usr/local/lib/python3.12/dist-packages/starlette/middleware/gzip.py", line 46, in __call__
await self.app(scope, receive, self.send_with_compression)
│ │ │ │ │ └ <function IdentityResponder.send_with_compression at 0xf47176f5a0c0>
│ │ │ │ └ <starlette.middleware.gzip.GZipResponder object at 0xf46fa1bf6270>
INFO: 192.168.3.56:54656 - "POST /file_parse HTTP/1.1" 500 Internal Server Error
│ │ │ └ <bound method RequestResponseCycle.receive of <uvicorn.protocols.http.httptools_impl.RequestResponseCycle object at 0xf46fa1b...
│ │ └ {'type': 'http', 'asgi': {'version': '3.0', 'spec_version': '2.3'}, 'http_version': '1.1', 'server': ('172.18.0.2', 8000), 'c...
│ └ <starlette.middleware.exceptions.ExceptionMiddleware object at 0xf46fa1bf5c10>
└ <starlette.middleware.gzip.GZipResponder object at 0xf46fa1bf6270>
File "/usr/local/lib/python3.12/dist-packages/starlette/middleware/exceptions.py", line 63, in __call__
await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
│ │ │ │ │ │ └ <bound method IdentityResponder.send_with_compression of <starlette.middleware.gzip.GZipResponder object at 0xf46fa1bf6270>>
│ │ │ │ │ └ <bound method RequestResponseCycle.receive of <uvicorn.protocols.http.httptools_impl.RequestResponseCycle object at 0xf46fa1b...
│ │ │ │ └ {'type': 'http', 'asgi': {'version': '3.0', 'spec_version': '2.3'}, 'http_version': '1.1', 'server': ('172.18.0.2', 8000), 'c...
│ │ │ └ <starlette.requests.Request object at 0xf46fa1eb83b0>
│ │ └ <fastapi.middleware.asyncexitstack.AsyncExitStackMiddleware object at 0xf46fa1bf4dd0>
│ └ <starlette.middleware.exceptions.ExceptionMiddleware object at 0xf46fa1bf5c10>
└ <function wrap_app_handling_exceptions at 0xf4717710c5e0>
File "/usr/local/lib/python3.12/dist-packages/starlette/_exception_handler.py", line 42, in wrapped_app
await app(scope, receive, sender)
│ │ │ └ <function wrap_app_handling_exceptions.<locals>.wrapped_app.<locals>.sender at 0xf46fa1c6bba0>
│ │ └ <bound method RequestResponseCycle.receive of <uvicorn.protocols.http.httptools_impl.RequestResponseCycle object at 0xf46fa1b...
│ └ {'type': 'http', 'asgi': {'version': '3.0', 'spec_version': '2.3'}, 'http_version': '1.1', 'server': ('172.18.0.2', 8000), 'c...
└ <fastapi.middleware.asyncexitstack.AsyncExitStackMiddleware object at 0xf46fa1bf4dd0>
File "/usr/local/lib/python3.12/dist-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
await self.app(scope, receive, send)
│ │ │ │ └ <function wrap_app_handling_exceptions.<locals>.wrapped_app.<locals>.sender at 0xf46fa1c6bba0>
│ │ │ └ <bound method RequestResponseCycle.receive of <uvicorn.protocols.http.httptools_impl.RequestResponseCycle object at 0xf46fa1b...
│ │ └ {'type': 'http', 'asgi': {'version': '3.0', 'spec_version': '2.3'}, 'http_version': '1.1', 'server': ('172.18.0.2', 8000), 'c...
│ └ <fastapi.routing.APIRouter object at 0xf46fa1ecd460>
└ <fastapi.middleware.asyncexitstack.AsyncExitStackMiddleware object at 0xf46fa1bf4dd0>
File "/usr/local/lib/python3.12/dist-packages/starlette/routing.py", line 716, in __call__
await self.middleware_stack(scope, receive, send)
│ │ │ │ └ <function wrap_app_handling_exceptions.<locals>.wrapped_app.<locals>.sender at 0xf46fa1c6bba0>
│ │ │ └ <bound method RequestResponseCycle.receive of <uvicorn.protocols.http.httptools_impl.RequestResponseCycle object at 0xf46fa1b...
│ │ └ {'type': 'http', 'asgi': {'version': '3.0', 'spec_version': '2.3'}, 'http_version': '1.1', 'server': ('172.18.0.2', 8000), 'c...
│ └ <bound method Router.app of <fastapi.routing.APIRouter object at 0xf46fa1ecd460>>
└ <fastapi.routing.APIRouter object at 0xf46fa1ecd460>
File "/usr/local/lib/python3.12/dist-packages/starlette/routing.py", line 736, in app
await route.handle(scope, receive, send)
│ │ │ │ └ <function wrap_app_handling_exceptions.<locals>.wrapped_app.<locals>.sender at 0xf46fa1c6bba0>
│ │ │ └ <bound method RequestResponseCycle.receive of <uvicorn.protocols.http.httptools_impl.RequestResponseCycle object at 0xf46fa1b...
│ │ └ {'type': 'http', 'asgi': {'version': '3.0', 'spec_version': '2.3'}, 'http_version': '1.1', 'server': ('172.18.0.2', 8000), 'c...
│ └ <function Route.handle at 0xf4717710da80>
└ APIRoute(path='/file_parse', name='parse_pdf', methods=['POST'])
File "/usr/local/lib/python3.12/dist-packages/starlette/routing.py", line 290, in handle
await self.app(scope, receive, send)
│ │ │ │ └ <function wrap_app_handling_exceptions.<locals>.wrapped_app.<locals>.sender at 0xf46fa1c6bba0>
│ │ │ └ <bound method RequestResponseCycle.receive of <uvicorn.protocols.http.httptools_impl.RequestResponseCycle object at 0xf46fa1b...
│ │ └ {'type': 'http', 'asgi': {'version': '3.0', 'spec_version': '2.3'}, 'http_version': '1.1', 'server': ('172.18.0.2', 8000), 'c...
│ └ <function request_response.<locals>.app at 0xf46fa1f3db20>
└ APIRoute(path='/file_parse', name='parse_pdf', methods=['POST'])
File "/usr/local/lib/python3.12/dist-packages/fastapi/routing.py", line 123, in app
await wrap_app_handling_exceptions(app, request)(scope, receive, send)
│ │ │ │ │ └ <function wrap_app_handling_exceptions.<locals>.wrapped_app.<locals>.sender at 0xf46fa1c6bba0>
│ │ │ │ └ <bound method RequestResponseCycle.receive of <uvicorn.protocols.http.httptools_impl.RequestResponseCycle object at 0xf46fa1b...
│ │ │ └ {'type': 'http', 'asgi': {'version': '3.0', 'spec_version': '2.3'}, 'http_version': '1.1', 'server': ('172.18.0.2', 8000), 'c...
│ │ └ <starlette.requests.Request object at 0xf46fa1bf65d0>
│ └ <function request_response.<locals>.app.<locals>.app at 0xf46fa1c6bc40>
└ <function wrap_app_handling_exceptions at 0xf4717710c5e0>
File "/usr/local/lib/python3.12/dist-packages/starlette/_exception_handler.py", line 42, in wrapped_app
await app(scope, receive, sender)
│ │ │ └ <function wrap_app_handling_exceptions.<locals>.wrapped_app.<locals>.sender at 0xf46fa1c6bd80>
│ │ └ <bound method RequestResponseCycle.receive of <uvicorn.protocols.http.httptools_impl.RequestResponseCycle object at 0xf46fa1b...
│ └ {'type': 'http', 'asgi': {'version': '3.0', 'spec_version': '2.3'}, 'http_version': '1.1', 'server': ('172.18.0.2', 8000), 'c...
└ <function request_response.<locals>.app.<locals>.app at 0xf46fa1c6bc40>
File "/usr/local/lib/python3.12/dist-packages/fastapi/routing.py", line 109, in app
response = await f(request)
│ └ <starlette.requests.Request object at 0xf46fa1bf65d0>
└ <function get_request_handler.<locals>.app at 0xf46fa1f3dd00>
File "/usr/local/lib/python3.12/dist-packages/fastapi/routing.py", line 387, in app
raw_response = await run_endpoint_function(
└ <function run_endpoint_function at 0xf4717710d580>
File "/usr/local/lib/python3.12/dist-packages/fastapi/routing.py", line 288, in run_endpoint_function
return await dependant.call(**values)
│ │ └ {'files': [UploadFile(filename='en规范.pdf', size=763496, headers=Headers({'content-disposition': 'form-data; name="files"; fil...
│ └ <function parse_pdf at 0xf46fa1f3d8a0>
└ Dependant(path_params=[], query_params=[], header_params=[], cookie_params=[], body_params=[ModelField(field_info=File(Pydant...

File "/usr/local/lib/python3.12/dist-packages/mineru/cli/fast_api.py", line 215, in parse_pdf
await aio_do_parse(
└ <function aio_do_parse at 0xf46fa1f3cd60>
File "/usr/local/lib/python3.12/dist-packages/mineru/cli/common.py", line 532, in aio_do_parse
await _async_process_vlm(
└ <function _async_process_vlm at 0xf46fa1f3c9a0>
File "/usr/local/lib/python3.12/dist-packages/mineru/cli/common.py", line 253, in _async_process_vlm
middle_json, infer_result = await aio_vlm_doc_analyze(
└ <function aio_doc_analyze at 0xf46fa1f3c680>
File "/usr/local/lib/python3.12/dist-packages/mineru/backend/vlm/vlm_analyze.py", line 230, in aio_doc_analyze
predictor = ModelSingleton().get_model(backend, model_path, server_url, **kwargs)
│ │ │ │ └ {'gpu_memory_utilization': 0.4}
│ │ │ └ None
│ │ └ None
│ └ 'vllm-async-engine'
└ <class 'mineru.backend.vlm.vlm_analyze.ModelSingleton'>
File "/usr/local/lib/python3.12/dist-packages/mineru/backend/vlm/vlm_analyze.py", line 125, in get_model
vllm_async_llm = AsyncLLM.from_engine_args(AsyncEngineArgs(**kwargs))
│ │ │ └ {'gpu_memory_utilization': 0.4, 'model': '/root/.cache/modelscope/hub/models/OpenDataLab/MinerU2___5-2509-1___2B', 'logits_pr...
│ │ └ <class 'vllm.engine.arg_utils.AsyncEngineArgs'>
│ └ <classmethod(<function AsyncLLM.from_engine_args at 0xf46f6cde7ce0>)>
└ <class 'vllm.v1.engine.async_llm.AsyncLLM'>
File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/async_llm.py", line 235, in from_engine_args
return cls(
└ <class 'vllm.v1.engine.async_llm.AsyncLLM'>
File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/async_llm.py", line 134, in __init__
self.engine_core = EngineCoreClient.make_async_mp_client(
│ │ └ <staticmethod(<function EngineCoreClient.make_async_mp_client at 0xf46f6cdf39c0>)>
│ └ <class 'vllm.v1.engine.core_client.EngineCoreClient'>
└ <vllm.v1.engine.async_llm.AsyncLLM object at 0xf46d50d19760>
File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core_client.py", line 102, in make_async_mp_client
return AsyncMPClient(*client_args)
│ └ (VllmConfig(model_config=ModelConfig(model='/root/.cache/modelscope/hub/models/OpenDataLab/MinerU2___5-2509-1___2B', runner='...
└ <class 'vllm.v1.engine.core_client.AsyncMPClient'>
File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core_client.py", line 769, in __init__
super().__init__(
File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core_client.py", line 448, in init
with launch_core_engines(vllm_config, executor_class,
│ │ └ <class 'vllm.v1.executor.abstract.UniProcExecutor'>
│ └ VllmConfig(model_config=ModelConfig(model='/root/.cache/modelscope/hub/models/OpenDataLab/MinerU2___5-2509-1___2B', runner='a...
└ <function launch_core_engines at 0xf46f6cdf13a0>
File "/usr/lib/python3.12/contextlib.py", line 144, in __exit__
next(self.gen)
│ └ <generator object launch_core_engines at 0xf46d50bd2510>
└ <contextlib._GeneratorContextManager object at 0xf46d4eb7b950>
File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/utils.py", line 732, in launch_core_engines
wait_for_engine_startup(
└ <function wait_for_engine_startup at 0xf46f6cdf1440>
File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/utils.py", line 785, in wait_for_engine_startup
raise RuntimeError("Engine core initialization failed. "

RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}
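For context, the PyTorch warning earlier in the log is the likely root cause: the GB10 reports compute capability 12.1, which falls outside the (8.0) - (12.0) range the installed wheel was built for, so the vLLM engine core dies during initialization. A minimal sketch of that range check in plain Python (the function name is illustrative, not PyTorch's actual API):

```python
# Sketch of the capability-range check behind the PyTorch warning above.
# The installed wheel supports compute capabilities (8.0) - (12.0); the
# GB10 (Blackwell) reports 12.1, which falls outside that range.
def is_capability_supported(device_cap, supported=((8, 0), (12, 0))):
    """True if a (major, minor) capability lies within the build's range."""
    low, high = supported
    return low <= device_cap <= high

print(is_capability_supported((12, 1)))  # GB10: False, unsupported build
print(is_capability_supported((9, 0)))   # e.g. H100: True
```

If this check fails, a PyTorch build compiled for Blackwell-class GPUs (sm_121) on aarch64 would presumably be needed inside the container.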

How to reproduce the bug | 如何复现

docker compose -f compose.yaml --profile api up -d

Operating System Mode | 操作系统类型

Linux

Operating System Version| 操作系统版本

Distributor ID: Ubuntu
Description: Ubuntu 24.04.3 LTS
Release: 24.04
Codename: noble

Welcome to NVIDIA DGX Spark Version 7.2.3 (GNU/Linux 6.11.0-1014-nvidia aarch64)

Python version | Python 版本

3.12

Software version | 软件版本 (mineru --version)

>=2.5

Backend name | 解析后端

vlm

Device mode | 设备模式

cuda

Labels

bug (Something isn't working)