
Nvidia DGX Spark (GB10 Blackwell): Docker deployment fails on startup #4350

@Essence9999

Description

🔎 Search before asking

  • I have searched the MinerU Readme and found no similar bug report.
  • I have searched the MinerU Issues and found no similar bug report.
  • I have searched the MinerU Discussions and found no similar bug report.


Description of the bug

2026-01-12 01:08:33.619710870 [W:onnxruntime:Default, device_discovery.cc:164 DiscoverDevicesForPlatform] GPU device discovery failed: device_discovery.cc:89 ReadFileContents Failed to open file: "/sys/class/drm/card0/device/vendor"
INFO: Started server process [1]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
2026-01-12 01:08:40.227 | WARNING | mineru.utils.pdf_page_id:get_end_page_id:8 - end_page_id is out of range, use images length
Start MinerU FastAPI Service: http://0.0.0.0:8000
API documentation: http://0.0.0.0:8000/docs
INFO 01-12 01:08:42 [__init__.py:216] Automatically detected platform cuda.
/usr/local/lib/python3.12/dist-packages/torch/cuda/__init__.py:283: UserWarning:
Found GPU0 NVIDIA GB10 which is of cuda capability 12.1.
Minimum and Maximum cuda capability supported by this version of PyTorch is
(8.0) - (12.0)

warnings.warn(
2026-01-12 01:08:45.383 | INFO | mineru.backend.vlm.utils:enable_custom_logits_processors:46 - compute_capability: 12.1 >= 8.0 and vllm version: 0.11.0 >= 0.10.1, enable custom_logits_processors
INFO 01-12 01:08:50 [model.py:547] Resolved architecture: Qwen2VLForConditionalGeneration
torch_dtype is deprecated! Use dtype instead!
INFO 01-12 01:08:50 [model.py:1510] Using max model len 16384
INFO 01-12 01:08:50 [scheduler.py:205] Chunked prefill is enabled with max_num_batched_tokens=5120.
WARNING 01-12 01:08:50 [__init__.py:3036] We must use the spawn multiprocessing start method. Overriding VLLM_WORKER_MULTIPROC_METHOD to 'spawn'. See https://docs.vllm.ai/en/latest/usage/troubleshooting.html#python-multiprocessing for more information. Reasons: CUDA is initialized
2026-01-12 01:08:51.185492900 [W:onnxruntime:Default, device_discovery.cc:164 DiscoverDevicesForPlatform] GPU device discovery failed: device_discovery.cc:89 ReadFileContents Failed to open file: "/sys/class/drm/card0/device/vendor"
INFO 01-12 01:08:52 [__init__.py:216] Automatically detected platform cuda.
(EngineCore_DP0 pid=60) INFO 01-12 01:08:53 [core.py:644] Waiting for init message from front-end.
(EngineCore_DP0 pid=60) INFO 01-12 01:08:53 [core.py:77] Initializing a V1 LLM engine (v0.11.0) with config: model='/root/.cache/modelscope/hub/models/OpenDataLab/MinerU2___5-2509-1___2B', speculative_config=None, tokenizer='/root/.cache/modelscope/hub/models/OpenDataLab/MinerU2___5-2509-1___2B', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, tokenizer_revision=None, trust_remote_code=False, dtype=torch.bfloat16, max_seq_len=16384, download_dir=None, load_format=auto, tensor_parallel_size=1, pipeline_parallel_size=1, data_parallel_size=1, disable_custom_all_reduce=False, quantization=None, enforce_eager=False, kv_cache_dtype=auto, device_config=cuda, structured_outputs_config=StructuredOutputsConfig(backend='auto', disable_fallback=False, disable_any_whitespace=False, disable_additional_properties=False, reasoning_parser=''), observability_config=ObservabilityConfig(show_hidden_metrics_for_version=None, otlp_traces_endpoint=None, collect_detailed_traces=None), seed=0, served_model_name=/root/.cache/modelscope/hub/models/OpenDataLab/MinerU2___5-2509-1___2B, enable_prefix_caching=True, chunked_prefill_enabled=True, pooler_config=None, compilation_config={"level":3,"debug_dump_path":"","cache_dir":"","backend":"","custom_ops":[],"splitting_ops":["vllm.unified_attention","vllm.unified_attention_with_output","vllm.mamba_mixer2","vllm.mamba_mixer","vllm.short_conv","vllm.linear_attention","vllm.plamo2_mamba_mixer","vllm.gdn_attention","vllm.sparse_attn_indexer"],"use_inductor":true,"compile_sizes":[],"inductor_compile_config":{"enable_auto_functionalized_v2":false},"inductor_passes":{},"cudagraph_mode":[2,1],"use_cudagraph":true,"cudagraph_num_of_warmups":1,"cudagraph_capture_sizes":[256,248,240,232,224,216,208,200,192,184,176,168,160,152,144,136,128,120,112,104,96,88,80,72,64,56,48,40,32,24,16,8,4,2,1],"cudagraph_copy_inputs":false,"full_cuda_graph":false,"use_inductor_graph_partition":false,"pass_config":{},"max_capture_size":256,"local_cache_dir":null}
(EngineCore_DP0 pid=60) /usr/local/lib/python3.12/dist-packages/torch/cuda/__init__.py:283: UserWarning:
(EngineCore_DP0 pid=60) Found GPU0 NVIDIA GB10 which is of cuda capability 12.1.
(EngineCore_DP0 pid=60) Minimum and Maximum cuda capability supported by this version of PyTorch is
(EngineCore_DP0 pid=60) (8.0) - (12.0)
(EngineCore_DP0 pid=60)
(EngineCore_DP0 pid=60) warnings.warn(
(EngineCore_DP0 pid=60) W0112 01:08:54.346000 60 torch/utils/cpp_extension.py:2425] TORCH_CUDA_ARCH_LIST is not set, all archs for visible cards are included for compilation.
(EngineCore_DP0 pid=60) W0112 01:08:54.346000 60 torch/utils/cpp_extension.py:2425] If this is not desired, please set os.environ['TORCH_CUDA_ARCH_LIST'] to specific architectures.
[Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
[Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
[Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
[Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
[Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
[Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
(EngineCore_DP0 pid=60) INFO 01-12 01:08:55 [parallel_state.py:1208] rank 0 in world size 1 is assigned as DP rank 0, PP rank 0, TP rank 0, EP rank 0
(EngineCore_DP0 pid=60) INFO 01-12 01:08:55 [topk_topp_sampler.py:55] Using FlashInfer for top-p & top-k sampling.
(EngineCore_DP0 pid=60) INFO 01-12 01:08:56 [gpu_model_runner.py:2602] Starting to load model /root/.cache/modelscope/hub/models/OpenDataLab/MinerU2___5-2509-1___2B...
(EngineCore_DP0 pid=60) INFO 01-12 01:08:56 [gpu_model_runner.py:2634] Loading model from scratch...
(EngineCore_DP0 pid=60) INFO 01-12 01:08:56 [cuda.py:366] Using Flash Attention backend on V1 engine.
Loading safetensors checkpoint shards: 0% Completed | 0/1 [00:00<?, ?it/s]
Loading safetensors checkpoint shards: 100% Completed | 1/1 [00:14<00:00, 14.42s/it]
Loading safetensors checkpoint shards: 100% Completed | 1/1 [00:14<00:00, 14.42s/it]
(EngineCore_DP0 pid=60)
(EngineCore_DP0 pid=60) INFO 01-12 01:09:11 [default_loader.py:267] Loading weights took 14.54 seconds
(EngineCore_DP0 pid=60) INFO 01-12 01:09:11 [gpu_model_runner.py:2653] Model loading took 2.1637 GiB and 14.959749 seconds
(EngineCore_DP0 pid=60) INFO 01-12 01:09:11 [gpu_model_runner.py:3344] Encoder cache will be initialized with a budget of 14175 tokens, and profiled with 1 video items of the maximum feature size.
(EngineCore_DP0 pid=60) Process EngineCore_DP0:
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] EngineCore failed to start.
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] Traceback (most recent call last):
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/triton/backends/nvidia/compiler.py", line 424, in make_cubin
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] subprocess.run(ptxas_cmd, check=True, close_fds=False, stderr=flog)
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/lib/python3.12/subprocess.py", line 571, in run
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] raise CalledProcessError(retcode, process.args,
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] subprocess.CalledProcessError: Command '['/usr/local/lib/python3.12/dist-packages/triton/backends/nvidia/bin/ptxas', '-lineinfo', '-v', '--gpu-name=sm_121a', '/tmp/tmpnt_7bh0a.ptx', '-o', '/tmp/tmpnt_7bh0a.ptx.o']' returned non-zero exit status 255.
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708]
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] During handling of the above exception, another exception occurred:
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708]
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] Traceback (most recent call last):
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 699, in run_engine_core
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] engine_core = EngineCoreProc(*args, **kwargs)
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 498, in __init__
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] super().__init__(vllm_config, executor_class, log_stats,
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 92, in __init__
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] self._initialize_kv_caches(vllm_config)
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 190, in _initialize_kv_caches
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] self.model_executor.determine_available_memory())
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/abstract.py", line 85, in determine_available_memory
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] return self.collective_rpc("determine_available_memory")
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/vllm/executor/uniproc_executor.py", line 83, in collective_rpc
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] return [run_method(self.driver_worker, method, args, kwargs)]
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/vllm/utils/__init__.py", line 3122, in run_method
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] return func(*args, **kwargs)
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 120, in decorate_context
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] return func(*args, **kwargs)
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_worker.py", line 263, in determine_available_memory
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] self.model_runner.profile_run()
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 3361, in profile_run
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] self.model.get_multimodal_embeddings(
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen2_vl.py", line 1462, in get_multimodal_embeddings
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] video_embeddings = self._process_video_input(video_input)
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen2_vl.py", line 1412, in _process_video_input
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] video_embeds = self.visual(pixel_values_videos,
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] return self._call_impl(*args, **kwargs)
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1784, in _call_impl
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] return forward_call(*args, **kwargs)
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen2_vl.py", line 739, in forward
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] x = blk(
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] return self._call_impl(*args, **kwargs)
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1784, in _call_impl
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] return forward_call(*args, **kwargs)
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen2_vl.py", line 489, in forward
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] x = x + self.attn(
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] return self._call_impl(*args, **kwargs)
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1784, in _call_impl
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] return forward_call(*args, **kwargs)
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen2_vl.py", line 384, in forward
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] qk_rotated = apply_rotary_pos_emb_vision(qk_concat, rotary_pos_emb)
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen2_vl.py", line 283, in apply_rotary_pos_emb_vision
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] output = apply_rotary_emb(t_, cos, sin).type_as(t)
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/vllm/vllm_flash_attn/layers/rotary.py", line 124, in apply_rotary_emb
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] return ApplyRotaryEmb.apply(
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/torch/autograd/function.py", line 576, in apply
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] return super().apply(*args, **kwargs) # type: ignore[misc]
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/vllm/vllm_flash_attn/layers/rotary.py", line 50, in forward
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] out = apply_rotary(
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/vllm/vllm_flash_attn/ops/triton/rotary.py", line 203, in apply_rotary
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] rotary_kernel[grid](
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/triton/runtime/jit.py", line 390, in <lambda>
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] return lambda *args, **kwargs: self.run(grid=grid, warmup=False, *args, **kwargs)
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/triton/runtime/jit.py", line 594, in run
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] kernel = self.compile(src, target=target, options=options.__dict__)
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/triton/compiler/compiler.py", line 359, in compile
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] next_module = compile_ir(module, metadata)
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/triton/backends/nvidia/compiler.py", line 461, in <lambda>
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] stages["cubin"] = lambda src, metadata: self.make_cubin(src, metadata, options, self.target.arch)
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] File "/usr/local/lib/python3.12/dist-packages/triton/backends/nvidia/compiler.py", line 442, in make_cubin
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] raise PTXASError(f"{error}\n"
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] triton.runtime.errors.PTXASError: PTXAS error: Internal Triton PTX codegen error
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ptxas stderr:
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] ptxas fatal : Value 'sm_121a' is not defined for option 'gpu-name'
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708]
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708] Repro command: /usr/local/lib/python3.12/dist-packages/triton/backends/nvidia/bin/ptxas -lineinfo -v --gpu-name=sm_121a /tmp/tmpnt_7bh0a.ptx -o /tmp/tmpnt_7bh0a.ptx.o
(EngineCore_DP0 pid=60) ERROR 01-12 01:09:14 [core.py:708]
(EngineCore_DP0 pid=60) Traceback (most recent call last):
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/triton/backends/nvidia/compiler.py", line 424, in make_cubin
(EngineCore_DP0 pid=60) subprocess.run(ptxas_cmd, check=True, close_fds=False, stderr=flog)
(EngineCore_DP0 pid=60) File "/usr/lib/python3.12/subprocess.py", line 571, in run
(EngineCore_DP0 pid=60) raise CalledProcessError(retcode, process.args,
(EngineCore_DP0 pid=60) subprocess.CalledProcessError: Command '['/usr/local/lib/python3.12/dist-packages/triton/backends/nvidia/bin/ptxas', '-lineinfo', '-v', '--gpu-name=sm_121a', '/tmp/tmpnt_7bh0a.ptx', '-o', '/tmp/tmpnt_7bh0a.ptx.o']' returned non-zero exit status 255.
(EngineCore_DP0 pid=60)
(EngineCore_DP0 pid=60) During handling of the above exception, another exception occurred:
(EngineCore_DP0 pid=60)
(EngineCore_DP0 pid=60) Traceback (most recent call last):
(EngineCore_DP0 pid=60) File "/usr/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
(EngineCore_DP0 pid=60) self.run()
(EngineCore_DP0 pid=60) File "/usr/lib/python3.12/multiprocessing/process.py", line 108, in run
(EngineCore_DP0 pid=60) self._target(*self._args, **self._kwargs)
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 712, in run_engine_core
(EngineCore_DP0 pid=60) raise e
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 699, in run_engine_core
(EngineCore_DP0 pid=60) engine_core = EngineCoreProc(*args, **kwargs)
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 498, in __init__
(EngineCore_DP0 pid=60) super().__init__(vllm_config, executor_class, log_stats,
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 92, in __init__
(EngineCore_DP0 pid=60) self._initialize_kv_caches(vllm_config)
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 190, in _initialize_kv_caches
(EngineCore_DP0 pid=60) self.model_executor.determine_available_memory())
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/abstract.py", line 85, in determine_available_memory
(EngineCore_DP0 pid=60) return self.collective_rpc("determine_available_memory")
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/vllm/executor/uniproc_executor.py", line 83, in collective_rpc
(EngineCore_DP0 pid=60) return [run_method(self.driver_worker, method, args, kwargs)]
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/vllm/utils/__init__.py", line 3122, in run_method
(EngineCore_DP0 pid=60) return func(*args, **kwargs)
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 120, in decorate_context
(EngineCore_DP0 pid=60) return func(*args, **kwargs)
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_worker.py", line 263, in determine_available_memory
(EngineCore_DP0 pid=60) self.model_runner.profile_run()
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 3361, in profile_run
(EngineCore_DP0 pid=60) self.model.get_multimodal_embeddings(
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen2_vl.py", line 1462, in get_multimodal_embeddings
(EngineCore_DP0 pid=60) video_embeddings = self._process_video_input(video_input)
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen2_vl.py", line 1412, in _process_video_input
(EngineCore_DP0 pid=60) video_embeds = self.visual(pixel_values_videos,
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
(EngineCore_DP0 pid=60) return self._call_impl(*args, **kwargs)
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1784, in _call_impl
(EngineCore_DP0 pid=60) return forward_call(*args, **kwargs)
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen2_vl.py", line 739, in forward
(EngineCore_DP0 pid=60) x = blk(
(EngineCore_DP0 pid=60) ^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
(EngineCore_DP0 pid=60) return self._call_impl(*args, **kwargs)
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1784, in _call_impl
(EngineCore_DP0 pid=60) return forward_call(*args, **kwargs)
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen2_vl.py", line 489, in forward
(EngineCore_DP0 pid=60) x = x + self.attn(
(EngineCore_DP0 pid=60) ^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
(EngineCore_DP0 pid=60) return self._call_impl(*args, **kwargs)
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1784, in _call_impl
(EngineCore_DP0 pid=60) return forward_call(*args, **kwargs)
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen2_vl.py", line 384, in forward
(EngineCore_DP0 pid=60) qk_rotated = apply_rotary_pos_emb_vision(qk_concat, rotary_pos_emb)
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen2_vl.py", line 283, in apply_rotary_pos_emb_vision
(EngineCore_DP0 pid=60) output = apply_rotary_emb(t_, cos, sin).type_as(t)
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/vllm/vllm_flash_attn/layers/rotary.py", line 124, in apply_rotary_emb
(EngineCore_DP0 pid=60) return ApplyRotaryEmb.apply(
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/torch/autograd/function.py", line 576, in apply
(EngineCore_DP0 pid=60) return super().apply(*args, **kwargs) # type: ignore[misc]
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/vllm/vllm_flash_attn/layers/rotary.py", line 50, in forward
(EngineCore_DP0 pid=60) out = apply_rotary(
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/vllm/vllm_flash_attn/ops/triton/rotary.py", line 203, in apply_rotary
(EngineCore_DP0 pid=60) rotary_kernel[grid](
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/triton/runtime/jit.py", line 390, in <lambda>
(EngineCore_DP0 pid=60) return lambda *args, **kwargs: self.run(grid=grid, warmup=False, *args, **kwargs)
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/triton/runtime/jit.py", line 594, in run
(EngineCore_DP0 pid=60) kernel = self.compile(src, target=target, options=options.__dict__)
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/triton/compiler/compiler.py", line 359, in compile
(EngineCore_DP0 pid=60) next_module = compile_ir(module, metadata)
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/triton/backends/nvidia/compiler.py", line 461, in <lambda>
(EngineCore_DP0 pid=60) stages["cubin"] = lambda src, metadata: self.make_cubin(src, metadata, options, self.target.arch)
(EngineCore_DP0 pid=60) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=60) File "/usr/local/lib/python3.12/dist-packages/triton/backends/nvidia/compiler.py", line 442, in make_cubin
(EngineCore_DP0 pid=60) raise PTXASError(f"{error}\n"
(EngineCore_DP0 pid=60) triton.runtime.errors.PTXASError: PTXAS error: Internal Triton PTX codegen error
(EngineCore_DP0 pid=60) ptxas stderr:
(EngineCore_DP0 pid=60) ptxas fatal : Value 'sm_121a' is not defined for option 'gpu-name'
(EngineCore_DP0 pid=60)
(EngineCore_DP0 pid=60) Repro command: /usr/local/lib/python3.12/dist-packages/triton/backends/nvidia/bin/ptxas -lineinfo -v --gpu-name=sm_121a /tmp/tmpnt_7bh0a.ptx -o /tmp/tmpnt_7bh0a.ptx.o
(EngineCore_DP0 pid=60)
[rank0]:[W112 01:09:14.407408171 ProcessGroupNCCL.cpp:1538] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
2026-01-12 01:09:15.282 | ERROR | mineru.cli.fast_api:parse_pdf:332 - Engine core initialization failed. See root cause above. Failed core proc(s): {}
Traceback (most recent call last):

File "/usr/local/bin/mineru-api", line 7, in <module>
sys.exit(main())
│ │ └
│ └
└ <module 'sys' (built-in)>
File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 1462, in __call__
return self.main(*args, **kwargs)
│ │ │ └ {}
│ │ └ ()
│ └ <function Command.main at 0xf471780d4180>

File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 1383, in main
rv = self.invoke(ctx)
│ │ └ <click.core.Context object at 0xf47178d15460>
│ └ <function Command.invoke at 0xf471780cfe20>

File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 1246, in invoke
return ctx.invoke(self.callback, **ctx.params)
│ │ │ │ │ └ {'host': '0.0.0.0', 'port': 8000, 'reload': False}
│ │ │ │ └ <click.core.Context object at 0xf47178d15460>
│ │ │ └ <function main at 0xf46fa1f3dda0>
│ │ └
│ └ <function Context.invoke at 0xf471780cf060>
└ <click.core.Context object at 0xf47178d15460>
File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 814, in invoke
return callback(*args, **kwargs)
│ │ └ {'host': '0.0.0.0', 'port': 8000, 'reload': False}
│ └ ()
└ <function main at 0xf46fa1f3dda0>
File "/usr/local/lib/python3.12/dist-packages/click/decorators.py", line 34, in new_func
return f(get_current_context(), *args, **kwargs)
│ │ │ └ {'host': '0.0.0.0', 'port': 8000, 'reload': False}
│ │ └ ()
│ └ <function get_current_context at 0xf471780ad760>
└ <function main at 0xf46fa1f3df80>
File "/usr/local/lib/python3.12/dist-packages/mineru/cli/fast_api.py", line 362, in main
uvicorn.run(
│ └ <function run at 0xf471780e3380>
└ <module 'uvicorn' from '/usr/local/lib/python3.12/dist-packages/uvicorn/__init__.py'>
File "/usr/local/lib/python3.12/dist-packages/uvicorn/main.py", line 593, in run
server.run()
│ └ <function Server.run at 0xf47177f1ef20>
└ <uvicorn.server.Server object at 0xf46fa1ecdb50>
File "/usr/local/lib/python3.12/dist-packages/uvicorn/server.py", line 67, in run
return asyncio_run(self.serve(sockets=sockets), loop_factory=self.config.get_loop_factory())
│ │ │ │ │ │ └ <function Config.get_loop_factory at 0xf471780e2f20>
│ │ │ │ │ └ <uvicorn.config.Config object at 0xf470881d28d0>
│ │ │ │ └ <uvicorn.server.Server object at 0xf46fa1ecdb50>
│ │ │ └ None
│ │ └ <function Server.serve at 0xf47177f1efc0>
│ └ <uvicorn.server.Server object at 0xf46fa1ecdb50>
└ <function run at 0xf47178e9c680>
File "/usr/lib/python3.12/asyncio/runners.py", line 195, in run
return runner.run(main)
│ │ └ <coroutine object Server.serve at 0xf46fa1f35ee0>
│ └ <function Runner.run at 0xf4717830ede0>
└ <asyncio.runners.Runner object at 0xf47088157800>
File "/usr/lib/python3.12/asyncio/runners.py", line 118, in run
return self._loop.run_until_complete(task)
│ │ │ └ <Task pending name='Task-1' coro=<Server.serve() running at /usr/local/lib/python3.12/dist-packages/uvicorn/server.py:71> wai...
│ │ └ <cyfunction Loop.run_until_complete at 0xf46fa1f4fc60>
│ └ <uvloop.Loop running=True closed=False debug=False>
└ <asyncio.runners.Runner object at 0xf47088157800>
File "/usr/local/lib/python3.12/dist-packages/uvicorn/protocols/http/httptools_impl.py", line 409, in run_asgi
result = await app( # type: ignore[func-returns-value]
└ <uvicorn.middleware.proxy_headers.ProxyHeadersMiddleware object at 0xf46fa2093860>
File "/usr/local/lib/python3.12/dist-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
return await self.app(scope, receive, send)
│ │ │ │ └ <bound method RequestResponseCycle.send of <uvicorn.protocols.http.httptools_impl.RequestResponseCycle object at 0xf46fa1bf62...
│ │ │ └ <bound method RequestResponseCycle.receive of <uvicorn.protocols.http.httptools_impl.RequestResponseCycle object at 0xf46fa1b...
│ │ └ {'type': 'http', 'asgi': {'version': '3.0', 'spec_version': '2.3'}, 'http_version': '1.1', 'server': ('172.18.0.2', 8000), 'c...
│ └ <fastapi.applications.FastAPI object at 0xf470880f7590>
└ <uvicorn.middleware.proxy_headers.ProxyHeadersMiddleware object at 0xf46fa2093860>
File "/usr/local/lib/python3.12/dist-packages/fastapi/applications.py", line 1133, in __call__
await super().__call__(scope, receive, send)
│ │ └ <bound method RequestResponseCycle.send of <uvicorn.protocols.http.httptools_impl.RequestResponseCycle object at 0xf46fa1bf62...
│ └ <bound method RequestResponseCycle.receive of <uvicorn.protocols.http.httptools_impl.RequestResponseCycle object at 0xf46fa1b...
└ {'type': 'http', 'asgi': {'version': '3.0', 'spec_version': '2.3'}, 'http_version': '1.1', 'server': ('172.18.0.2', 8000), 'c...
File "/usr/local/lib/python3.12/dist-packages/starlette/applications.py", line 113, in __call__
await self.middleware_stack(scope, receive, send)
│ │ │ │ └ <bound method RequestResponseCycle.send of <uvicorn.protocols.http.httptools_impl.RequestResponseCycle object at 0xf46fa1bf62...
│ │ │ └ <bound method RequestResponseCycle.receive of <uvicorn.protocols.http.httptools_impl.RequestResponseCycle object at 0xf46fa1b...
│ │ └ {'type': 'http', 'asgi': {'version': '3.0', 'spec_version': '2.3'}, 'http_version': '1.1', 'server': ('172.18.0.2', 8000), 'c...
│ └ <starlette.middleware.errors.ServerErrorMiddleware object at 0xf46fa1bf5c40>
└ <fastapi.applications.FastAPI object at 0xf470880f7590>
File "/usr/local/lib/python3.12/dist-packages/starlette/middleware/errors.py", line 164, in __call__
await self.app(scope, receive, _send)
│ │ │ │ └ <function ServerErrorMiddleware.__call__.<locals>._send at 0xf46fa1c6b9c0>
│ │ │ └ <bound method RequestResponseCycle.receive of <uvicorn.protocols.http.httptools_impl.RequestResponseCycle object at 0xf46fa1b...
│ │ └ {'type': 'http', 'asgi': {'version': '3.0', 'spec_version': '2.3'}, 'http_version': '1.1', 'server': ('172.18.0.2', 8000), 'c...
│ └ <starlette.middleware.gzip.GZipMiddleware object at 0xf46fa1bf4b00>
└ <starlette.middleware.errors.ServerErrorMiddleware object at 0xf46fa1bf5c40>
File "/usr/local/lib/python3.12/dist-packages/starlette/middleware/gzip.py", line 29, in __call__
await responder(scope, receive, send)
│ │ │ └ <function ServerErrorMiddleware.__call__.<locals>._send at 0xf46fa1c6b9c0>
│ │ └ <bound method RequestResponseCycle.receive of <uvicorn.protocols.http.httptools_impl.RequestResponseCycle object at 0xf46fa1b...
│ └ {'type': 'http', 'asgi': {'version': '3.0', 'spec_version': '2.3'}, 'http_version': '1.1', 'server': ('172.18.0.2', 8000), 'c...
└ <starlette.middleware.gzip.GZipResponder object at 0xf46fa1bf6270>
File "/usr/local/lib/python3.12/dist-packages/starlette/middleware/gzip.py", line 130, in __call__
await super().__call__(scope, receive, send)
│ │ └ <function ServerErrorMiddleware.__call__.<locals>._send at 0xf46fa1c6b9c0>
│ └ <bound method RequestResponseCycle.receive of <uvicorn.protocols.http.httptools_impl.RequestResponseCycle object at 0xf46fa1b...
└ {'type': 'http', 'asgi': {'version': '3.0', 'spec_version': '2.3'}, 'http_version': '1.1', 'server': ('172.18.0.2', 8000), 'c...
File "/usr/local/lib/python3.12/dist-packages/starlette/middleware/gzip.py", line 46, in __call__
await self.app(scope, receive, self.send_with_compression)
│ │ │ │ │ └ <function IdentityResponder.send_with_compression at 0xf47176f5a0c0>
│ │ │ │ └ <starlette.middleware.gzip.GZipResponder object at 0xf46fa1bf6270>
INFO: 192.168.3.56:54656 - "POST /file_parse HTTP/1.1" 500 Internal Server Error
│ │ │ └ <bound method RequestResponseCycle.receive of <uvicorn.protocols.http.httptools_impl.RequestResponseCycle object at 0xf46fa1b...
│ │ └ {'type': 'http', 'asgi': {'version': '3.0', 'spec_version': '2.3'}, 'http_version': '1.1', 'server': ('172.18.0.2', 8000), 'c...
│ └ <starlette.middleware.exceptions.ExceptionMiddleware object at 0xf46fa1bf5c10>
└ <starlette.middleware.gzip.GZipResponder object at 0xf46fa1bf6270>
File "/usr/local/lib/python3.12/dist-packages/starlette/middleware/exceptions.py", line 63, in __call__
await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
│ │ │ │ │ │ └ <bound method IdentityResponder.send_with_compression of <starlette.middleware.gzip.GZipResponder object at 0xf46fa1bf6270>>
│ │ │ │ │ └ <bound method RequestResponseCycle.receive of <uvicorn.protocols.http.httptools_impl.RequestResponseCycle object at 0xf46fa1b...
│ │ │ │ └ {'type': 'http', 'asgi': {'version': '3.0', 'spec_version': '2.3'}, 'http_version': '1.1', 'server': ('172.18.0.2', 8000), 'c...
│ │ │ └ <starlette.requests.Request object at 0xf46fa1eb83b0>
│ │ └ <fastapi.middleware.asyncexitstack.AsyncExitStackMiddleware object at 0xf46fa1bf4dd0>
│ └ <starlette.middleware.exceptions.ExceptionMiddleware object at 0xf46fa1bf5c10>
└ <function wrap_app_handling_exceptions at 0xf4717710c5e0>
File "/usr/local/lib/python3.12/dist-packages/starlette/_exception_handler.py", line 42, in wrapped_app
await app(scope, receive, sender)
│ │ │ └ <function wrap_app_handling_exceptions.<locals>.wrapped_app.<locals>.sender at 0xf46fa1c6bba0>
│ │ └ <bound method RequestResponseCycle.receive of <uvicorn.protocols.http.httptools_impl.RequestResponseCycle object at 0xf46fa1b...
│ └ {'type': 'http', 'asgi': {'version': '3.0', 'spec_version': '2.3'}, 'http_version': '1.1', 'server': ('172.18.0.2', 8000), 'c...
└ <fastapi.middleware.asyncexitstack.AsyncExitStackMiddleware object at 0xf46fa1bf4dd0>
File "/usr/local/lib/python3.12/dist-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
await self.app(scope, receive, send)
│ │ │ │ └ <function wrap_app_handling_exceptions.<locals>.wrapped_app.<locals>.sender at 0xf46fa1c6bba0>
│ │ │ └ <bound method RequestResponseCycle.receive of <uvicorn.protocols.http.httptools_impl.RequestResponseCycle object at 0xf46fa1b...
│ │ └ {'type': 'http', 'asgi': {'version': '3.0', 'spec_version': '2.3'}, 'http_version': '1.1', 'server': ('172.18.0.2', 8000), 'c...
│ └ <fastapi.routing.APIRouter object at 0xf46fa1ecd460>
└ <fastapi.middleware.asyncexitstack.AsyncExitStackMiddleware object at 0xf46fa1bf4dd0>
File "/usr/local/lib/python3.12/dist-packages/starlette/routing.py", line 716, in __call__
await self.middleware_stack(scope, receive, send)
│ │ │ │ └ <function wrap_app_handling_exceptions.<locals>.wrapped_app.<locals>.sender at 0xf46fa1c6bba0>
│ │ │ └ <bound method RequestResponseCycle.receive of <uvicorn.protocols.http.httptools_impl.RequestResponseCycle object at 0xf46fa1b...
│ │ └ {'type': 'http', 'asgi': {'version': '3.0', 'spec_version': '2.3'}, 'http_version': '1.1', 'server': ('172.18.0.2', 8000), 'c...
│ └ <bound method Router.app of <fastapi.routing.APIRouter object at 0xf46fa1ecd460>>
└ <fastapi.routing.APIRouter object at 0xf46fa1ecd460>
File "/usr/local/lib/python3.12/dist-packages/starlette/routing.py", line 736, in app
await route.handle(scope, receive, send)
│ │ │ │ └ <function wrap_app_handling_exceptions.<locals>.wrapped_app.<locals>.sender at 0xf46fa1c6bba0>
│ │ │ └ <bound method RequestResponseCycle.receive of <uvicorn.protocols.http.httptools_impl.RequestResponseCycle object at 0xf46fa1b...
│ │ └ {'type': 'http', 'asgi': {'version': '3.0', 'spec_version': '2.3'}, 'http_version': '1.1', 'server': ('172.18.0.2', 8000), 'c...
│ └ <function Route.handle at 0xf4717710da80>
└ APIRoute(path='/file_parse', name='parse_pdf', methods=['POST'])
File "/usr/local/lib/python3.12/dist-packages/starlette/routing.py", line 290, in handle
await self.app(scope, receive, send)
│ │ │ │ └ <function wrap_app_handling_exceptions.<locals>.wrapped_app.<locals>.sender at 0xf46fa1c6bba0>
│ │ │ └ <bound method RequestResponseCycle.receive of <uvicorn.protocols.http.httptools_impl.RequestResponseCycle object at 0xf46fa1b...
│ │ └ {'type': 'http', 'asgi': {'version': '3.0', 'spec_version': '2.3'}, 'http_version': '1.1', 'server': ('172.18.0.2', 8000), 'c...
│ └ <function request_response.<locals>.app at 0xf46fa1f3db20>
└ APIRoute(path='/file_parse', name='parse_pdf', methods=['POST'])
File "/usr/local/lib/python3.12/dist-packages/fastapi/routing.py", line 123, in app
await wrap_app_handling_exceptions(app, request)(scope, receive, send)
│ │ │ │ │ └ <function wrap_app_handling_exceptions.<locals>.wrapped_app.<locals>.sender at 0xf46fa1c6bba0>
│ │ │ │ └ <bound method RequestResponseCycle.receive of <uvicorn.protocols.http.httptools_impl.RequestResponseCycle object at 0xf46fa1b...
│ │ │ └ {'type': 'http', 'asgi': {'version': '3.0', 'spec_version': '2.3'}, 'http_version': '1.1', 'server': ('172.18.0.2', 8000), 'c...
│ │ └ <starlette.requests.Request object at 0xf46fa1bf65d0>
│ └ <function request_response.<locals>.app.<locals>.app at 0xf46fa1c6bc40>
└ <function wrap_app_handling_exceptions at 0xf4717710c5e0>
File "/usr/local/lib/python3.12/dist-packages/starlette/_exception_handler.py", line 42, in wrapped_app
await app(scope, receive, sender)
│ │ │ └ <function wrap_app_handling_exceptions.<locals>.wrapped_app.<locals>.sender at 0xf46fa1c6bd80>
│ │ └ <bound method RequestResponseCycle.receive of <uvicorn.protocols.http.httptools_impl.RequestResponseCycle object at 0xf46fa1b...
│ └ {'type': 'http', 'asgi': {'version': '3.0', 'spec_version': '2.3'}, 'http_version': '1.1', 'server': ('172.18.0.2', 8000), 'c...
└ <function request_response.<locals>.app.<locals>.app at 0xf46fa1c6bc40>
File "/usr/local/lib/python3.12/dist-packages/fastapi/routing.py", line 109, in app
response = await f(request)
│ └ <starlette.requests.Request object at 0xf46fa1bf65d0>
└ <function get_request_handler.<locals>.app at 0xf46fa1f3dd00>
File "/usr/local/lib/python3.12/dist-packages/fastapi/routing.py", line 387, in app
raw_response = await run_endpoint_function(
└ <function run_endpoint_function at 0xf4717710d580>
File "/usr/local/lib/python3.12/dist-packages/fastapi/routing.py", line 288, in run_endpoint_function
return await dependant.call(**values)
│ │ └ {'files': [UploadFile(filename='en规范.pdf', size=763496, headers=Headers({'content-disposition': 'form-data; name="files"; fil...
│ └ <function parse_pdf at 0xf46fa1f3d8a0>
└ Dependant(path_params=[], query_params=[], header_params=[], cookie_params=[], body_params=[ModelField(field_info=File(Pydant...

File "/usr/local/lib/python3.12/dist-packages/mineru/cli/fast_api.py", line 215, in parse_pdf
await aio_do_parse(
└ <function aio_do_parse at 0xf46fa1f3cd60>
File "/usr/local/lib/python3.12/dist-packages/mineru/cli/common.py", line 532, in aio_do_parse
await _async_process_vlm(
└ <function _async_process_vlm at 0xf46fa1f3c9a0>
File "/usr/local/lib/python3.12/dist-packages/mineru/cli/common.py", line 253, in _async_process_vlm
middle_json, infer_result = await aio_vlm_doc_analyze(
└ <function aio_doc_analyze at 0xf46fa1f3c680>
File "/usr/local/lib/python3.12/dist-packages/mineru/backend/vlm/vlm_analyze.py", line 230, in aio_doc_analyze
predictor = ModelSingleton().get_model(backend, model_path, server_url, **kwargs)
│ │ │ │ └ {'gpu_memory_utilization': 0.4}
│ │ │ └ None
│ │ └ None
│ └ 'vllm-async-engine'
└ <class 'mineru.backend.vlm.vlm_analyze.ModelSingleton'>
File "/usr/local/lib/python3.12/dist-packages/mineru/backend/vlm/vlm_analyze.py", line 125, in get_model
vllm_async_llm = AsyncLLM.from_engine_args(AsyncEngineArgs(**kwargs))
│ │ │ └ {'gpu_memory_utilization': 0.4, 'model': '/root/.cache/modelscope/hub/models/OpenDataLab/MinerU2___5-2509-1___2B', 'logits_pr...
│ │ └ <class 'vllm.engine.arg_utils.AsyncEngineArgs'>
│ └ <classmethod(<function AsyncLLM.from_engine_args at 0xf46f6cde7ce0>)>
└ <class 'vllm.v1.engine.async_llm.AsyncLLM'>
File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/async_llm.py", line 235, in from_engine_args
return cls(
└ <class 'vllm.v1.engine.async_llm.AsyncLLM'>
File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/async_llm.py", line 134, in __init__
self.engine_core = EngineCoreClient.make_async_mp_client(
│ │ └ <staticmethod(<function EngineCoreClient.make_async_mp_client at 0xf46f6cdf39c0>)>
│ └ <class 'vllm.v1.engine.core_client.EngineCoreClient'>
└ <vllm.v1.engine.async_llm.AsyncLLM object at 0xf46d50d19760>
File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core_client.py", line 102, in make_async_mp_client
return AsyncMPClient(*client_args)
│ └ (VllmConfig(model_config=ModelConfig(model='/root/.cache/modelscope/hub/models/OpenDataLab/MinerU2___5-2509-1___2B', runner='...
└ <class 'vllm.v1.engine.core_client.AsyncMPClient'>
File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core_client.py", line 769, in __init__
super().__init__(
File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core_client.py", line 448, in init
with launch_core_engines(vllm_config, executor_class,
│ │ └ <class 'vllm.v1.executor.abstract.UniProcExecutor'>
│ └ VllmConfig(model_config=ModelConfig(model='/root/.cache/modelscope/hub/models/OpenDataLab/MinerU2___5-2509-1___2B', runner='a...
└ <function launch_core_engines at 0xf46f6cdf13a0>
File "/usr/lib/python3.12/contextlib.py", line 144, in __exit__
next(self.gen)
│ └ <generator object launch_core_engines at 0xf46d50bd2510>
└ <contextlib._GeneratorContextManager object at 0xf46d4eb7b950>
File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/utils.py", line 732, in launch_core_engines
wait_for_engine_startup(
└ <function wait_for_engine_startup at 0xf46f6cdf1440>
File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/utils.py", line 785, in wait_for_engine_startup
raise RuntimeError("Engine core initialization failed. "

RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}
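For context, the PyTorch warning earlier in the log is the likely root cause: the GB10 reports compute capability 12.1, which falls outside the (8.0) - (12.0) range the installed wheel was built for, so the vLLM engine core dies during initialization. A minimal sketch of that range check in plain Python (the function name is illustrative, not PyTorch's actual API):

```python
# Sketch of the capability-range check behind the PyTorch warning above.
# The installed wheel supports compute capabilities (8.0) - (12.0); the
# GB10 (Blackwell) reports 12.1, which falls outside that range.
def is_capability_supported(device_cap, supported=((8, 0), (12, 0))):
    """True if a (major, minor) capability lies within the build's range."""
    low, high = supported
    return low <= device_cap <= high

print(is_capability_supported((12, 1)))  # GB10: False, unsupported build
print(is_capability_supported((9, 0)))   # e.g. H100: True
```

If this check fails, a PyTorch build compiled for Blackwell-class GPUs (sm_121) on aarch64 would presumably be needed inside the container.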

How to reproduce the bug | 如何复现

docker compose -f compose.yaml --profile api up -d

Operating System Mode | 操作系统类型

Linux

Operating System Version| 操作系统版本

Distributor ID: Ubuntu
Description: Ubuntu 24.04.3 LTS
Release: 24.04
Codename: noble

Welcome to NVIDIA DGX Spark Version 7.2.3 (GNU/Linux 6.11.0-1014-nvidia aarch64)

Python version | Python 版本

3.12

Software version | 软件版本 (mineru --version)

>=2.5

Backend name | 解析后端

vlm

Device mode | 设备模式

cuda

Labels

bug (Something isn't working)