
paddlepaddle-gpu CUDA 12.8 build: PaddleX inference error #4268

Open
@monkeycc

Description


Windows 11
RTX 5090 D
conda, Python 3.11

python -m pip install --pre paddlepaddle-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu128/

python
Python 3.11.13 | packaged by conda-forge | (main, Jun  4 2025, 14:39:58) [MSC v.1943 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import paddle
>>> paddle.utils.run_check()
Running verify PaddlePaddle program ...
I0626 11:47:59.276055 31136 pir_interpreter.cc:1490] New Executor is Running ...
W0626 11:47:59.277873 31136 gpu_resources.cc:106] The GPU compute capability in your current machine is 120, which is not supported by Paddle, it is recommended to install the corresponding wheel package according to the installation information on the official Paddle website.
W0626 11:47:59.277873 31136 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 12.0, Driver API Version: 12.9, Runtime API Version: 12.8
W0626 11:47:59.280525 31136 gpu_resources.cc:164] device: 0, cuDNN Version: 9.7.
I0626 11:47:59.281900 31136 pir_interpreter.cc:1513] pir interpreter is running by multi-thread mode ...
PaddlePaddle works well on 1 GPU.
PaddlePaddle is installed successfully! Let's start deep learning with PaddlePaddle now.
import paddle

# Check GPU availability
print("="*50)
if paddle.device.is_compiled_with_cuda() and paddle.device.cuda.device_count() > 0:
    device = "gpu:0"
    print(f"✅ Available GPU device detected: {paddle.device.get_device()}")
else:
    device = "cpu"
    print("⚠️ No available GPU device detected; falling back to CPU inference (slower)")

✅ Available GPU device detected: gpu:0
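Note that `run_check()` passed above, yet real inference still hits a missing-kernel error below, so `is_compiled_with_cuda()` plus a device count is not a reliable gate on its own. A minimal sketch of a stricter check (the helper name and the injected callable are my own, not a Paddle API): actually execute a small compute op and treat a kernel-image failure as "GPU unusable".

```python
def gpu_usable(run_op):
    """Return True only if a real GPU compute op executes successfully.

    run_op: a zero-argument callable that launches a small GPU kernel,
    e.g. lambda: (paddle.ones([2, 2]) @ paddle.ones([2, 2])).numpy().
    Paddle surfaces the CUDA error(209) as OSError/RuntimeError at run
    time, so catching those here lets the caller fall back to CPU
    instead of crashing mid-inference as in the traceback below.
    """
    try:
        run_op()
        return True
    except (RuntimeError, OSError):
        return False
```

With paddle installed, this could be called as `gpu_usable(lambda: (paddle.ones([2, 2]) @ paddle.ones([2, 2])).numpy())` before choosing `device = "gpu:0"`.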

from paddlex import create_model

model = create_model(model_name="PicoDet-S")

output = model.predict(
  "xxx.png",
  device="gpu:0"
)

for res in output:
    res.print(json_format=False)
    res.save_to_img("./output/")
    res.save_to_json("./output/")
Using official model (PicoDet-S), the model files will be automatically downloaded and saved in C:\Users\monke\.paddlex\official_models.
Traceback (most recent call last):
  File "i:\2025_Code\Paddle_Code\1.py", line 11, in <module>
    for res in output:
  File "I:\AI\PaddleX\paddlex\model.py", line 61, in predict
    yield from self._predictor(*args, **kwargs)
  File "I:\AI\PaddleX\paddlex\inference\models\base\predictor\base_predictor.py", line 211, in __call__
    yield from self.apply(input, **kwargs)
  File "I:\AI\PaddleX\paddlex\inference\models\base\predictor\base_predictor.py", line 267, in apply
    prediction = self.process(batch_data, **kwargs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "I:\AI\PaddleX\paddlex\inference\models\object_detection\predictor.py", line 234, in process
    batch_preds = self.infer(batch_inputs)
                  ^^^^^^^^^^^^^^^^^^^^^^^^
  File "I:\AI\PaddleX\paddlex\inference\models\common\static_infer.py", line 287, in __call__
    pred = self.infer(x)
           ^^^^^^^^^^^^^
  File "I:\AI\PaddleX\paddlex\inference\models\common\static_infer.py", line 252, in __call__
    self.predictor.run()
OSError: In user code:


    ExternalError: CUDA error(209), no kernel image is available for execution on the device.
      [Hint: 'cudaErrorNoKernelImageForDevice'. This indicates that there is no kernel image available that is suitable for the device. This can occur when a user specifies code generation options for a particular CUDA source file that do not include the corresponding device configuration.] (at C:\home\workspace\Paddle\paddle\phi\kernels\gpu\multiclass_nms3_kernel.cu:617)
      [operator < pd_kernel.phi_kernel > error]
    InvalidArgumentError: Input tensor array size should > 0,but the received is 0
      [Hint: Expected n > 0, but received n:0 <= 0:0.] (at ..\paddle\phi\kernels\array_kernel.cc:98)
      [operator < pd_kernel.phi_kernel > error]
A second run (iterating over multiple images) fails in the same frames but ends with a different error:

    for i, (img_path, res) in enumerate(zip(image_paths, outputs)):
  File "I:\AI\PaddleX\paddlex\model.py", line 61, in predict
    yield from self._predictor(*args, **kwargs)
  File "I:\AI\PaddleX\paddlex\inference\models\base\predictor\base_predictor.py", line 211, in __call__
    yield from self.apply(input, **kwargs)
  File "I:\AI\PaddleX\paddlex\inference\models\base\predictor\base_predictor.py", line 267, in apply
    prediction = self.process(batch_data, **kwargs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "I:\AI\PaddleX\paddlex\inference\models\object_detection\predictor.py", line 234, in process
    batch_preds = self.infer(batch_inputs)
                  ^^^^^^^^^^^^^^^^^^^^^^^^
  File "I:\AI\PaddleX\paddlex\inference\models\common\static_infer.py", line 287, in __call__
    pred = self.infer(x)
           ^^^^^^^^^^^^^
  File "I:\AI\PaddleX\paddlex\inference\models\common\static_infer.py", line 252, in __call__
    self.predictor.run()
RuntimeError: In user code:


    PreconditionNotMetError: Tensor's dimension is out of bound.Tensor's dimension must be equal or less than the size of its memory.But received Tensor's dimension is 8, memory's size is 0.
      [Hint: Expected numel() * SizeOf(dtype()) <= memory_size(), but received numel() * SizeOf(dtype()):8 > memory_size():0.] (at ..\paddle\phi\core\dense_tensor_impl.cc:57)
      [operator < pd_op.while > error]
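The root cause is the `cudaErrorNoKernelImageForDevice` above: the wheel ships neither SASS compiled for this GPU's architecture (compute capability 12.0, as the earlier warning notes) nor PTX that the driver could JIT-compile forward for that kernel. A rough model of that check in plain Python (the architecture lists are illustrative assumptions, not the actual contents of the cu128 wheel): SASS is binary-compatible only within one major architecture, while PTX is forward-compatible to any newer device.

```python
def kernel_image_available(device_cc, sass_archs, ptx_archs):
    """Model CUDA's kernel-image selection for a device.

    device_cc: (major, minor), e.g. (12, 0) for an RTX 5090 D.
    sass_archs: (major, minor) tuples the kernel was compiled to SASS for;
    binary compatibility requires the same major and device minor >= cubin minor.
    ptx_archs: (major, minor) tuples embedded as PTX; the driver can
    JIT-compile PTX for any device with equal or higher compute capability.
    """
    maj, mnr = device_cc
    if any(s_maj == maj and mnr >= s_mnr for s_maj, s_mnr in sass_archs):
        return True
    if any((maj, mnr) >= p for p in ptx_archs):
        return True
    return False  # kernel launch fails with cudaErrorNoKernelImageForDevice
```

Under this model, `kernel_image_available((12, 0), [(8, 0), (8, 6), (9, 0)], [])` is `False`: a wheel built only for pre-Blackwell architectures, with no PTX embedded for the failing kernel, reproduces exactly this error even though other ops (those with PTX) run fine, which would explain why `run_check()` succeeded.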
