Program hangs and then crashes when using multiple threads on GPU. #2620

@Yamabukiss

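Friendly reminder: according to informal community statistics, asking questions with the template speeds up replies and problem resolution.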


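Environment

  • [FastDeploy version]: fastdeploy-win-x64-gpu-1.0.7
  • [Build command]: using the downloaded prebuilt library fastdeploy-win-x64-gpu-1.0.7
  • [System platform]: Windows x64 (Windows 11)
  • [Hardware]: Nvidia GPU 4070 Laptop, CUDA 11.6, CUDNN 8.2
  • [Language]: C++

Problem log and steps to reproduce

Based on the official example multi_thread.cc, I modified the code as follows: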

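#include <future>
#include <iostream>
#include <string>
#include <thread>
#include <vector>
#include "fastdeploy/vision.h"
// _WIN32 is predefined by the compiler on Windows; WIN32 depends on project settings.
#ifdef _WIN32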
const char sep = '\\';
#else
const char sep = '/';
#endif

// void Predict(fastdeploy::vision::detection::PPDetBase *model, int thread_id, const std::vector<std::string>& images) {
void Predict(fastdeploy::vision::detection::PPDetBase *model, int thread_id, const std::string& image_file) {
    // for (auto const &image_file : images) {
        auto im = cv::imread(image_file);

        fastdeploy::vision::DetectionResult res;
        if (!model->Predict(im, &res)) {
            std::cerr << "Failed to predict." << std::endl;
            return;
        }

        // print res
        std::cout << "Thread Id: " << thread_id << std::endl;
        // std::cout << res.Str() << std::endl;
    // }
}

void GetImageList(std::vector<std::vector<std::string>>* image_list, const std::string& image_file_path, int thread_num){
    std::vector<cv::String> images;
    cv::glob(image_file_path, images, false);
    // number of image files in images folder
    size_t count = images.size();
    size_t num = count / thread_num;
    for (int i = 0; i < thread_num; i++) {
        std::vector<std::string> temp_list;
        if (i == thread_num - 1) {
            for (size_t j = i*num; j < count; j++){
                temp_list.push_back(images[j]);
            }
        } else {
            for (size_t j = 0; j < num; j++){
                temp_list.push_back(images[i * num + j]);
            }
        }
        (*image_list)[i] = temp_list;
    }
}

void GpuInfer(const std::string& model_dir, const std::string& image_file_path, int thread_num) {
    auto model_file = model_dir + sep + "model.pdmodel";
    auto params_file = model_dir + sep + "model.pdiparams";
    auto config_file = model_dir + sep + "infer_cfg.yml";
    auto option = fastdeploy::RuntimeOption();
    option.UseGpu();
    option.UsePaddleBackend();
    auto model = fastdeploy::vision::detection::PPYOLOE(
        model_file, params_file, config_file, option);
    if (!model.Initialized()) {
        std::cerr << "Failed to initialize." << std::endl;
        return;
    }

    std::vector<decltype(model.Clone())> models;
    for (int i = 0; i < thread_num; ++i) {
        models.emplace_back(model.Clone());
    }

    std::vector<std::vector<std::string>> image_list(thread_num);
    GetImageList(&image_list, image_file_path, thread_num);

    std::thread t1(Predict, models[0].get(), 0, R"(E:\project_codes\paddle_test\release\test1.jpg)");
    std::thread t2(Predict, models[1].get(), 1, R"(E:\project_codes\paddle_test\release\test2.jpg)");
    // std::thread t3(Predict, models[2].get(), 2, R"(E:\project_codes\paddle_test\release\test3.jpg)");
    // std::thread t4(Predict, models[3].get(), 3, R"(E:\project_codes\paddle_test\release\test4.jpg)");

    t1.join();
    t2.join();
    // t3.join();
    // t4.join();

    // auto ret1 = std::async(std::launch::async, Predict, models[0].get(), 0, R"(E:\project_codes\paddle_test\release\test.jpg)");
    // auto ret2 = std::async(std::launch::async, Predict, models[1].get(), 1, R"(E:\project_codes\paddle_test\release\test.jpg)");

    // ret1.get();
    // ret2.get();

    // std::vector<std::thread> threads;
    // for (int i = 0; i < thread_num; ++i) {
    //     threads.emplace_back(Predict, models[i].get(), i, image_list[i]);
    // }

    // for (int i = 0; i < thread_num; ++i) {
    //     threads[i].join();
    // }
}

int main(int argc, char **argv) {
    GpuInfer(R"(E:\project_codes\paddle_test\release\ppyoloe_plus_crn_m_80e_coco)", R"(E:\project_codes\paddle_test\release\test1.jpg)" , 2);

    return 0;
}
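
A side note on `GetImageList` above: it gives each thread `count / thread_num` images and the last thread the remainder, so when the glob matches a single file and `thread_num` is 2, the first thread's list is empty. This does not matter for the reproduction, which passes image paths to the threads directly. If the per-thread lists are used again, a more even split could look like the sketch below; `SplitEvenly` is a hypothetical helper, not part of FastDeploy or the official example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: splits `items` into `parts` contiguous chunks whose
// sizes differ by at most one (the first `count % parts` chunks get one extra).
std::vector<std::vector<std::string>> SplitEvenly(
    const std::vector<std::string>& items, std::size_t parts) {
    std::vector<std::vector<std::string>> chunks(parts);
    if (parts == 0) return chunks;
    const std::size_t base = items.size() / parts;
    const std::size_t extra = items.size() % parts;
    std::size_t begin = 0;
    for (std::size_t i = 0; i < parts; ++i) {
        const std::size_t len = base + (i < extra ? 1 : 0);
        chunks[i].assign(items.begin() + begin, items.begin() + begin + len);
        begin += len;
    }
    return chunks;
}
```

With one image and two threads this still leaves one list empty, but with five images the split becomes 3 and 2 instead of 2 and 3, and no thread ends up with more than one image beyond the others.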

The console output is as follows:

17:11:56: Starting E:\project_codes\paddle_test\release\paddle_test.exe...
[INFO] fastdeploy/vision/common/processors/transform.cc(45)::fastdeploy::vision::FuseNormalizeCast Normalize and Cast are fused to Normalize in preprocessing pipeline.
[INFO] fastdeploy/vision/common/processors/transform.cc(93)::fastdeploy::vision::FuseNormalizeHWC2CHW Normalize and HWC2CHW are fused to NormalizeAndPermute in preprocessing pipeline.
[INFO] fastdeploy/vision/common/processors/transform.cc(159)::fastdeploy::vision::FuseNormalizeColorConvert BGR2RGB and NormalizeAndPermute are fused to NormalizeAndPermute with swap_rb=1
[INFO] fastdeploy/runtime/runtime.cc(273)::fastdeploy::Runtime::CreatePaddleBackend Runtime initialized with Backend::PDINFER in Device::GPU.
[INFO] fastdeploy/runtime/runtime.cc(384)::fastdeploy::Runtime::Clone Runtime Clone with Backend:: Backend::PDINFER in Device::GPU.
[INFO] fastdeploy/runtime/runtime.cc(384)::fastdeploy::Runtime::Clone Runtime Clone with Backend:: Backend::PDINFER in Device::GPU.
17:13:23: E:\project_codes\paddle_test\release\paddle_test.exe crashed.
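
The log ends without any error message, so it does not show whether the crash comes from an uncaught C++ exception in a worker thread (which makes `std::thread` call `std::terminate`) or from something else, such as an access violation. One way to check the first possibility, sketched here under assumptions, is to launch the workers with `std::async` and collect whatever `future::get()` rethrows. `RunAndReport` is a hypothetical helper; in the real program each task would wrap a call like `Predict(models[i].get(), i, path)`. It cannot catch hardware faults or a hang inside the backend.

```cpp
#include <functional>
#include <future>
#include <stdexcept>
#include <string>
#include <vector>

// Launches each task on its own thread and returns one message per task:
// "ok" on success, or the exception text if the task threw.
std::vector<std::string> RunAndReport(
    const std::vector<std::function<void()>>& tasks) {
    std::vector<std::future<void>> futures;
    for (const auto& task : tasks) {
        futures.push_back(std::async(std::launch::async, task));
    }
    std::vector<std::string> report;
    for (auto& f : futures) {
        try {
            f.get();  // rethrows any exception thrown inside the worker
            report.push_back("ok");
        } catch (const std::exception& e) {
            report.push_back(std::string("exception: ") + e.what());
        } catch (...) {
            report.push_back("unknown exception");
        }
    }
    return report;
}
```

If every entry comes back as "ok" and the process still dies, the failure is probably not a C++ exception escaping `Predict`.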

I changed the model in the official example to PPYOLOE and made the corresponding modifications. The program hangs and then crashes when two threads are started:

 std::thread t1(Predict, models[0].get(), 0, R"(E:\project_codes\paddle_test\release\test1.jpg)");
 std::thread t2(Predict, models[1].get(), 1, R"(E:\project_codes\paddle_test\release\test2.jpg)");
 // std::thread t3(Predict, models[2].get(), 2, R"(E:\project_codes\paddle_test\release\test3.jpg)");
 // std::thread t4(Predict, models[3].get(), 3, R"(E:\project_codes\paddle_test\release\test4.jpg)");

 t1.join();
 t2.join();

If only one thread is started instead, it runs normally:

 std::thread t1(Predict, models[0].get(), 0, R"(E:\project_codes\paddle_test\release\test1.jpg)");
 // std::thread t2(Predict, models[1].get(), 1, R"(E:\project_codes\paddle_test\release\test2.jpg)");
 // std::thread t3(Predict, models[2].get(), 2, R"(E:\project_codes\paddle_test\release\test3.jpg)");
 // std::thread t4(Predict, models[3].get(), 3, R"(E:\project_codes\paddle_test\release\test4.jpg)");

 t1.join();
 // t2.join();
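
Since one thread works and two do not, a possible next diagnostic step is to keep both threads but let only one of them run inference at a time. The sketch below shows the idea with a single process-wide `std::mutex`. `FakeModel`, `SerializedPredict`, and `RunSerialized` are hypothetical names used only for illustration; `FakeModel` stands in for a cloned model and merely counts how many threads are inside `Predict()` at once. In the real program the lock would surround `model->Predict(im, &res)`. This is a way to narrow the problem down, not a fix, and it gives up the parallel speedup.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a cloned detection model; it only records how
// many threads are inside Predict() at the same moment.
struct FakeModel {
    static std::atomic<int> in_flight;
    static std::atomic<int> max_in_flight;

    bool Predict() {
        const int now = ++in_flight;
        // Remember the highest concurrency level observed so far.
        int prev = max_in_flight.load();
        while (now > prev && !max_in_flight.compare_exchange_weak(prev, now)) {
        }
        std::this_thread::yield();
        --in_flight;
        return true;
    }
};
std::atomic<int> FakeModel::in_flight{0};
std::atomic<int> FakeModel::max_in_flight{0};

// One process-wide lock so that only one thread runs inference at a time.
std::mutex g_predict_mutex;

bool SerializedPredict(FakeModel* model) {
    assert(model != nullptr);
    std::lock_guard<std::mutex> lock(g_predict_mutex);
    return model->Predict();
}

// Runs `thread_num` workers, each calling SerializedPredict `iterations`
// times, and returns the peak number of threads seen inside Predict().
int RunSerialized(int thread_num, int iterations) {
    FakeModel::in_flight = 0;
    FakeModel::max_in_flight = 0;
    std::vector<FakeModel> models(thread_num);
    std::vector<std::thread> threads;
    for (int i = 0; i < thread_num; ++i) {
        threads.emplace_back([&models, i, iterations] {
            for (int k = 0; k < iterations; ++k) {
                SerializedPredict(&models[i]);
            }
        });
    }
    for (auto& t : threads) t.join();
    return FakeModel::max_in_flight.load();
}
```

If the two-thread program stops crashing once the calls are serialized this way, that would suggest the trigger is concurrent inference on the cloned models rather than the mere existence of a second thread or a second clone.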
